Integrative analysis of genomic variants in carcinogenesis

Artificial Intelligence and Computational Biology Group
Integrative Analysis of Genomic Variants in Carcinogenesis
Syed Haider, Arek Kasprzyk, Pietro Lio
[Figure 1 panel label: t(9;12)(p24;p13)]
Introduction
Cancer is a group of complex diseases that involve
changes to DNA, either in its sequence or through
modifications of cytosine bases or of the histone
proteins that package the genome. In the last
decade, many important genes responsible for the
genesis of various cancers have been discovered, and
the pathways through which they act have been
characterized. Generally, three classes of genes have
been linked to tumorigenesis: oncogenes,
tumor-suppressor genes and stability genes
(Vogelstein et al., 2004). Clinical studies have
revealed that cancer is a highly heterogeneous
disease. Many cancer types and subtypes exist,
depending on the classification criteria, and each has
very different characteristics and reacts differently
to treatment. Even patients diagnosed with the same
cancer type or subtype follow different clinical
courses and show different responses to therapy, and
therefore require individualized treatment to
maximize efficacy and minimize toxicity. A key
approach to conquering cancer is to identify
biomarkers that can be used for early diagnosis,
accurate classification of cancer type, and accurate
prediction of therapeutic outcome. Cancer biomarker
discovery has therefore been at the centre of cancer
research over the last decade.
Figure 1. Distribution of cancer-associated
genes on chromosomes 9 and 12. Each gene
carries a number of mutations as well
as significant copy number variation.
On chr12, the gene TSPAN31 is found highly
amplified in a number of glioblastoma
multiforme patients. Red stretches
mark bands 9p24 and 12p13, whose
translocation results in acute
lymphoblastic leukemia.
Goals
1- Establish correlation-based models enabling
integrative analyses of genomic variants in
carcinogenesis.
2- Analyze the relative impact of genomic variants
on glioblastoma multiforme as a representative
cancer type.
3- Perform functional analysis and computational
modeling of the significant patterns identified in
goal 2. This would involve estimating parameters
from molecular information in order to model
pathways using differential equations.
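As an illustration of goal 3 only, the sketch below shows one way parameter estimation could feed a differential-equation model. It is not the group's model: the one-gene equation dx/dt = k_syn * c - k_deg * x, the names `simulate_expression` and `estimate_ratio`, and all numeric values are assumptions made for this example.

```python
# Hypothetical one-gene model: expression x driven by gene copy number c.
#   dx/dt = k_syn * c - k_deg * x
# At steady state x* = (k_syn / k_deg) * c, a line through the origin.

def simulate_expression(copy_number, k_syn, k_deg, x0=0.0, dt=0.01, t_end=50.0):
    """Integrate the model with the forward Euler method and return x(t_end)."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (k_syn * copy_number - k_deg * x)
    return x


def estimate_ratio(copy_numbers, steady_levels):
    """Least-squares estimate of k_syn / k_deg from steady-state data.

    Fits x* = r * c through the origin, so r = sum(c * x) / sum(c * c).
    """
    numerator = sum(c * x for c, x in zip(copy_numbers, steady_levels))
    denominator = sum(c * c for c in copy_numbers)
    return numerator / denominator


# Synthetic data generated from the model itself (not patient data).
copies = [0, 1, 2, 3, 4]  # absolute copy numbers, chosen for illustration
levels = [simulate_expression(c, k_syn=2.0, k_deg=0.5) for c in copies]
ratio = estimate_ratio(copies, levels)
print(f"estimated k_syn / k_deg = {ratio:.4f}")
```

Only the ratio k_syn / k_deg is identifiable from steady-state levels; separating the two rates would need time-course measurements.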
Methodology
Variation data (SNPs, CNVs, structural aberrations,
epigenetic changes, gene expression) produced by
case-control studies are becoming increasingly
available. We are interested in designing a
multivariate framework to identify the relative
impact of groups of variations (figure 1), which could
yield signatures with better diagnostic and prognostic
value. As a preliminary step, we take a gene-centric
approach to identifying multidimensional gene
signatures. An example case study of the gene MTAP in
glioblastoma multiforme is shown in figure 2.
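A gene-centric comparison of copy number status against expression can be sketched with a rank correlation. The choice of Spearman correlation, the function names, and the toy values below are assumptions for illustration; they are not the TCGA measurements and not necessarily the statistic used in this work.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented toy values: copy number status coded -2..2 (as in figure 2)
# and a made-up log2 fold change in expression for eight samples.
cnv_status = [-2, -2, -1, -1, 0, 0, 1, 2]
log2_fold_change = [-3.1, -2.7, -1.2, -0.9, 0.1, -0.2, 0.6, 1.4]
rho = spearman(cnv_status, log2_fold_change)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is used here because copy number status is an ordinal code with many ties, which average ranks handle directly.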
Figure 2. Copy number variation of the gene
MTAP (homozygous deletion -2,
hemizygous deletion -1, neutral 0, gain
1, amplification 2) vs. fold change in
expression. The dataset consists of 91
GBM samples obtained from TCGA.
[Figure 2 axis labels: Expression, Samples]
References
1. Vogelstein B, Kinzler KW. Cancer genes and the pathways they control. Nature Medicine 10, 789-799, 2004.
2. The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061-1068, 2008.
3. Lacronique V, Boureux A, Valle VD, Poirel H, Quang CT, Mauchauffé M, Berthou C, Lessard M, Berger R, Ghysdael J, Bernard OA. A TEL-JAK2 fusion protein with constitutive kinase activity in human leukemia. Science 278, 1309-1312, 1997.
Syed Haider YAM
[email protected]
Computer Laboratory
University of Cambridge
United Kingdom
Tel: 00 44 1223 763 698
www.cl.cam.ac.uk/~pl219/CSB/