Transcript ppt for
Identification and characterization of
copy number variation in Indian population
and its association with disease
Pankaj Kumar
CAS-MPG Presentation
07 May 2012
Introduction
CNVs are
- variations in the number of copies of genomic regions
- insertions, deletions or duplications
- from >1 kb to several Mb in size
CNV vs. SNPs
Measure | CNV | SNP
Total number | 38,406 | 14,708,752
% of reference genome | 29.74% | <1%
Introduction contd..
[Figure: schematic of the origin and types of CNVs, showing genomic segments A to F with a duplication and a deletion]
Frequency
Occurrence
Polymorphism
Phenotypic Variability
Mutation
Disease Susceptibility
Introduction contd..
Consequence of CNVs
Unmask recessive alleles
Alter regulation
Disrupt genes
Cumulative effects
Scherer et al. Nature Reviews Genetics 2006
Objectives:
1. To identify CNVs in diverse Indian populations
2. To map CNV regions with disease susceptibility
3. To study consequence of CNV in disease
4. To explore the role of CNV in Spinocerebellar Ataxia
CNV & Diseases
Proof-of-concept study
APOBEC3B: insertion/deletion polymorphism
Cytidine deaminase family of proteins
29 kb insertion/deletion polymorphism
Kidd et al. PLoS Genetics, 2007
Spectrum of APOBEC3B deletion frequency in Indian populations studied
APOBEC3B insertion/deletion polymorphism & malaria endemicity
[Map legend: white = insertion, dark = deletion]
Significant association of APOBEC3B with falciparum malaria
Malaria cohorts: endemic, non-endemic
Comparison (Fisher's test) | Genotypes | Odds ratio (95% CI) | P value
Non-severe vs. control | AB & AA | 7.11 (3.20 to 15.97) | 1 × 10^-7
Severe vs. control | AB & AA | 8.13 (2.62 to 26.59) | 1.7 × 10^-5
Severe vs. non-severe | AB & AA | 1.14 (0.37 to 3.81) | 0.8
Severe vs. control | AB & AA | 0.39 (0.16 to 0.93) | 0.0211
Severe vs. control | BB & AB | 6.44 (1.76 to 24.99) | 0.0012
Severe vs. control | BB & (AA+AB) | 3.17 (1.10 to 10.32) | 0.0177
A - insertion allele
B - deletion allele
Insertion allele of APOBEC3B seems to be protective for malaria
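The odds ratios and Fisher's-test P values in this table are computed from 2x2 genotype tables. A minimal Python sketch of that kind of calculation follows. The counts are taken from the endemic rows of the Hardy-Weinberg table in the extra slides, grouped as deletion carriers (AB + BB) versus AA; that grouping is an assumption chosen for illustration and does not reproduce any specific row above. The function names are hypothetical, and the Woolf interval is one common choice, not necessarily the one used for the slide.

```python
from math import comb, exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Sample odds ratio with a Woolf (log-normal) 95% CI; assumes no zero cells."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table with the same margins
    probs = [comb(row1, x) * comb(row2, col1 - x) / total for x in range(lo, hi + 1)]
    p_obs = probs[a - lo]
    # Sum the tables that are no more likely than the observed one
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))

# Assumed illustrative grouping from the HWE slide (endemic cohort):
# cases: 41 AB + 3 BB vs 29 AA; controls: 18 AB + 0 BB vs 64 AA
case_carrier, case_aa = 44, 29
ctrl_carrier, ctrl_aa = 18, 64

or_, ci_low, ci_high = odds_ratio_ci(case_carrier, case_aa, ctrl_carrier, ctrl_aa)
p_value = fisher_two_sided(case_carrier, case_aa, ctrl_carrier, ctrl_aa)
print(f"OR = {or_:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), P = {p_value:.2e}")
```

An odds ratio above 1 in this grouping means deletion carriers are over-represented among cases, which is the direction one would expect if the insertion allele is protective.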
Positive Selection for APOBEC3B locus in Malaria
[Figure: markers spanning 500 kb upstream (5') and 500 kb downstream (3') of APOBEC3B, used for EHH and haplotype analysis to test for positive selection]
Haplotype based analysis for larger linkage disequilibrium
Endemic case
Non-endemic case
Endemic control
Non-endemic control
Selection for the APOBEC3B region was not observed in malaria
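EHH (extended haplotype homozygosity) is the probability that two randomly chosen chromosomes carrying the same core allele are identical at every marker out to a given distance. A minimal sketch of that definition follows, assuming phased haplotypes encoded as strings of 0/1 alleles. The toy haplotypes and the function name are hypothetical and are not data from this study.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core_index, core_allele, upto_index):
    """EHH of the core allele from the core marker out to upto_index (inclusive)."""
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        return 0.0
    lo, hi = sorted((core_index, upto_index))
    # Group carriers by their extended haplotype over the interval
    groups = Counter(h[lo:hi + 1] for h in carriers)
    identical_pairs = sum(comb(n, 2) for n in groups.values())
    return identical_pairs / comb(len(carriers), 2)

# Hypothetical phased haplotypes; marker 0 is the core site
toy = ["1000", "1000", "1001", "1010", "0110", "0101"]
decay = [ehh(toy, 0, "1", i) for i in range(4)]
print(decay)
```

A core allele under recent positive selection is expected to keep EHH high over a longer distance than other alleles of similar frequency; a rapid decay, as in this toy example, gives no such signal.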
Schematic representation of APOBEC gene cluster and segmental duplication region
Segmental duplication regions
Selection for APOBEC3B was not detected, which is attributed to the large number of segmental duplication regions at this locus
Conclusions
• The insertion allele of APOBEC3B appears to be protective against malaria
• The APOBEC3B locus showed no signature of positive selection by conventional methods, possibly because of frequent recombination events
• Since this gene is expressed in liver and spleen, it might provide a new mechanism of host protective response
Identification of CNVs in the Indian population
A baseline database
Identification of large CNVs (>100 kb) in the Indian population: Methodology
Workflow:
1. Sampling of IGV populations (477 samples, 26 populations)
2. Affy 50k array (~58,000 SNPs with average inter-marker distance of 50 kb)
3. Raw intensity files
4. CNV calling and QC (Genotyping Console + SVS7)
5. Retrieve segments >100 kb in length with a minimum of 10 probes using GConsole
6. Validation using Sequenom MassARRAY QGE assay
Population labels on map: IE-N-LP5, TB-N-SP1, TB-N-IP1, IE-N-LP9, IE-N-LP1, IE-N-LP18, IE-N-IP2, IE-W-IP2, IE-N-SP4, IE-W-LP3, IE-N-LP10, IE-NE-IP1, TB-NE-LP1, IE-E-IP1, AA-C-IP5, IE-W-LP4, OG-W-IP, IE-NE-LP1, AA-NE-IP1, DR-C-IP2, IE-W-LP1, IE-E-LP2, IE-E-LP4, AA-E-IP3, IE-W-LP2, DR-S-LP, DR-S-LP, DR-S-LP3
Map legend: Cluster 1, Cluster 2, Cluster 3, Cluster 4, Cluster 5
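The segment filter in this workflow (length >100 kb and at least 10 probes) can be sketched as below. The segment records and field names are hypothetical; the actual export format of Genotyping Console is not shown on the slide.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 100_000  # ">100 kb" threshold from the slide
MIN_PROBES = 10          # "minimum 10 probes" from the slide

@dataclass
class Segment:
    sample: str
    chrom: str
    start: int
    end: int
    n_probes: int
    state: str  # "deletion" or "duplication"

    @property
    def length(self) -> int:
        return self.end - self.start

def passes_filter(seg: Segment) -> bool:
    """Keep segments longer than 100 kb that are supported by >= 10 probes."""
    return seg.length > MIN_LENGTH_BP and seg.n_probes >= MIN_PROBES

# Hypothetical segments
segments = [
    Segment("S1", "chr1", 1_000_000, 1_250_000, 14, "deletion"),     # kept
    Segment("S1", "chr2", 5_000_000, 5_060_000, 12, "duplication"),  # too short
    Segment("S2", "chr5", 2_000_000, 2_400_000, 6, "deletion"),      # too few probes
]
kept = [s for s in segments if passes_filter(s)]
print([(s.sample, s.chrom, s.length) for s in kept])
```

With an average inter-marker distance of 50 kb, the probe-count requirement is the stricter of the two conditions for most segments, which is why only large CNVs survive this filter.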
Results
Instances of genomic segment prone to CNVs
Raw CNV deletion = 70174 (<1Mb segment size) and 212 (>1Mb segment size)
Raw CNV duplication = 73580 (<1Mb segment size) and 60 (>1Mb segment size)
Total CNVRs deletions = 1425
Total CNVRs duplications = 1337
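CNV regions (CNVRs) are commonly built by merging overlapping CNV calls across samples. The slide does not state the merging rule that was used, so the sketch below assumes the simplest one, in which any overlap on the same chromosome joins calls into a single region. The input intervals are hypothetical.

```python
from collections import defaultdict

def merge_into_cnvrs(calls):
    """Merge (chrom, start, end) calls that overlap on the same chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in calls:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end:   # overlaps the open region: extend it
                current_end = max(current_end, end)
            else:                        # gap: close the region, open a new one
                regions.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        regions.append((chrom, current_start, current_end))
    return regions

# Hypothetical raw calls pooled over samples
calls = [("chr1", 100, 300), ("chr1", 250, 500), ("chr1", 900, 1200), ("chr2", 50, 150)]
cnvrs = merge_into_cnvrs(calls)
print(cnvrs)
```

Merging of this kind is how tens of thousands of raw calls can collapse into a much smaller number of regions, as in the counts above.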
result contd..
Extent of CNVs in IGV populations
Chromosomal landscape of common CNV regions in all the populations pooled together
result contd..
Concordance of dataset using two independent algorithms
[Venn diagrams of deletion and duplication regions]
GTC 3.0.2: 1006 (11%), 5750 (65%), 2048 (23%)
SVS 7: 1515 (25%), 2986 (50%), 1461 (25%)
~60% of copy number variable regions showed both deletion and duplication
Comparison of the two software packages showed 50% concordance in the regions prone to CNVs
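One simple way to express concordance between two callers is the fraction of one caller's regions that overlap any region from the other. The slide does not say how its 50% figure was computed, so the overlap rule and the intervals below are assumptions for illustration only.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def concordance(calls_a, calls_b):
    """Fraction of calls_a that overlap at least one interval in calls_b."""
    if not calls_a:
        return 0.0
    hits = sum(any(overlaps(a, b) for b in calls_b) for a in calls_a)
    return hits / len(calls_a)

# Hypothetical regions from the two callers
gtc = [("chr1", 100, 500), ("chr1", 900, 1200), ("chr2", 50, 150), ("chr3", 10, 90)]
svs = [("chr1", 400, 700), ("chr2", 140, 300)]

gtc_in_svs = concordance(gtc, svs)
svs_in_gtc = concordance(svs, gtc)
print(gtc_in_svs, svs_in_gtc)
```

The measure is asymmetric, so a reported concordance should state which call set is the denominator; stricter rules such as a minimum reciprocal overlap would give lower values.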
CNV Validation and Heterogeneity
result contd..
Validation using Sequenom MassARRAY QGE
Low validation rate due to heterogeneity in CNV boundaries
Selection of the probe for validation is also a key factor
Deletion
Amplification
CNVs and Population Structure
result contd..
TB populations and isolated Himalayan populations
AA and DR isolated populations
IE large populations
Populations clustered according to genetic and linguistic affinity
CNVs present in IGV map to genes that are associated with diseases
SN | Gene symbol | Disorder name | Class
1 | KDR | Hemangioma, capillary infantile, somatic | Cancer
2 | IRF4 | Multiple myeloma | Cancer
3 | BRAF | Adenocarcinoma of lung, somatic | Cancer
4 | KCNE2 | Atrial fibrillation, familial; Long QT syndrome-6 | Cardiovascular
5 | AGT, AGTR1 | Hypertension, essential; Renal tubular dysgenesis | Cardiovascular
6 | ADRB1 | Congestive heart failure, susceptibility to; Resting heart rate | Cardiovascular
7 | KRT6A | Pachyonychia congenita, Jadassohn-Lewandowsky type | Dermatological
8 | GTF2H5 | Trichothiodystrophy, complementation group A | Dermatological
9 | PRSS2 | Pancreatitis, chronic | Gastrointestinal
10 | IL23R | Crohn disease | Gastrointestinal
11 | ABCG5 | Sitosterolemia | Metabolic
12 | HGD | Alkaptonuria | Metabolic
13 | PPM2C | Pyruvate dehydrogenase phosphatase deficiency | Metabolic
14 | A2M, APP | Alzheimer disease, susceptibility to; Emphysema due to alpha-2-macroglobulin deficiency | Neurological
15 | ATXN8OS | Spinocerebellar ataxia 8 | Neurological
16 | ATXN1 | Spinocerebellar ataxia-1 | Neurological
17 | PRKCH | Cerebral infarction | Neurological
18 | BFSP1 | Cataract, cortical, juvenile-onset | Ophthalmological
19 | HTRA1 | Macular degeneration, age-related, 7; Macular degeneration, age-related, neovascular type | Ophthalmological
20 | HMCN1 | Macular degeneration, age-related, 1; Posterior column ataxia with retinitis pigmentosa | Ophthalmological
21 | PTGDR, IL12B, HNMT, PTGER2 | Asthma | Respiratory
Conclusions
• CNVs accounted for 0.05% to 1.46% of the genome per individual
• A set of genes encompassed by CNVRs is novel and not reported in DGV (Database of Genomic Variants).
• Validation of individual CNVs showed substantial heterogeneity in CNV boundaries within a gene.
• CNVs can be shared between genetically related populations
• Baseline data for genomic regions prone to CNVs in the Indian population
• CNV regions overlap genes associated with many diseases and may predispose Indian populations to them.
Role of CNVs as a genetic modifier in SCA12 phenotype
Investigating the involvement of CNV in sub-phenotypes of SCA12
SCA12
Neuro-degenerative disorder
CAG repeat expansion in 5’ UTR region of PPP2R2B gene
Two distinct sub-phenotypes have been observed
Tremor dominant
Gait dominant
Could CNVs be involved?
Workflow of CNV Identification
IE large populations
SCA12 (CAG repeat in PPP2R2B)
10 index cases of gait
14 index cases of tremor
Affymetrix 6.0 SNP array
Data QC
CNV calling (PennCNV)
Gene annotation
Validation (real-time method)
Functional annotation clustering
Copy number state distribution in SCA12 and IE population
CN state | Count in SCA12 | Count in IE
0 | 987 | 389
1 | 2697 | 1226
3 | 257 | 465
4 | 158 | 257
Case control association analysis between gait and tremor groups
Chr
CNV end Sizes
in
Kb
Genes
10582072 10582389
3.17
8
8
Non
genic
1
4
2
0
0.017
2
Inf
chr1 10560946 10564162
32.1
4
8
1
Non
genic
6
1
1
1
0.004
4
25.144
2
GOLPH
3
0
5
0
0
0.004
8
Inf
chr1
CNV
start
chr5 32142841 32208250
51
Gait Del | Gait Dup | HT Del | HT Dup | p value | Odds ratio (OR)
Amplification of chr5p13.3 region in Gait Ataxia
GOLPH3 amplification
5/8 of gait samples
0/14 of HT samples
Real Time validation
GOLPH3 (golgi phosphoprotein 3 (coat-protein))
A Golgi-localized protein
Has a regulatory role in Golgi trafficking
Identified as a potent oncogene
Modulates mTOR signaling
Inhibition of mTOR induces autophagy and reduces toxicity of polyglutamine
expansions in fly and mouse models of Huntington disease
Brinda Ravikumar et al. Nature Genetics (2004)
Autophagy induction reduces mutant ataxin-3 levels and toxicity in a mouse model of
spinocerebellar ataxia type 3
Fiona M. Menzies et al. Brain (2009)
Functional annotation clustering of genes under CNV specific to SCA12
Term | Count | % | P value | Bonferroni | Benjamini | Fold enrichment
GO:0005216~ion channel activity | 18 | 6.593 | 3.74E-05 | 0.0172 | 0.0172 | 3.2549
GO:0022838~substrate specific channel activity | 18 | 6.593 | 5.48E-05 | 0.0252 | 0.0084 | 3.1568
GO:0015267~channel activity | 18 | 6.593 | 8.39E-05 | 0.0383 | 0.0097 | 3.0495
GO:0022803~passive transmembrane transporter activity | 18 | 6.593 | 8.64E-05 | 0.0394 | 0.0080 | 3.0421
Significant enrichment of ion channel activity processes in SCA12
A multigene enrichment analysis for dissection of biological system
Biological process
Molecular functions
Cellular components
CNVs in ion channel genes, and the involvement of these genes in different biological, molecular and cellular functions, suggest physiological impairment in SCA12
Future direction
Conclusions
• Although SCA12 is a monogenic disorder, its phenotypic variability could be due to other genetic factors.
• GOLPH3, which is amplified in gait cases, could be a modifier gene that leads to the gait ataxia feature.
• GOLPH3 influences the autophagy pathway through the mTOR pathway, which ultimately affects autophagic clearance of inclusion bodies.
• GOLPH3 could be a good intervention target for SCA12 pathogenesis.
• The implication of ion channel genes in different neurological diseases suggests physicochemical abnormalities in SCA12
Conclusion of my PhD work
“Any two individual genomes taken from nature, in any species, will have dozens
to hundreds of differences in their total number of functional genes.”
[Daniel R. Schrider and Matthew W. Hahn, Proc. R. Soc. B; 2010]
In conclusion, our genome is far from static, and CNVs could play an
important role in the dynamics of the genome, facilitating evolution,
adaptation and selection in populations and contributing to disease through
dosage effects of functional genes/regions.
Publications
Jha P, Sinha S, Kanchan K, Qidwai T, Narang A, Singh PK, Pati SS, Mohanty S,
Mishra SK, Sharma SK, Awasthi S, Venkatesh V, Jain S, Basu A, Xu S; Indian
Genome Variation Consortium, Mukerji M, Habib S. Deletion of the APOBEC3B
gene strongly impacts susceptibility to falciparum malaria. Infect Genet Evol. 2012
Jan;12(1):142-8.
Datta S, Chowdhury A, Ghosh M, Das K, Jha P, Colah R, Mukerji M, Majumder PP.
A Genome-Wide Search for Non-UGT1A1 Markers Associated with Unconjugated
Bilirubin Level Reveals Significant Association with a Polymorphic Marker Near a
Gene of the Nucleoporin Family. Ann Hum Genet. 2012 Jan;76(1):33-41.
Abhimanyu, Indian Genome variation consortium, Jha P and Mridula Bose.
Footprints of genetic susceptibility to pulmonary tuberculosis: Cytokine gene
variants in north Indians. Indian J Med Res., 2011 (accepted)
Lall M, Thakur S, Puri R, Verma I, Mukerji M, Jha P. A 54 Mb 11qter duplication
and 0.9 Mb 1q44 deletion in a child with laryngomalacia and agenesis of corpus
callosum. Mol Cytogenet. 2011 Sep 21;4:19.
Gautam P*, Jha P*, Kumar D, Tyagi S, Varma B, Dash D, Mukhopadhyay A;
Indian Genome Variation Consortium, Mukerji M. Spectrum of large copy number
variations in 26 diverse Indian populations: potential involvement in phenotypic diversity.
Hum Genet. 2011 Jul 9. * Equal contributing authors.
Ankita Narang*, Jha P*, Vimal Rawat, Arijit Mukhopadhyay, Debasis Dash, Analabha Basu,
Mitali Mukerji. Recent admixture in an Indian population of African ancestry.
Am. J. Hum. Genet. 2011 Jul 5. * Equal contributing authors.
Jha P, Suri V, Sharma V, Singh G, Sharma MC, Pathak P, Chosdol K, Jha P, Suri A,
Mahapatra AK, Kale SS, Sarkar C. IDH1 mutations in gliomas: First series from a tertiary care
centre in India with comprehensive review of literature. Exp Mol Pathol. 2011
May 3;91(1):385-393.
Abhimanyu, Jha P, Jain A, Arora K, Bose M. Genetic association study suggests a role for
SP110 variants in lymph node tuberculosis but not pulmonary tuberculosis in north Indians.
Hum Immunol. 2011 Apr 20.
Abhimanyu, Mangangcha IR, Jha P, Arora K, Mukerji M, Banavaliker JN, Consortium IG,
Brahmachari V, Bose M. Differential serum cytokine levels are associated with cytokine gene
polymorphisms in north Indian populations with active pulmonary tuberculosis.
Infect Genet Evol. 2011 Apr 1.
Jha P, Suri V, Jain A, Sharma MC, Pathak P, Jha P, Srivastava A, Suri A, Gupta D,
Chosdol K, Chattopadhyay P, Sarkar C. O6-methylguanine DNA methyltransferase
gene promoter methylation status in gliomas and its correlation with other molecular
alterations: first Indian report with review of challenges for use in customized
treatment. Neurosurgery. 2010 Dec; 67(6):1681-91.
Jha P, Jha P, Pathak P, Chosdol K, Suri V, Sharma MC, Kumar G, Singh M,
Mahapatra AK, Sarkar C. TP53 polymorphisms in gliomas from Indian patients:
Study of codon 72 genotype, rs1642785, rs1800370, and 16 base pair insertion in
intron-3. Exp Mol Pathol. 2011 Apr;90(2):167-72. (2010) Nov 27.
Aggarwal S, Negi S, Jha P, Singh PK, Stobdan T, Pasha MA, Ghosh S, Agrawal A;
Indian Genome Variation Consortium, Prasher B, Mukerji M. EGLN1 involvement in
high-altitude adaptation revealed through genetic analysis of extreme constitution
types defined in Ayurveda. Proc Natl Acad Sci U S A. (2010) Nov 2;107(44):18961-6.
HUGO Pan-Asian SNP Consortium, Mapping human genetic diversity in Asia.
Science. (2009) Dec 11;326(5959):1541-5
Indian Genome Variation Consortium. Genetic landscape of the people of India: a
canvas for disease gene exploration. J Genet. (2008) Apr;87(1):3-20.
Acknowledgements
CSIR
TCGA for Genotyping Facility
Indian Genome Variation Consortium
Thank you
Extra slides
Copy Number Variation in Indian Population
547 healthy individuals from 26 reference populations of the Indian Genome Variation Consortium
Affymetrix 50k Xba 240 array (raw intensity file)
Genotype QC
Reference samples (30)
Test samples (447)
≥ 10 probes
≥ 100 kb segment
CNV calling and QC (Genotyping Console + SVS7)
Common CNV (> 5% of samples)
Validation using Sequenom MassARRAY QGE assay (a subset of 12 genes)
Rare CNV (< 5% of samples)
Functional enrichment analysis
Mapping with disease-associated regions
Test for HWE
Group | Ins Homo | Heterozygote | Del Homo | HWE test p value | Note
Endemic case | 29 | 41 | 3 | 0.018 | Too many heterozygotes
Endemic control | 64 | 18 | 0 | 0.586 |
Non-endemic case | 56 | 11 | 17 | 7.95 × 10^-9 | Loss of too many heterozygotes
Non-endemic control | 51 | 25 | 5 | 0.508 |
After data quality control for genotyping error and population stratification, Hardy-Weinberg disequilibrium (HWD) generally indicates some kind of natural selection
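The direction of the heterozygote deviations can be checked from the genotype counts on this slide. The sketch below uses a one-degree-of-freedom chi-square test, which is an assumed choice: the slide does not say which test produced its P values, so the numbers printed here need not match them. The function name is hypothetical.

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square HWE test (1 d.f.); returns expected het count, chi2, P value.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A (insertion)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))   # chi-square survival function for 1 d.f.
    return expected[1], chi2, p_value

# Genotype counts from the slide: (Ins Homo, Heterozygote, Del Homo)
groups = {
    "Endemic case": (29, 41, 3),
    "Endemic control": (64, 18, 0),
    "Non-endemic case": (56, 11, 17),
    "Non-endemic control": (51, 25, 5),
}
results = {name: hwe_chi_square(*counts) for name, counts in groups.items()}
for name, (exp_het, chi2, p_value) in results.items():
    direction = "excess" if groups[name][1] > exp_het else "deficit"
    print(f"{name}: expected het = {exp_het:.1f} ({direction}), chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

Under this test the endemic cases show a heterozygote excess and the non-endemic cases a heterozygote deficit, while neither control group departs significantly from equilibrium, which agrees qualitatively with the annotations in the table.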
Future direction
SCA12 modifier genes: GOLPH3, mTOR pathway, autophagy
Proposed cascade: GOLPH3 amplification, induction of mTOR pathway, autophagy inhibition, aggregate formation, neurodegeneration