Ruderfer et al, Nat Genet 48(10):1107-1111, 2016
Overview
• Background Copy Number Variants (CNV)
• Definition
• Detection
• Paper presentation
• Purpose
• Methods
• Results
• Conclusions
• Limitations
• Examples in ExAC
Definition of Copy Number Variant (CNV)
• CNV = >50 bp
• 4.8-9.5% of the human genome
• CNVs encompass more nucleotide variation than SNPs
• Adaptive - embryonic lethal
• Pathogenic - Benign
• 10-20% de novo CNV
• Intellectual disability
• Multiple congenital anomalies
• Autism
• Non-syndromic deafness
• Congenital heart disease
Zarrei et al Nat Rev 16:172-183, 2015
CNV, Pathways and Genes
Zarrei et al Nat Rev 16:172-183, 2015
Detection of CNVs
No single method can capture all structural genomic variation
Comparative Genomic Hybridization (CGH)
• Lower resolution
• Better at duplications
• Large CNVs
• Low breakpoint accuracy
Detection of CNVs
SNP or Microarrays
• Cannot determine exact breakpoints
• Detect large CNVs
• Dependent on probe coverage
Siretean et al. Balkan J. Med. Genet. 16(2):67-72, 2013.
Detection of CNVs
Next-Generation Sequencing (NGS)
Discordant read pair:
• Distance between read pair
• Small CNVs
• Inversions/translocations
Split-read:
• High breakpoint accuracy
• Small CNVs
https://bioinfo.uth.edu/CNVannotator/links.cgi
Hehir-Kwa et al, Exper Rev Mol Diagn 15: 1023-1032, 2015
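As a rough illustration of the discordant read-pair idea on this slide, the sketch below flags pairs whose mapped distance departs from the library's typical insert size. The function name, the median +/- k*MAD threshold, and the toy distances are assumptions made for illustration; they are not taken from any of the cited tools.

```python
import statistics

def flag_discordant_pairs(distances, k=5.0):
    """Label each read pair by its mapped distance (hypothetical rule).

    A pair is called discordant when its distance lies more than k median
    absolute deviations (MAD) from the median distance of the library.
    """
    median = statistics.median(distances)
    # Fall back to 1.0 if every distance is identical (MAD of zero).
    mad = statistics.median(abs(d - median) for d in distances) or 1.0
    labels = []
    for d in distances:
        if d > median + k * mad:
            labels.append("deletion-like")   # mates map farther apart than expected
        elif d < median - k * mad:
            labels.append("insertion-like")  # mates map closer together than expected
        else:
            labels.append("concordant")
    return labels

# Toy mapped distances (bp) for eight read pairs.
distances = [300, 310, 295, 305, 2500, 298, 302, 40]
labels = flag_discordant_pairs(distances)
```

Here the pair at 2500 bp is labeled deletion-like and the pair at 40 bp insertion-like. Real callers also use read orientation and cluster several supporting pairs before making a call.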
Detection of CNVs
Next-Generation Sequencing (NGS)
Assembly-based methods:
• Local assembly (contigs)
• Breakpoint accuracy
Depth of Coverage:
• Larger events
• More tolerant of repetitive regions
• Lack breakpoint accuracy
• Do not require continuous sequence
https://bioinfo.uth.edu/CNVannotator/links.cgi
Hehir-Kwa et al, Exper Rev Mol Diagn 15: 1023-1032, 2015
Purpose of the Study
Exome sequences of ~60,000 individuals
1. Characterize the rates of rare CNVs
2. Characterize the properties of rare CNVs
3. Application to disease
Methods: Populations Studied
Methods
Calling CNVs:
Exome hidden Markov model (XHMM)
• Sequencing read depth
• Principal-component analysis to normalize data
• Calculate posterior probability of individual being diploid vs deleted or duplicated
Fromer et al. Am J Hum Genet 91:597–607, 2012
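The first two steps listed above (read depth, then principal-component normalization) can be sketched loosely as follows. This is not the XHMM implementation: the function name, the number of components removed, the per-sample z-scoring, and the simulated depth matrix are all assumptions, and the hidden Markov model that produces the posterior probabilities is not reproduced.

```python
import numpy as np

def normalize_and_zscore(depth, n_remove=1):
    """depth: samples x targets matrix of mean read depth.

    Removes the top n_remove principal components (assumed here to capture
    batch and capture noise), then z-scores each sample across its targets.
    """
    depth = np.asarray(depth, dtype=float)
    centered = depth - depth.mean(axis=0)            # center each target
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    s_kept = s.copy()
    s_kept[:n_remove] = 0.0                          # drop the strongest components
    residual = (u * s_kept) @ vt
    row_mean = residual.mean(axis=1, keepdims=True)
    row_std = residual.std(axis=1, keepdims=True)
    return (residual - row_mean) / row_std

# Simulated data: 30 samples x 100 targets with one strong batch effect.
rng = np.random.default_rng(0)
batch = np.linspace(-2.0, 2.0, 30)[:, None]          # per-sample batch factor
gc_like = np.linspace(-30.0, 30.0, 100)[None, :]     # per-target sensitivity
depth = 100.0 + batch * gc_like + rng.normal(0.0, 2.0, size=(30, 100))
depth[3, 40:43] *= 0.5                               # heterozygous deletion in sample 3

z = normalize_and_zscore(depth, n_remove=1)
```

After normalization, the three halved targets in sample 3 stand out as strongly negative z-scores, which is the kind of signal a downstream model could turn into a deletion call.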
Methods
Data used:
>20 quality reads
Filtered out targets:
• Mean sequencing depth <10X
• Low-complexity sequence over >25%
• GC content <10% or >90%
• Covering <10bp or spanning >10kb
Removed
• Samples with a CNV count more than 3 standard deviations above the mean (>24)
• CNVs observed in >600 individuals (0.5%)
Confidence measures for deletion, duplication and diploid status
for every individual at every gene.
https://bioinfo.uth.edu/CNVannotator/images/Figure1.png
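The target filters on this slide can be written as a single predicate. The thresholds come from the slide; the `Target` class, its field names, and the half-open coordinate convention are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    start: int                      # assumed 0-based, half-open coordinates
    end: int
    mean_depth: float               # mean sequencing depth across samples
    gc_fraction: float              # 0.0 to 1.0
    low_complexity_fraction: float  # share of the target that is low complexity

def passes_filters(t):
    """Return True if the target survives the filters listed on the slide."""
    length = t.end - t.start
    if t.mean_depth < 10:                            # mean sequencing depth <10X
        return False
    if t.low_complexity_fraction > 0.25:             # low complexity over >25%
        return False
    if t.gc_fraction < 0.10 or t.gc_fraction > 0.90: # GC content <10% or >90%
        return False
    if length < 10 or length > 10_000:               # <10 bp or >10 kb
        return False
    return True

targets = [
    Target("ok", 1000, 1200, 45.0, 0.50, 0.05),
    Target("low_depth", 2000, 2200, 8.0, 0.50, 0.05),
    Target("repetitive", 3000, 3200, 45.0, 0.50, 0.40),
    Target("gc_extreme", 4000, 4200, 45.0, 0.95, 0.05),
    Target("too_long", 5000, 17000, 45.0, 0.50, 0.05),
]
kept = [t.name for t in targets if passes_filters(t)]
```

Only the first toy target passes; each of the others fails exactly one criterion.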
Methods
Filtering of Genes:
19,430 autosomal genes
CNVs on X and Y
> 50% of gene was removed
<30X mean coverage
>200X mean coverage
CNV freq >0.5%
Multiallelic
15,734 autosomal genes
Methods
Quality Assessment:
• Parent-child trios
• 241 from Bulgaria
• Schizophrenic probands
• 0.058 de novo rate (0.051 with 622 arrays)
• 44% transmission rate for deletions
• 42% transmission rate for duplications
Renton and Traynor, Nat Neurosci 16:774-775, 2013
Comparison to SNP arrays
10,091 arrays
• 10 markers
• >100kb length
• <1% frequency
Comparison to SNP arrays
Concordantly called CNVs
• Array calls encompassed more exons 70% of the time
• 83% of exons were included in both technologies
• Exome seq detected 2.2X more CNVs per individual
Results: Characterizing CNV Calls
Number of Rare CNVs:
59,898 individuals
126,771 CNVs in autosomal protein coding genes
• Individual: 2.1 rare CNVs
• 0.82 deletion
• 1.29 duplications
• Number of CNVs
• 0: 21%
• 1: 29%
• >5: 6%
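A quick arithmetic check ties the numbers on this slide together. It uses only the figures quoted above; the variable names are arbitrary, and the 2-to-5 share is derived here on the assumption that the three listed categories and the unlisted one sum to 100%.

```python
n_individuals = 59_898
n_cnvs = 126_771

# Mean rare CNVs per individual from the totals.
mean_per_individual = n_cnvs / n_individuals

# Deletions and duplications per individual, as reported.
mean_del = 0.82
mean_dup = 1.29
mean_sum = mean_del + mean_dup
dup_share = mean_dup / mean_sum

# Share of individuals carrying 2 to 5 rare CNVs, assuming the listed
# categories (0, 1, >5) and this one cover everyone.
pct_two_to_five = 100 - 21 - 29 - 6

print(f"{mean_per_individual:.2f} CNVs per individual from totals")
print(f"{mean_sum:.2f} from deletions plus duplications")
print(f"{dup_share:.0%} of rare CNVs are duplications")
print(f"{pct_two_to_five}% of individuals carry 2 to 5 rare CNVs")
```

The totals and the per-type means agree at about 2.1 rare CNVs per individual, with duplications making up roughly three fifths of them.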
Results: Characterizing CNV Calls
Size of Rare CNVs:
• Total mean extent: 154 kb
• 107 kb dup
• 46 kb del
• Avg length 73 kb
• 83 kb dup
• 56 kb del
• 84% <100kb
• 56% <20kb
Results: Characterizing CNV Calls
Genes with CNVs
• 70%: at least one gene
• 37% at least one del gene
• (Avg:0.81 gene/person)
• 54% at least one dup gene
• (Avg: 1.75 gene/person)
• 16% >100kb (Avg: 79kb)
• 59 kb del
• 91 kb dup
• 13 exons
• 9.7 del
• 15 dup
Results: Characterizing CNV Calls
Genes:
• Each gene del in 3.1 individuals
• Each gene dup in 6.6 individuals
• 1,872 genes with no CNV
• 6,578 w/o del
• 3,038 w/o dup
• 55% CNV = single gene
• 65% del
• 48% dup
• 62% partial
Results: Characterizing CNV Calls
By Population/Gender
Figure legend (populations): African, Latino, East Asian, Finnish, Non-Finnish European, Other, South Asian
Results: Characterizing CNV Calls
By Population/Gender
Figure legend (populations): African, Latino, East Asian, Finnish, Non-Finnish European, Other, South Asian
Methods
Intolerance Scores:
• Dup/del separately
• Dup from UCSC to predict freq
• Linear regression model for observed CNV rate/gene
• Gene length
• Coding sequence length
• Number of targets
• Read depth
• GC content
• Sequence complexity
• Genomic location within pairs of segmental duplications
• Greater intolerance
• Lower than expected CNV rate
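One way to read this slide: regress the observed per-gene CNV rate on the listed covariates, then score each gene by how far it falls below the rate the model expects. The sketch below does that with ordinary least squares. The function name, the sign convention (a positive score means fewer CNVs than expected), the standardization, and the toy data with a single covariate are assumptions, not the paper's exact procedure.

```python
import numpy as np

def intolerance_scores(covariates, observed_rate):
    """Score genes by the standardized, sign-flipped regression residual.

    covariates: genes x features array (or 1-D for a single feature),
    for example gene length, number of targets, read depth, GC content.
    observed_rate: observed CNV rate per gene.
    """
    observed_rate = np.asarray(observed_rate, dtype=float)
    X = np.column_stack([np.ones(len(observed_rate)), covariates])  # add intercept
    coef, *_ = np.linalg.lstsq(X, observed_rate, rcond=None)
    expected = X @ coef
    residual = observed_rate - expected
    # Fewer CNVs than expected (negative residual) gives a positive score.
    return -(residual - residual.mean()) / residual.std()

# Toy data: CNV rate grows with gene length except for one depleted gene.
gene_length = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
observed = np.array([1.0, 2.0, 3.0, 0.0, 5.0, 6.0])
scores = intolerance_scores(gene_length, observed)
most_intolerant = int(np.argmax(scores))
```

In the toy data the fourth gene (index 3) has no CNVs despite its length, so it receives the highest score. Fitting deletions and duplications separately, as the slide indicates, would mean calling the function once per CNV type.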
Results: Genic Intolerance
• Scores for deletions correlated with duplications
• Correlated with increases in SNV scores (missense/LOF)
• Stronger for deletions
• Associated with evolutionary constraint (GERP)
Results: Characterization of CNV Tolerance
7754 genes
• The higher the expression, the higher the intolerance
• Brain harbors more intolerant genes
• Less loss of function/short indels
Figure labels: SNV, CNV Intolerance, Duplication Intolerance, Deletion Intolerance
TOP 10 Most Intolerant Genes
• Haploinsufficient
• Essential
• Neuronal and axon development
• Synapse organization/assembly
EP300
MTOR
PTEN
ATXN2
MYO5A
PLK1
PSME4
RELN
SIN3A
YWHAG
TOP 10 Most Tolerant Genes
• Recessive disorders
• No phenotype in null mice
• Aldo-keto reductases
• Metallothioneins
• Protocadherin
D2HGDH
ACAD10
AGMO
AKR1B1
AKR1B10
AKR1B15
AKR1C1
AKR1C2
ALDH1A1
ALDH1L2
Results: Application to Schizophrenia
ExAC:
• 4,793 schizophrenia cases
• Higher number of genes with CNVs
• 2.12 genes with CNVs
• Higher mean intolerance
• CNV intolerance scores more predictive than missense and LOF.
• 6,102 controls
• 1.78 genes with CNVs
• Greater normalized intolerance
Conclusions
• Examined frequency of rare CNVs
• 2/person
• Dup more frequent
• Characterization of rare CNVs
• 73-79 kb in length
• 13 exons
• CNV intolerance correlates with
• Tissue expression
• Genic constraint
• Risk of disease
• Use all data to get better estimates of pathogenicity
• Data on website
Limitations
• Using short-read sequence
• Inability to accurately call common or more complex variants
• Reference bias
• Only examined rare CNVs
• Frequent CNVs not in ExAC
• Rarity
• Some exons lack coverage
• Confounded by capture technology
• Confounded by cohort and population
• Lack of knowledge about CNV mutational mechanisms
• Did not conduct validation using alternative method(s)
• Participants have disease
• Deflated intolerance estimates
• Populations missing
ExAC: Tolerant CNV Gene
ExAC: Intolerant CNV Gene
ExAC: Gene of Interest