Introduction
Presented by:
Andrew McMurry
Boston University Bioinformatics
Children’s Hospital Informatics Program
Harvard Medical School Center for BioMedical Informatics
This Presentation Available at:
http://pixelshelf.com/~justandy/f-snp.ppt
Outline
Incidental Findings and Disconnected Patient Cohorts
Disease Association Studies Using SNPs
How SNPs cause disease
Computationally predict effect of SNPs within introns, exons,
and regulatory regions
The Future Is Now:
SNPs, Personalized Medicine, and Translational Research
Incidental Findings and Disconnected Patient Cohorts
IF the central dogma of Biology is:
“From DNA -> RNA -> Protein”
THEN where is the patient data for association studies?
Very little patient data spanning DNA/RNA/
protein/phenotype across a single cohort
Need to obtain “robust” sample sizes to avoid incidental
findings due to multiple testing [1]
[1] Isaac Kohane, Daniel Masys, and Russ Altman.
"The Incidentalome: A Threat to Genomic Medicine"
JAMA 296(2): 212-215. July 12, 2006.
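The multiple-testing concern can be made concrete with a little arithmetic. The following is a minimal sketch, assuming independent tests; the function names, the number of tests, and the significance level are illustrative assumptions, not figures taken from the cited paper.

```python
def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected number of false positives if every null hypothesis is true."""
    return num_tests * alpha


def family_wise_error_rate(num_tests: int, alpha: float) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float) -> float:
    """Per-test significance threshold that keeps the family-wise rate near alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    # Hypothetical scenario: 500,000 SNPs each tested once at alpha = 0.05.
    num_tests, alpha = 500_000, 0.05
    print("expected false positives:", expected_false_positives(num_tests, alpha))
    print("Bonferroni per-test threshold:", bonferroni_threshold(num_tests, alpha))
    print("FWER for 20 tests:", round(family_wise_error_rate(20, alpha), 4))
```

With many tests and few patients, chance associations are expected in bulk, which is why larger cohorts and corrected thresholds matter.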
Disease Association Studies Using SNPs
DNA sequencing technologies still very expensive
Stunningly few patients
Minimal sequence coverage
Could change in time with Solexa/454
Even with Solexa/454, piecing together the results is a massive
task (the maximum read length is often shorter than a single
repeated gene)
Rate limiting step: Adoption rate of DNA sequencing
Use what is available in abundance! SNP chips
Abundance of SNP chips in public repos on many diseases
Whole genome coverage 500k SNPs for $250
Disease Association Studies Using SNPs
DNA to RNA to Protein
Associating DNA & RNA
GEO alone well over 100k Gene Expression Arrays
What if we could correlate SNPs' effect on Gene Expression?
Associating DNA & Gene Product (protein)
Countless public protein databases
What if we could correlate SNPs' effect on Protein Coding?
Association studies involving multiple genomic measurements
What are the existing studies and models (HMMs/Bayes nets)
that could be strengthened with evidence from SNP chips?
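One simple way to picture "correlating a SNP with gene expression" is to encode each patient's genotype as a count of minor alleles (0, 1, or 2) and correlate it with that patient's expression level for a gene. The sketch below is only an illustration of that idea: the genotype encoding, the function name `pearson_r`, and the toy numbers are all assumptions made for this example, not data or methods from the presentation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    minor_allele_counts = [0, 0, 1, 1, 2, 2]            # one value per patient
    expression_levels = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]  # same patients, same order
    print("r =", round(pearson_r(minor_allele_counts, expression_levels), 3))
```

A real analysis would need SNP and expression measurements from the same cohort, plus the multiple-testing correction discussed earlier.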
How SNPs cause disease
Intron
• Likely no effect
Protein Coding
• Missense
  • Synonymous: Same Amino Acid
  • Non Synonymous: Different Amino Acid
• Nonsense
  • Premature STOP
Splicing Regulation
• Incorrect final mRNA transcript
Transcriptional Regulation
• Differential gene expression
Post Translational
• Protein phosphorylation
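The protein-coding cases above can be illustrated by translating the reference and variant codons with the standard genetic code and comparing the resulting amino acids. This is a minimal sketch, not part of F-SNP: the function name `classify_coding_snp` and the returned labels are assumptions chosen for this example.

```python
from itertools import product

# Standard genetic code, listed in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as synonymous, nonsynonymous, or nonsense.

    Codons are DNA triplets on the coding strand. The function does not
    check that the two codons differ by exactly one base, and a change
    from a stop codon to an amino acid is reported as nonsynonymous.
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three letters drawn from A, C, G, T")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # same amino acid
    if alt_aa == "*":
        return "nonsense"        # premature STOP
    return "nonsynonymous"       # different amino acid


if __name__ == "__main__":
    print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
    print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
    print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```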
So how do we measure all these effects of SNPs?
F-SNP: integrated approach
1. Classify SNP site using dbSNP
• Intron
• Coding Region
• Splice Site
• TF binding Site
• Post-Translational Site
2. Evaluate using the specialized algorithms/dbs
• Coding region (missense/nonsense mutations)
• Splice Site (intronic/exonic sites)
• TF binding Site (promoter/repressor/etc)
• Post-Translational Site (Phospho/Tyrosine/O-glycosylation)
3. “Majority Vote” across algorithms
F-SNP decision procedure for functional SNPs
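The "majority vote" step can be sketched as a simple tally over the labels returned by individual prediction tools. This is only an illustration of plain majority voting; the actual F-SNP decision procedure may weigh tools differently. The function name, the label strings, and the third tool name (`tool_c`) are hypothetical.

```python
from collections import Counter


def majority_vote(predictions: dict) -> str:
    """Return the label predicted by a strict majority of tools.

    `predictions` maps a tool name to its predicted label. If no label
    is chosen by more than half of the tools, return "undecided".
    """
    counts = Counter(predictions.values())
    if not counts:
        return "undecided"
    label, votes = counts.most_common(1)[0]
    return label if votes > len(predictions) / 2 else "undecided"


if __name__ == "__main__":
    # Hypothetical votes; "tool_c" is a placeholder name.
    votes = {"PolyPhen": "benign", "SNPs3D": "deleterious", "tool_c": "benign"}
    print(majority_vote(votes))
```

Requiring a strict majority means a tie between tools is reported as undecided rather than being resolved arbitrarily.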
F-SNP: User Interfaces & Data Download
Public Web Site
Federated Query =
entire database cannot be downloaded
Currently:
no SOAP (webservice) support
no RSS support
No source code available
However:
Paper gives explicit instructions on how to reproduce the
algorithm and construct the database using dbSNP, OMIM,
etc.
“Large N Study” using F-SNP
Functional Category          # of Assessed SNPs   # of Functional SNPs
Protein Coding               154,140              66,899
Splicing Regulation          73,051               8,075
Transcriptional Regulation   453,710              78,296
Post Translation             64,736               4,477
Total                        559,322              115,356
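The share of assessed SNPs predicted to be functional in each category follows directly from the counts on this slide. The short sketch below computes those fractions; the variable and function names are assumptions for this example. Note that the per-category counts sum to more than the Total row, which suggests that a SNP can be counted in more than one category; that reading is an inference from the numbers, not something the slide states.

```python
# (assessed SNPs, functional SNPs) per category, copied from the slide.
COUNTS = {
    "Protein Coding": (154_140, 66_899),
    "Splicing Regulation": (73_051, 8_075),
    "Transcriptional Regulation": (453_710, 78_296),
    "Post Translation": (64_736, 4_477),
    "Total": (559_322, 115_356),
}


def functional_fraction(assessed: int, functional: int) -> float:
    """Fraction of assessed SNPs that were predicted to be functional."""
    return functional / assessed


if __name__ == "__main__":
    for category, (assessed, functional) in COUNTS.items():
        share = functional_fraction(assessed, functional)
        print(f"{category:<28}{share:6.1%}")
```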
Evaluate Individual SNP (rs28897699)
SNP summary and Functional Predictions
SNP Primary Information (rs28897699)
Locus
Alleles
Ancestral Allele
Validation (if any)
Region
Link to References
F-SNP: Functional Predictions
F-SNP Prediction Detail:
PolyPhen = benign effect on protein coding
F-SNP Prediction Detail:
SNPs3D = deleterious to protein coding
NCBI Gene Information
Product: breast cancer 1, early onset
Other names: BRCA1, BRCAI, BRCC1, IRIS, PSCP, RNF53
NCBI Entrez Gene Summary: This gene encodes a nuclear phosphoprotein that plays a role in
maintaining genomic stability and acts as a tumor suppressor. (…) Mutations in this gene are
responsible for approximately 40% of inherited breast cancers and more than 80% of inherited
breast and ovarian cancers. Alternative splicing plays a role in modulating the
subcellular localization and physiological function of this gene. Many alternatively spliced
transcript variants have been described for this gene but only some have had their full-length
natures identified. (…)
F-SNP functional prediction
on Protein Coding
2 votes benign, 1 deleterious, 1 nonsynonymous
on Splicing Regulation
predicted functional impact (by majority vote)
Gene level view of BRCA1
Query by gene name = “BRCA1”
Returns list of SNPs in BRCA1
Returns list of Cancers associated with BRCA1
Gene level view of BRCA1
our SNP has functional impact
our SNP has neighboring functional SNPs
Disease Level View : Breast Cancer
Show all disease genes associated with breast cancer
Indicate whether SNPs are present in those genes (within 5 kb upstream/downstream)
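Deciding whether a SNP counts as "present" in a gene with flanking regions reduces to an interval check. The sketch below assumes that "5k" means 5,000 base pairs, that the SNP and gene are on the same chromosome, and that coordinates are given on a common reference; the function name and the example coordinates are hypothetical.

```python
def in_gene_window(snp_pos: int, gene_start: int, gene_end: int,
                   flank: int = 5_000) -> bool:
    """True if the SNP lies within the gene or within `flank` bp of either end."""
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    return gene_start - flank <= snp_pos <= gene_end + flank


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    gene_start, gene_end = 100_000, 180_000
    for position in (96_000, 94_000, 150_000, 185_000, 185_001):
        print(position, in_gene_window(position, gene_start, gene_end))
```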
Recap of Disease Level View
The Future Is Now:
SNPs, Personalized Medicine, and Translational Research
SNP profiling becoming part of routine care [2]
Increase # of clinically annotated SNP chips
Increase # of disease association studies using SNPs
Increase in NIH focus on “translational research” that bridges routine
care delivery with research efforts
Genome Wide Association Studies (GWAS) that actually get funded
[2] Kohane IS, Mandl KD, Taylor PL, Holm IA, Nigrin DJ, Kunkel LM.
“Medicine. Reestablishing the researcher-patient compact.”
Science. 2007 Nov 16;318(5853):1068.
F-SNP Summary
Incidental Findings and Disconnected Patient Cohorts
Central dogma of biology is DNA -> RNA -> Protein, yet we lack a cohort that spans all measurements
Limited sample sizes will inevitably lead to incidental findings
Disease Association Studies Using SNPs
Don’t wait for DNA sequencing to become widespread
SNPs are becoming an abundant resource and not going to disappear
How SNPs cause disease
Protein Coding
Splicing Regulation
Transcription Regulation
Post Translation
Computationally predict effect of SNPs within introns, exons, and regulatory
regions
Multitude of existing SNP analysis tools and resources
F-SNP provides a single web based resource to mine SNP disease associations
Query and analysis by SNP, Gene, Disease
The role of SNPs in Personalized Medicine and Translational Research