SNP Resources: Finding SNPs,
Databases and Data Extraction
Debbie Nickerson
NIEHS SNPs Workshop
Genotype - Phenotype Studies
You have candidate gene/region/pathway of interest and
samples ready to study:
What SNPs are available?
How do I find the common SNPs?
What is the validation/quality of the SNPs?
Are these SNPs informative in my population/samples?
Where can I download information?
How do I pick the “best” SNPs? - Dana Crawford
Minimal SNP information for genotyping/characterization
• What is the SNP? Flanking sequence and alleles.
FASTA format
>snp_name
ACCGAGTAGCCAG
[A/G]
ACTGGGATAGAAC
• dbSNP reference SNP # (rs #)
• Where is the SNP mapped? Exon, promoter, UTR, etc.
• How was it discovered? Method
• What assurances do you have that it is real? Validated how?
• What population – African, European, etc.?
• What is the allele frequency of each SNP? Common (>5%), rare
• Are other SNPs associated - redundant?
• Is genotyping data for control populations available?
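The flanking-sequence record shown above (a `>` header, 5' flank, bracketed alleles, 3' flank) is easy to read programmatically. The following is a minimal sketch, not part of any tool named in this talk; the `SnpRecord` type and `parse_snp_record` function are hypothetical names, and the sketch assumes exactly one bracketed allele list per record.

```python
import re
from typing import NamedTuple


class SnpRecord(NamedTuple):
    name: str
    left_flank: str
    alleles: tuple
    right_flank: str


def parse_snp_record(text: str) -> SnpRecord:
    """Parse one '>name' record whose sequence contains a single [X/Y] site."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("record must start with a '>' header line")
    name = lines[0][1:].strip()
    # Join the sequence lines so the flanks and the allele block form one string.
    sequence = "".join(lines[1:]).upper()
    match = re.fullmatch(
        r"([ACGTN]*)\[([ACGT-]+(?:/[ACGT-]+)+)\]([ACGTN]*)", sequence
    )
    if match is None:
        raise ValueError("expected exactly one bracketed allele list, e.g. [A/G]")
    left, alleles, right = match.groups()
    return SnpRecord(name, left, tuple(alleles.split("/")), right)


record = parse_snp_record(""">snp_name
ACCGAGTAGCCAG
[A/G]
ACTGGGATAGAAC""")
print(record)
```

Keeping the two flanks separate from the alleles makes it straightforward to hand the record to an assay-design step later.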
Finding SNPs: Databases and Extraction
How do I find and download SNP data for analysis/genotyping?
1. NIEHS Environmental Genome Project (EGP)
Candidate gene website
2. NIEHS web applications and other tools
GeneSNPs, PolyDoms, PolyPhen, GVS
3. HapMap Genome Browser
4. Entrez Gene
- dbSNP
- Entrez SNP
Finding SNPs: NIEHS SNPs Candidate Genes
egp.gs.washington.edu
African American
African YRI
European CEU
Hispanic
Asian CHB JPT
SNP_pos <tab> Ind_ID <tab> allele1 <tab> allele2
Repeat for all individuals
Repeat for next SNP
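The four-column, tab-delimited layout above (one row per individual per SNP) can be turned into allele frequencies with a few lines of code. This is a rough sketch under stated assumptions: the sample IDs and positions are made up for illustration, any allele code other than A/C/G/T is treated as missing data, and SNPs are assumed to be biallelic, so the smallest frequency is the minor allele frequency. The 5% cutoff follows the "Common (>5%)" convention used earlier in the talk.

```python
from collections import Counter, defaultdict


def allele_frequencies(lines):
    """Return {snp_pos: {allele: frequency}} from tab-delimited genotype rows."""
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip blank or malformed rows
        snp_pos, _ind_id, allele1, allele2 = fields
        for allele in (allele1, allele2):
            # Assumption: anything other than A/C/G/T marks missing data.
            if allele in {"A", "C", "G", "T"}:
                counts[snp_pos][allele] += 1
    return {
        pos: {a: n / sum(c.values()) for a, n in c.items()}
        for pos, c in counts.items()
    }


def is_common(freqs, threshold=0.05):
    """True when a biallelic SNP's minor allele frequency exceeds the threshold."""
    return len(freqs) > 1 and min(freqs.values()) > threshold


# Hypothetical rows: SNP_pos <tab> Ind_ID <tab> allele1 <tab> allele2
rows = [
    "1001\tD001\tA\tA",
    "1001\tD002\tA\tG",
    "1001\tD003\tG\tG",
    "1001\tD004\tA\tA",
    "2050\tD001\tC\tC",
    "2050\tD002\tC\tC",
    "2050\tD003\tN\tN",
    "2050\tD004\tC\tC",
]
freqs = allele_frequencies(rows)
for pos, f in freqs.items():
    print(pos, f, "common" if is_common(f) else "not common")
```

Running the same function separately on each population's rows gives a quick answer to "are these SNPs informative in my population?".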
PolyPhen - Polymorphism Phenotyping
Structural protein characteristics and evolutionary comparison
SIFT = Sorting Intolerant From Tolerant
Evolutionary comparison of non-synonymous SNPs
GeneSNPs
http://www.genome.utah.edu/genesnps/
Graphic view of SNPs in context of gene elements
All NIEHS genes presented
- organized by pathway/function
SNPs from dbSNP
- organized by submitter handle
Link-outs to EntrezSNP pages and other resources
Multiple views of SNPs in contexts of gene elements, protein
domains, linkage disequilibrium
Tutorial available from OpenHelix (http://www.openhelix.com)
GeneSNPs navigation
GeneSNPs links to other resources
GeneSNPs: multiple views of SNPs in context of gene elements
Polydoms
A web-based application that maps synonymous and
non-synonymous SNPs onto known functional protein
domains
• SNPs are from dbSNP and GeneSNPs
• Domain structures from NCBI's Conserved Domain Database
• Functional predictions based on SIFT and PolyPhen
• 3-dimensional mapping of SNPs on protein structure using Chime viewer
http://polydoms.cchmc.org/polydoms/
PolyPhen: Polymorphism Phenotyping - prediction of functional effect of human nsSNPs
Physical and comparative analyses used to make
predictions
Uses SwissProt annotations to identify known
domains
Calculates a substitution probability from BLAST
alignments of homologous and orthologous
sequences
Ranks substitutions on scale of predicted functional
effects from “benign” to “probably damaging”
http://genetics.bwh.harvard.edu/pph/
GVS: Genome Variation Server
http://gvs.gs.washington.edu/GVS/
Provides rapid analysis of 4.5 million genotyped SNPs from
dbSNP and the HapMap
Mapped to human genome build 36 (hg18)
Displays genotype data in text and image formats
Displays tagSNPs or clusters of informative SNPs in text and
image formats
Displays linkage disequilibrium (LD) in text and image
formats
Online tutorial provided at OpenHelix.com
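To make the LD displays more concrete, here is a small sketch of the standard pairwise LD statistics (D, D' and r^2) for two biallelic SNPs. It is not GVS's implementation. It assumes phased haplotypes are already available, which raw genotype data do not provide directly; the function name `ld_stats` and the example haplotypes are hypothetical.

```python
def ld_stats(haplotypes):
    """Compute D, D' and r^2 for two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of 2-character strings, one allele per SNP.
    """
    n = len(haplotypes)
    a_allele = haplotypes[0][0]  # reference allele at SNP 1 (arbitrary choice)
    b_allele = haplotypes[0][1]  # reference allele at SNP 2 (arbitrary choice)
    p_a = sum(h[0] == a_allele for h in haplotypes) / n
    p_b = sum(h[1] == b_allele for h in haplotypes) / n
    p_ab = sum(h == a_allele + b_allele for h in haplotypes) / n

    # D is the excess of the observed haplotype frequency over independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    r2 = d * d / denom

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    return d, d_prime, r2


perfect = ld_stats(["AC"] * 4 + ["GT"] * 4)
partial = ld_stats(["AC", "AC", "AT", "GC", "GT", "GT", "AC", "GT"])
print(perfect)
print(partial)
```

Pairs of SNPs with high r^2 carry largely redundant information, which is the idea behind choosing a single tagSNP to represent a cluster.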
ADH4
• Table of genotypes
• Image of visual genotypes
Genotypes displayed in prettybase table and visual genotype graphic
Dense genotypes around a candidate gene can be integrated
with broader HapMap genotypes
High Density Genic Coverage (EGP): EGP SNP discovery (1/200 bp)
Low Density Genome Coverage (HapMap): HapMap SNPs (~1/1000 bp)
GVS: Genome Variation Server
A. Common samples - combined variations
B. Combined samples - common variations (HapMap, EGP)
C. Combined samples - combined variations
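One way to read the three merge options is as set operations: "common" as the intersection and "combined" as the union, applied independently to samples and to variations. The sketch below illustrates that reading only; it is an assumption about the semantics rather than a description of how GVS is implemented, and the `merge` function, sample IDs, and SNP names are all hypothetical.

```python
def merge(dataset_a, dataset_b, samples="common", variations="combined"):
    """Merge two {sample_id: {snp_id: genotype}} dictionaries.

    'common' keeps the intersection, 'combined' keeps the union.
    Genotypes that are absent after merging are filled with None.
    """
    for choice in (samples, variations):
        if choice not in ("common", "combined"):
            raise ValueError("choices must be 'common' or 'combined'")
    samples_a, samples_b = set(dataset_a), set(dataset_b)
    snps_a = {s for calls in dataset_a.values() for s in calls}
    snps_b = {s for calls in dataset_b.values() for s in calls}
    keep_samples = (
        samples_a & samples_b if samples == "common" else samples_a | samples_b
    )
    keep_snps = snps_a & snps_b if variations == "common" else snps_a | snps_b
    merged = {}
    for sample in sorted(keep_samples):
        # On conflicting calls, dataset_a takes precedence (arbitrary choice).
        calls = {**dataset_b.get(sample, {}), **dataset_a.get(sample, {})}
        merged[sample] = {snp: calls.get(snp) for snp in sorted(keep_snps)}
    return merged


egp = {"S1": {"snp1": "AA", "snp2": "CT"}, "E1": {"snp1": "AG", "snp2": "CC"}}
hapmap = {"S1": {"snp1": "AA", "snp3": "GG"}, "S2": {"snp1": "GG", "snp3": "GT"}}

option_a = merge(egp, hapmap, samples="common", variations="combined")
option_b = merge(egp, hapmap, samples="combined", variations="common")
option_c = merge(egp, hapmap, samples="combined", variations="combined")
print(option_a)
print(option_b)
print(option_c)
```

Option A gives the densest map on the fewest samples, option B gives the most samples on the fewest SNPs, and option C keeps everything at the cost of missing genotypes.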
www.hapmap.org
Finding SNPs: HapMap Browser
Finding SNPs: HapMap Genotypes
Finding SNPs: HapMap Browser
1. HapMap data sets are useful because individual genotype
data in deeply sampled populations can be used to
determine optimal genotyping strategies (tagSNPs) or
perform population genetic analyses (linkage
disequilibrium)
2. Data are specific to the HapMap project (not all of dbSNP);
HapMap data are also available in dbSNP
3. The browser provides visualization and direct access to
SNP data, individual genotypes, and LD analysis; output
can be saved in formats for Haploview
NCBI - Database Resource
NOS2A
www.ncbi.nlm.nih.gov
Finding SNPs using NCBI databases
http://www.ncbi.nlm.nih.gov/
Default
View cSNPs
Entrez SNP - Query Term Capabilities
Finding SNPs - Entrez SNP Summary
1. dbSNP is useful for investigating detailed information on a
small number of SNPs, and it gives a good picture of the gene
2. Entrez SNP is a direct, fast database for querying SNP data
3. Data from Entrez SNP can be retrieved in batches for many SNPs
4. Entrez SNP data can be “limited” to specific subsets of SNPs
and formatted in plain text for easy parsing and manipulation
5. More detailed queries can be formed using specific “field tags”
for retrieving SNP data
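Batch retrieval with field tags can also be scripted. The sketch below only builds a query URL for NCBI's E-utilities ESearch service against the SNP database; it does not send the request. The helper name `build_snp_query` is hypothetical, and the field tags used in the example (`GENE`, `ORGN`) are assumptions that should be checked against the current Entrez SNP field list, since the interface has changed since this talk.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_query(terms, retmax=100):
    """Build an ESearch URL for the SNP database.

    `terms` is a list of (value, field_tag) pairs that are joined with AND.
    """
    term = " AND ".join(f"{value}[{tag}]" for value, tag in terms)
    params = {"db": "snp", "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


# Assumed field tags; verify them before relying on the results.
url = build_snp_query([("NOS2A", "GENE"), ("human", "ORGN")])
print(url)
```

Building the query as a list of (value, tag) pairs keeps each restriction explicit, so a "limit" such as a function class can be added as one more pair.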
Summary
Finding SNPs: Databases and Extraction
Reviewing candidate genes using views and resources in
- NIEHS SNPs
- GeneSNPs
Prediction of functional variations
- Polydoms and PolyPhen
Integration of dense, gene-centric SNP maps with genomic
HapMap SNPs
- GVS
HapMap viewer
NCBI databases through Entrez portal
- Entrez Gene, dbSNP, Entrez SNP
- many ways to retrieve and format data