Introduction to the Gramene
Genetic Diversity module
5/2010
Build #31
• The Gramene Genetic
Diversity database
module specializes in
storage of data sets that
study genetic variation
and genotype-phenotype association
in populations of plants.
• Our focus:
– Storage of large-scale SNP and indel genotype datasets and
accompanying phenotype measurements
– Facilitating discovery of associations between genes and traits
– Species of primary interest: maize, rice, Arabidopsis. We
also house some important wheat and sorghum data
(Pictured: maize, rice, Arabidopsis, sorghum, wheat)
To find the Genetic Diversity module:
- Go to: www.gramene.org, and click on one of the Diversity links
- Or go there directly: www.gramene.org/diversity
Click on the “Diversity” link
Click on the “Diversity” thumbnail
What’s on the Diversity main page
Browse data
Access latest Diversity
datasets
Large-scale SNP datasets
Browse through SSR, RFLP and
other smaller scale diversity or
QTL studies here
What’s on the Diversity main page
Search data
SNP Query
Tool for mining
SNP data
Quick search
for germplasm
or markers
What’s on the Diversity main page
Downloading data
Download subsets of
SNP data by using the
SNP Query tool (more
about this in a few
slides)
All download options,
including full Diversity
MySQL db dumps
Download SNP data in
HapMap, PLINK and Flapjack formats. Click
on the DOWNLOAD link and we’ll take a
look at the Download Data page on the
next slide…
Launch Diversity
datasets live in
TASSEL. Download
graphs and results
from analyses
SNP download page
www.gramene.org/diversity/download_data.html
• The large-scale SNP datasets are offered for download in HapMap, PLINK (.map and .ped files)
and Flapjack formats. The PLINK files can be loaded into the PLINK analysis program
(pngu.mgh.harvard.edu/~purcell/plink), and the Flapjack project files can be loaded into the
Flapjack visualization tool (bioinf.scri.ac.uk/flapjack).
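As an illustration of what the PLINK download contains, here is a minimal Python sketch that reads a .map/.ped pair. It assumes the standard four-column .map layout (chromosome, SNP identifier, genetic distance, base-pair position) and the standard .ped layout (six leading columns followed by two alleles per marker). The sample data and the names `read_map` and `read_ped` are made up for this example and are not part of Gramene or PLINK.

```python
import io


def read_map(handle):
    """Parse a PLINK .map file into a list of (chromosome, snp_id, position)."""
    markers = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumes the standard 4-column layout; genetic distance is ignored.
        chrom, snp_id, _genetic_distance, position = fields[:4]
        markers.append((chrom, snp_id, int(position)))
    return markers


def read_ped(handle, n_markers):
    """Parse a PLINK .ped file into {individual_id: [(allele1, allele2), ...]}."""
    genotypes = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        # Columns 1-6: family ID, individual ID, father, mother, sex, phenotype.
        alleles = fields[6:]
        if len(alleles) != 2 * n_markers:
            raise ValueError("allele count does not match the .map file")
        genotypes[fields[1]] = list(zip(alleles[0::2], alleles[1::2]))
    return genotypes


# Made-up data standing in for a downloaded .map/.ped pair.
MAP_TEXT = "1 snp_a 0 1000\n1 snp_b 0 2500\n"
PED_TEXT = "fam1 line1 0 0 0 -9 A A G G\nfam2 line2 0 0 0 -9 A C G T\n"

markers = read_map(io.StringIO(MAP_TEXT))
genotypes = read_ped(io.StringIO(PED_TEXT), len(markers))
print(markers)
print(genotypes)
```

For real files, replace the `io.StringIO` objects with `open("your_file.map")` and `open("your_file.ped")`.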
The Diversity module stores all
of its data in MySQL using the
GDPDM database model.
GDPDM links genotype,
phenotype, germplasm, field
and environment information in
one resource.
GDPDM v4.0 is optimized for
efficient handling of large-scale
SNP data by using BLOBs (binary
large objects)
For full GDPDM documentation,
go here:
http://www.maizegenetics.net/gdpdm/documentation_list.html
Genotype
Field/Plant
information
Germplasm
Phenotype
SNP QUERY
Web-based tool for mining SNP datasets in Gramene Genetic Diversity
Features of the ‘SNP Data’ area of Diversity:
SNP QUERY TOOL
Click SEARCH
link to go to
SNP Query
Select a species
Select a dataset
to load
Select a subset of
plants in the
experiment by
holding down the
SHIFT, ALT, or APPLE-CMD key while
clicking. The default
(selecting none) will
return data for all
accessions.
Select
chromosome
Select output format. ‘Text’ and
‘HTML’ will return the results in the
browser, ‘File download’ will return
results as a downloadable CSV text
file.
Optionally enter
start and stop
coordinates for the
range of interest
Finally, click
Submit
Here’s what the HTML results of SNP Query
look like:
Hyperlinked to
Gramene Genomes
Ensembl browser
You can save the results to
your hard disk as text
(some browsers allow you
to save in HTML), or you
can redo your query and
select the File Download option
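Once results have been saved with the File Download option, they can be narrowed further offline. The following is a minimal Python sketch of filtering a saved CSV by chromosome and coordinate range. The column names (`marker`, `chromosome`, `position`) and the sample rows are assumptions made for illustration; check the header of the file you actually download and adjust `chrom_col` and `pos_col` accordingly.

```python
import csv
import io


def snps_in_range(handle, chromosome, start, stop,
                  chrom_col="chromosome", pos_col="position"):
    """Return CSV rows on `chromosome` with start <= position <= stop.

    The column names are hypothetical defaults, not the tool's documented header.
    """
    hits = []
    for row in csv.DictReader(handle):
        if row[chrom_col] != str(chromosome):
            continue
        if start <= int(row[pos_col]) <= stop:
            hits.append(row)
    return hits


# Made-up rows standing in for a downloaded results file.
EXAMPLE_CSV = """marker,chromosome,position,line1,line2
snp_a,1,1000,A,A
snp_b,1,2500,G,T
snp_c,2,1200,C,C
"""

hits = snps_in_range(io.StringIO(EXAMPLE_CSV), chromosome=1, start=2000, stop=3000)
print([row["marker"] for row in hits])
```

With a real download, pass `open("results.csv", newline="")` instead of the `io.StringIO` object.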
TASSEL
Java program for evaluating trait associations, evolutionary patterns,
and linkage disequilibrium
TASSEL
Launch Diversity SNP datasets with Java web-start
Click on the
LAUNCH TASSEL
link to access the
web-start links
TASSEL
Launch Diversity SNP datasets with Java web-start
www.gramene.org/diversity/tassel_launch.html
When you click on one of the hyperlinks, each of which
represents one chromosome’s worth of SNP data, a dialog box
should appear asking you if you want to open Java Web Start to
open the file – click OK. TASSEL will take a minute or two to
launch…
Once TASSEL launches, here is what you should see:
Notice the dataset tag
‘chr1’ appears in the
left bar. Click on ‘chr1’
to see the data….
When you click ‘chr1’ in
the left Data column, the
data will be displayed
in the main window:
With TASSEL you can perform many
analyses. Please see TASSEL
documentation for much more
information:
maizegenetics.net/tassel
LD vs. distance
Phylogeny
LD
PCA of genotypes
Diversity
sliding window
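To give a feel for what a linkage disequilibrium (LD) value measures, here is a small Python sketch of the common r² statistic for two biallelic SNPs. It is not TASSEL's implementation. It assumes one allele call per line (as for inbred lines or phased haplotypes), treats `None` as missing data, and uses made-up genotypes; the function name `r_squared` is hypothetical.

```python
def r_squared(locus_a, locus_b):
    """r^2 between two biallelic loci, given one allele call per line.

    Returns None if there is no usable data or either locus is monomorphic.
    """
    # Keep only lines with a call at both loci.
    pairs = [(a, b) for a, b in zip(locus_a, locus_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return None
    # Use the first observed allele at each locus as the reference allele.
    ref_a, ref_b = pairs[0]
    p_a = sum(a == ref_a for a, _ in pairs) / n
    p_b = sum(b == ref_b for _, b in pairs) / n
    p_ab = sum(a == ref_a and b == ref_b for a, b in pairs) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return None  # at least one locus has a single allele
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denominator


# Made-up calls for four lines at three SNPs.
snp1 = ["A", "A", "G", "G"]
snp2 = ["C", "C", "T", "T"]  # always travels with snp1
snp3 = ["C", "T", "C", "T"]  # independent of snp1

print(r_squared(snp1, snp2))
print(r_squared(snp1, snp3))
```

In this toy data the first pair of SNPs is perfectly correlated and the second pair is uncorrelated, which is the contrast an LD plot visualizes across many marker pairs.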