Joslynn Lee – Data Science Educator

Transforming Science Through Data-driven Discovery
Tools and Services Workshop
DNA Subway
Joslynn Lee – Data Science Educator
Cold Spring Harbor Laboratory, DNA Learning Center
Welcome to DNA Subway
High-level genome analysis broadly available to students and educators
Genomics in Education
About DNA Subway
Using the intuitive metaphor of a subway map, DNA Subway organizes research-grade
bioinformatics analysis tools into logical workflows and presents them in an appealing
interface.
By "riding" different lines, users can:
• Predict and annotate genes in up to 150,000 base pairs of DNA sequence (Red Line)
• Prospect entire plant genomes for related genes and sequences (Yellow Line)
• Determine sequence relationships, view phylogenetic trees and analyze "DNA
barcodes" (Blue Line)
• Analyze RNA-Seq data for differential expression (Green Line)
Updating instruction big data biology
• 1866 – Mendel publishes work on inheritance
• 1869 – DNA discovered
• 1915 – Hunt Morgan describes linkage and recombination
• 1953 – Structure of DNA described
• 1956 – Human chromosome number determined
• 1968 – First gene mapped to autosome
• 1977 – Dideoxy sequencing
• 1983 – PCR
• 1986 – Human Genome Project proposed
Updating instruction big data biology
• 1993 – First microRNAs described
• 2003 – First ‘Gold Standard’ human genome sequence
• 2005 – First draft of human haplotype map (HapMap)
• 2007 – ENCODE project
Challenge – bringing students into the fold
Research
Education
Students can work with the same data at the same
time and with the same tools as research scientists.
Challenge – bringing students into the fold
What are your challenges in teaching bioinformatics in the classroom?
Take the Subway
DNA Subway
Classroom friendly bioinformatics
Faculty identified guiding requirements
that shaped the development of CyVerse educational platforms:
• Mix lecture and lab – have a wet bench “hook”
• Student-scientist partnerships – someone has to care about the data
• Co-investigation – projects should potentially lead to publications
• Scale – platforms should support projects that multiple classrooms can join
DNA Subway
Red Line: Genome annotation
Red Line
• Analyze up to 150 kb (150,000 base pairs) of DNA sequence
• De novo gene prediction
• Construct evidence-based gene models
• Visualize genome sequence in browser
DNA Subway
Yellow Line: Genome prospecting
Yellow Line
• Analyze DNA or protein sequence
• Search plant genomes using TARGeT
• Explore gene duplications, transposons, and non-coding sequences not detectable in conventional BLAST searches
DNA Subway
Blue Line: DNA barcoding and phylogenetics
Blue Line
• Analyze DNA or protein sequence
• Determine sequence relationships
• View phylogenetic trees
• Analyze "DNA barcodes"
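To make the idea of a "sequence relationship" concrete, here is a minimal Python sketch that computes the proportion of differing sites (p-distance) between aligned sequences. It is only an illustration, not the method DNA Subway uses; the function name `p_distance` and the three sample sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical aligned barcode fragments (invented for illustration).
barcodes = {
    "sample_1": "ATGGCTAGCT",
    "sample_2": "ATGGCTAGCA",
    "sample_3": "ATGACTTGCA",
}

# Pairwise distances; smaller values suggest more closely related sequences.
distances = {
    (x, y): p_distance(barcodes[x], barcodes[y])
    for x, y in combinations(barcodes, 2)
}

for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

A distance table like this is the kind of input that tree-building methods start from; real barcode analysis would first align the sequences with a dedicated alignment tool.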
DNA Subway
Green Line: Transcriptome analysis
Green Line
• Examine RNA-Seq data for differential expression
• Use high-performance computing to analyze complete datasets
• Generate lists of genes and fold-changes; add results to Red Line projects
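As a rough illustration of what a "list of genes and fold-changes" contains, the Python sketch below computes log2 fold-changes from expression values in two conditions. It does not reproduce the Green Line's own pipeline; the gene names, the expression values, and the pseudocount of 1 are all assumptions made for the example.

```python
import math


def log2_fold_change(control: float, treatment: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treatment to control; the pseudocount avoids division by zero."""
    return math.log2((treatment + pseudocount) / (control + pseudocount))


# Hypothetical normalized expression values: gene -> (control, treatment).
expression = {
    "gene_A": (100.0, 400.0),
    "gene_B": (50.0, 50.0),
    "gene_C": (80.0, 10.0),
}

fold_changes = {
    gene: log2_fold_change(control, treatment)
    for gene, (control, treatment) in expression.items()
}

# Rank genes by the magnitude of change, largest first.
ranked = sorted(fold_changes.items(), key=lambda item: abs(item[1]), reverse=True)

for gene, lfc in ranked:
    print(f"{gene}\t{lfc:+.2f}")
```

A positive value means higher expression in the treatment condition and a negative value means lower. A real differential expression analysis would also test statistical significance across replicates.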
Transforming Science Through Data-driven Discovery
Executive Team
Parker Antin
Nirav Merchant
Eric Lyons
Matt Vaughn
Doreen Ware
Dave Micklos
CyVerse is supported by the National Science Foundation under Grant Nos. DBI-0735191 and DBI-1265383.