slides - QUBES hub
BIOINFORMATICS IN THE DYNAMIC GENOME COURSE
Introducing freshmen to computational biology
University of California, Riverside
Dynamic Genome Course
Give Freshmen a taste of research:
• First Half
• Core Biological Concepts
• Key molecular biology skills
• Basic Bioinformatics
• Second Half
• Open ended guided research project
Some history…DG at UGA
• Research Project: Identify transposable elements in newly
sequenced genomes by homology
• Steps:
• BLAST exemplar(s) against genome
• Align top scoring “hits”
• Build gene trees of newly identified elements
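The "top scoring hits" step above can be sketched in a few lines of Python. This is only an illustration, not the course's actual workflow: it assumes the BLAST results were saved in the standard 12-column tabular layout (BLAST+ `-outfmt 6`), and the hit rows, the function names `parse_hits` and `top_hits`, and the E-value cutoff are all invented for the example.

```python
# Illustrative sketch: choose the best hits from BLAST tabular output.
# Column order follows BLAST+ "-outfmt 6"; the example rows are invented.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_hits(text):
    """Turn tab-separated BLAST lines into a list of dictionaries."""
    hits = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_COLUMNS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        hits.append(row)
    return hits


def top_hits(hits, n=2, max_evalue=1e-5):
    """Keep hits at or below the E-value cutoff, best bit score first."""
    kept = [h for h in hits if h["evalue"] <= max_evalue]
    return sorted(kept, key=lambda h: h["bitscore"], reverse=True)[:n]


# Hypothetical hits for the Copia_RT query (values are made up).
EXAMPLE = "\n".join([
    "Copia_RT\tchr1\t78.5\t105\t22\t0\t1\t105\t5000\t5314\t2e-40\t160.0",
    "Copia_RT\tchr3\t91.4\t105\t9\t0\t1\t105\t880\t1194\t1e-55\t210.0",
    "Copia_RT\tchr7\t30.1\t60\t40\t2\t10\t69\t300\t479\t0.5\t35.0",
    "Copia_RT\tchr2\t65.0\t100\t35\t0\t3\t102\t7000\t7299\t3e-28\t120.0",
])

best = top_hits(parse_hits(EXAMPLE))
for hit in best:
    print(hit["sseqid"], hit["bitscore"], hit["evalue"])
```

The selected hits would then be extracted from the genome and passed to an alignment program before tree building; those steps rely on external tools and are not shown here.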
History cont’d…
>Copia_RT
WVYRVKHKQDGSIDRYKARLVAKGYTQVEGLDYLDTFSP
VAKTTTLRLLLALAASQGWFLHQLDVDNAFLHGTLDEEI
YMRLPPGVSSPRPNQVCLLQKSLYGLK
POL
Next, design primers and verify the location in the genome.
Simple, right…?
Students hated it
• File Formats
• BLAST
• FASTA
• Command Line
• Clunky web based tools (2007)
• It took several weeks to get to the gene tree
• Solution…
TARGeT: Tree Analysis of Related Genes and TEs.
• Graduate student wrote a script
• First chapter in thesis
• Yujun Han, James Burnette, and Susan Wessler (2009). TARGeT: a web-based pipeline for retrieving and characterizing gene and transposable element families from genomic sequences. Nucleic Acids Research
Hosted by CyVerse (a.k.a. iPlant Collaborative)
• target.iplantcollaborative.org
Quick TARGeT Demo
TARGeT Recap
• Removed tedium
• Results mostly within attention span
• Spend more time on biology in class…
Ping protein query against Rice and Soybean
Rice
Soybean
Ping Transposase query against Rice genome
Protein query
Nucleotide query
Ask students:
Why does the protein query find
more putative homologs?
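One way to make this question concrete for students is a toy comparison. The sketch below assumes two invented DNA sequences that use different codons for the same amino acids; the helper names and the sequences are illustrative, and the codon table holds only the standard-code codons needed here.

```python
# Toy illustration of codon degeneracy (standard genetic code).
# Only the codons used in the two invented sequences are listed.
CODONS = {
    "CTG": "L", "TTA": "L",  # leucine
    "AGC": "S", "TCT": "S",  # serine
    "CGT": "R", "AGA": "R",  # arginine
    "GGC": "G", "GGA": "G",  # glycine
}


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length strings."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


seq_a = "CTGAGCCGTGGC"  # hypothetical copy 1
seq_b = "TTATCTAGAGGA"  # hypothetical copy 2, synonymous codons throughout

dna_identity = percent_identity(seq_a, seq_b)
protein_identity = percent_identity(translate(seq_a), translate(seq_b))
print(f"DNA identity: {dna_identity:.1f}%")
print(f"Protein identity: {protein_identity:.1f}%")
```

In this example the two sequences share only 4 of 12 nucleotides yet encode the identical peptide, which is one reason a protein query can detect more distant relatives than a nucleotide query.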
Gene Families: Actin
Maize
Rice
Query: Rice Actin
Actin compared with Ping Tpase
An aside: This summer
• 17 rising sophomores
• Investigating genetic variation within gene families in
Citrus and related genera
Transpose to Riverside
• Quarter system
• 20 class meetings of three hours each
• 4-5 weeks for background
• 5-6 weeks for project
• As of fall 2016, 6 sections per quarter
• Neil A. Campbell Science Learning Laboratory
• Most diverse R1 University
• 60% first-generation college students
• Must provide the technology
• Be very careful with terminology
Module 1: Genetic Information Flow
• Students review central dogma outside of class
• Review in class with concept maps
• Experiment: Amplify the Actin gene from gDNA and cDNA
Module 1: Locating Introns: Step 1
1. BLAST gDNA sequence vs cDNA sequence using
BLAST2Sequences
Step 2: Find locations
Step 3: Draw gene structure
This analysis can be done on tablets!!
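The logic of Step 2 can also be written out as a short Python sketch. It assumes each aligned block from the gDNA versus cDNA comparison is recorded as a (start, end) pair in gDNA coordinates, 1-based and inclusive; the coordinates and the function name `introns_from_exon_hits` are invented for illustration.

```python
def introns_from_exon_hits(exon_hits):
    """Return intron intervals implied by gaps between aligned gDNA blocks.

    exon_hits: list of (gdna_start, gdna_end) pairs, 1-based inclusive,
    one per block where the gDNA aligned to the cDNA.
    """
    ordered = sorted(exon_hits)
    introns = []
    # Walk through neighbouring blocks; unaligned gDNA between them is intron.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical alignment blocks, listed out of order as BLAST might report them.
hits = [(1, 120), (451, 600), (211, 330)]
introns = introns_from_exon_hits(hits)
print(introns)
```

With these made-up coordinates the gene has three exons and two introns, at gDNA positions 121 to 210 and 331 to 450, which is the information students need to draw the gene structure in Step 3.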
Module 2: DNA Sequence Polymorphism
• Experiment: Amplify a locus from many strains of maize
• Introduce idea of reference genome (B73)
• Sometimes introduce genome browsers and PCR primer design
Sequence Analysis
• Multiple Sequence Alignment
Burnette and Wessler
Genetics 2013
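Once sequences are aligned, polymorphic positions can be listed with a few lines of Python. This is a minimal sketch under stated assumptions: the alignment is already computed, all rows have equal length, and gaps are written as "-". The sequences and the non-B73 strain names are invented for the example.

```python
def variable_sites(alignment):
    """List alignment columns where the sequences are not all identical.

    alignment: dict mapping sequence name to aligned sequence
    (equal lengths, '-' for gaps). Returns (1-based column, column letters).
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    sites = []
    for i in range(length):
        column = "".join(s[i] for s in seqs)
        if len(set(column)) > 1:  # more than one state in this column
            sites.append((i + 1, column))
    return sites


# Hypothetical alignment: B73 as the reference plus two invented strains.
alignment = {
    "B73": "ATGCCGTA-CGT",
    "strain_2": "ATGCTGTA-CGT",
    "strain_3": "ATGCCGTAACGT",
}
sites = variable_sites(alignment)
print(sites)
```

Here column 5 carries a single-nucleotide difference in strain_2 and column 9 carries a one-base insertion in strain_3 relative to B73.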
Electronic Laboratory Notebook
• The Dynamic Genome Laboratory Notebook is completely
electronic
• We developed our own:
• FERPA compliant
• FREE
• Allows combining bioinformatics and wet lab data
• Allows collecting “big data”
Robb et al. CourseSource
Demo eNotebook, data collection
Robb et al. CourseSource
Example Research Projects
• Verify predicted TE insertions in rice and maize
• Phenotypes of transcription factor knock-outs in planaria
and C. elegans. Verify knock out with PCR
• Characterize Ruby alleles in Citrus
• Polyembryony in Citrus and Poncirus (if time, show data
collection 3326)
Challenges
• Students are not as computer literate as we are led to believe.
• Simple curiosity – “Did you Google it?” “Did you PubMed it?”
• Resistance to anything non-Facebook
• Good interfaces: DNA Subway
• Lots of support
• Diversity of examples
• TERMINOLOGY
• Three vocabularies
• Biological Words (transcription, translation) (gene, locus, ROI)
• Laboratory Words (PCR, gel, mini-prep)
• Computer Words (parameter, input, format, file type)
Acknowledgments
• Dr. Sofia Robb
• Dr. Matthew Collin
• Dr. Yujun Han
• Alex Cortez
• Rochelle Campbell