Bioinformatics for the classroom


Bioinformatics for your
classroom
NCBI
BLAST
Seth Bordenstein
Department of Biological Sciences
Vanderbilt University
Advantages
1. No programming skills needed
2. Requires only familiarity with a personal computer and an internet browser
3. Customizable and free
Bioinformatics is like using ‘Google’ for DNA sequences
National Center for Biotechnology
Information (NCBI)
http://www.ncbi.nlm.nih.gov
[Figure: Growth of NCBI - GenBank, 1983 to 2006. Axes: sequence records (millions) and total base pairs (billions), by year.]
Release 148: 45.2 million records, 49.4 billion nucleotides
Average doubling time ≈ 14 months
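As a rough illustration of what a doubling time of about 14 months implies, the sketch below projects database size forward from the Release 148 figure. It assumes steady exponential growth, which real GenBank growth need not follow, and the function name `projected_size` is hypothetical.

```python
def projected_size(current: float, months: float, doubling_months: float = 14.0) -> float:
    """Size after `months`, assuming the size doubles every `doubling_months`."""
    return current * 2 ** (months / doubling_months)


# Starting point taken from the slide: Release 148 held 49.4 billion nucleotides.
release_148_bp = 49.4  # billions of base pairs

for months in (14, 28, 42):
    size = projected_size(release_148_bp, months)
    print(f"after {months} months: about {size:.1f} billion bp")
```

Under this assumption the database grows eightfold in three and a half years, which is why sequence searching depends on tools rather than browsing.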
Bioinformatics is NOT just information technology.
It can teach the central dogma of molecular biology:
DNA (DNA sequences, genomes) → RNA (cDNA, ESTs) → protein (protein sequence databases) → phenotype
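The DNA to RNA to protein flow can be sketched in a few lines of Python. This is a minimal illustration, not part of the lab protocol: it uses the standard genetic code, treats the input as coding-strand DNA, and takes as its example the first 18 bases of the apple AFS1 coding sequence from GenBank record AY182241, which appears later in these slides. The function names are hypothetical.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA -> mRNA: the same sequence with U in place of T."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "atggaattcagagttcac"  # bases 54 to 71 of AY182241
mrna = transcribe(dna)
print(mrna)             # AUGGAAUUCAGAGUUCAC
print(translate(mrna))  # MEFRVH
```

The result matches the first six residues of the `/translation` qualifier in the record.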
Target database: Adjustable using the pull-down menu
A Traditional GenBank Record: The Flatfile Format
Parts of the record: Header, Feature Table, Sequence

LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
DEFINITION  Malus x domestica (E,E)-alpha-farnesene synthase (AFS1) mRNA,
            complete cds.
ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057
KEYWORDS    .
SOURCE      Malus x domestica (cultivated apple)
  ORGANISM  Malus x domestica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus.
REFERENCE   1  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Cloning and functional expression of an (E,E)-alpha-farnesene
            synthase cDNA from peel tissue of apple fruit
  JOURNAL   Planta 219, 84-94 (2004)
REFERENCE   2  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-NOV-2002) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
REFERENCE   3  (bases 1 to 1931)
  AUTHORS   Pechous,S.W. and Whitaker,B.D.
  TITLE     Direct Submission
  JOURNAL   Submitted (25-JUN-2003) PSI-Produce Quality and Safety Lab,
            USDA-ARS, 10300 Baltimore Ave. Bldg. 002, Rm. 205, Beltsville, MD
            20705, USA
  REMARK    Sequence update by submitter
COMMENT     On Jun 26, 2003 this sequence version replaced gi:27804758.
FEATURES             Location/Qualifiers
     source          1..1931
                     /organism="Malus x domestica"
                     /mol_type="mRNA"
                     /cultivar="'Law Rome'"
                     /db_xref="taxon:3750"
                     /tissue_type="peel"
     gene            1..1931
                     /gene="AFS1"
     CDS             54..1784
                     /gene="AFS1"
                     /note="terpene synthase"
                     /codon_start=1
                     /product="(E,E)-alpha-farnesene synthase"
                     /protein_id="AAO22848.2"
                     /db_xref="GI:32265058"
                     /translation="MEFRVHLQADNEQKIFQNQMKPEPEASYLINQRRSANYKPNIWK
                     NDFLDQSLISKYDGDEYRKLSEKLIEEVKIYISAETMDLVAKLELIDSVRKLGLANLF
                     EKEIKEALDSIAAIESDNLGTRDDLYGTALHFKILRQHGYKVSQDIFGRFMDEKGTLE
                     NHHFAHLKGMLELFEASNLGFEGEDILDEAKASLTLALRDSGHICYPDSNLSRDVVHS
                     LELPSHRRVQWFDVKWQINAYEKDICRVNATLLELAKLNFNVVQAQLQKNLREASRWW
                     ANLGIADNLKFARDRLVECFACAVGVAFEPEHSSFRICLTKVINLVLIIDDVYDIYGS
                     EEELKHFTNAVDRWDSRETEQLPECMKMCFQVLYNTTCEIAREIEEENGWNQVLPQLT
                     KVWADFCKALLVEAEWYNKSHIPTLEEYLRNGCISSSVSVLLVHSFFSITHEGTKEMA
                     DFLHKNEDLLYNISLIVRLNNDLGTSAAEQERGDSPSSIVCYMREVNASEETARKNIK
                     GMIDNAWKKVNGKCFTTNQVPFLSSFMNNATNMARVAHSLYKDGDGFGDQEKGPRTHI
                     LSLLFQPLVN"
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
       61 tcagagttca cttgcaagct gataatgagc agaaaatttt tcaaaaccag atgaaacccg
      121 aacctgaagc ctcttacttg attaatcaaa gacggtctgc aaattacaag ccaaatattt
      181 ggaagaacga tttcctagat caatctctta tcagcaaata cgatggagat gagtatcgga
      241 agctgtctga gaagttaata gaagaagtta agatttatat atctgctgaa acaatggatt
      ...
//
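The three-part layout (header, feature table, sequence) can be recovered from the text of a flatfile by watching for the `FEATURES` and `ORIGIN` keywords and the `//` end marker. The sketch below is a simplified illustration under that assumption; `split_flatfile` is a hypothetical helper, and the excerpt is an abbreviated copy of record AY182241, not the full record.

```python
def split_flatfile(record: str) -> dict[str, list[str]]:
    """Split a GenBank flatfile into header, feature table, and sequence lines."""
    parts: dict[str, list[str]] = {"header": [], "features": [], "sequence": []}
    current = "header"
    for line in record.splitlines():
        if line.startswith("FEATURES"):
            current = "features"
        elif line.startswith("ORIGIN"):
            current = "sequence"
        elif line.startswith("//"):
            break  # end-of-record marker
        parts[current].append(line)
    return parts


# Abbreviated excerpt of record AY182241, for illustration only.
EXCERPT = """\
LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
ACCESSION   AY182241
FEATURES             Location/Qualifiers
     CDS             54..1784
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
//
"""

parts = split_flatfile(EXCERPT)
for name, lines in parts.items():
    print(name, len(lines))
```

For real work a maintained parser is a safer choice than hand-written splitting; this only shows where the section boundaries fall.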
The Header
Fields: LOCUS, DEFINITION, ACCESSION, VERSION, KEYWORDS, SOURCE, ORGANISM, REFERENCE (AUTHORS, TITLE, JOURNAL, REMARK), COMMENT
Header: Locus Line
LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004
• Locus name: AY182241
• Length: 1931 bp
• Molecule type: mRNA
• Division: PLN
• Modification date: 04-MAY-2004
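The locus line fields named on this slide (locus name, length, molecule type, division, modification date) can be pulled apart by splitting on whitespace. This is a sketch that assumes the eight-token layout of this particular record; other records may be laid out differently, and `parse_locus_line` and `LocusLine` are hypothetical names.

```python
from typing import NamedTuple


class LocusLine(NamedTuple):
    locus_name: str
    length_bp: int
    molecule_type: str
    topology: str
    division: str
    modification_date: str


def parse_locus_line(line: str) -> LocusLine:
    """Parse a LOCUS line by splitting on whitespace (layout assumed, see above)."""
    tokens = line.split()
    if len(tokens) != 8 or tokens[0] != "LOCUS" or tokens[3] != "bp":
        raise ValueError(f"unexpected LOCUS line: {line!r}")
    _, name, length, _, molecule, topology, division, date = tokens
    return LocusLine(name, int(length), molecule, topology, division, date)


locus = parse_locus_line(
    "LOCUS       AY182241                1931 bp    mRNA    linear   PLN 04-MAY-2004"
)
print(locus.locus_name, locus.length_bp, locus.molecule_type, locus.division)
```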
Header: Database Identifiers
ACCESSION   AY182241
VERSION     AY182241.2  GI:32265057
Accession (AY182241)
• Stable
• Reportable
• Universal
Version: AY182241.2
GI number: GI:32265057
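The VERSION value pairs the stable accession with a version suffix, and on this record it also carries a GI number. The sketch below separates the three pieces; it assumes the `ACCESSION.N GI:NNN` layout shown on this slide, and `parse_version_value` is a hypothetical name.

```python
def parse_version_value(value: str) -> dict:
    """Split 'AY182241.2 GI:32265057' into accession, version number, and GI."""
    accession_version, gi_field = value.split()
    accession, version = accession_version.split(".")
    if not gi_field.startswith("GI:"):
        raise ValueError(f"expected a GI field, got {gi_field!r}")
    return {
        "accession": accession,
        "version": int(version),
        "gi": int(gi_field[len("GI:"):]),
    }


ids = parse_version_value("AY182241.2  GI:32265057")
print(ids)
```

The accession stays fixed while the version suffix records sequence updates, which fits the COMMENT line noting that this version replaced an earlier gi.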
Header: Organism
SOURCE      Malus x domestica (cultivated apple)
  ORGANISM  Malus x domestica
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus.
NCBI-controlled taxonomy
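Because the ORGANISM lineage is a semicolon-separated list ending in a period, it is easy to turn into a list of taxa for classroom questions such as "which family does the apple belong to?". A minimal sketch, with `parse_lineage` as a hypothetical helper:

```python
LINEAGE = (
    "Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; "
    "Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; "
    "rosids; eurosids I; Rosales; Rosaceae; Maloideae; Malus."
)


def parse_lineage(text: str) -> list[str]:
    """Turn a semicolon-separated ORGANISM lineage into a list of taxon names."""
    cleaned = text.strip().rstrip(".")
    return [taxon.strip() for taxon in cleaned.split(";") if taxon.strip()]


taxa = parse_lineage(LINEAGE)
print(len(taxa), "ranks, from", taxa[0], "down to", taxa[-1])
print("Rosaceae" in taxa)
```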
The Feature Table
FEATURES             Location/Qualifiers
     source          1..1931
                     /organism="Malus x domestica"
                     /mol_type="mRNA"
                     /cultivar="'Law Rome'"
                     /db_xref="taxon:3750"
                     /tissue_type="peel"
     gene            1..1931
                     /gene="AFS1"
     CDS             54..1784
                     /gene="AFS1"
                     /note="terpene synthase"
                     /codon_start=1
                     /product="(E,E)-alpha-farnesene synthase"
                     /protein_id="AAO22848.2"
                     /db_xref="GI:32265058"
                     /translation="MEFRVHLQADNEQKIFQNQMKPEPEASYLINQRRSANYKPNIWK
                     NDFLDQSLISKYDGDEYRKLSEKLIEEVKIYISAETMDLVAKLELIDSVRKLGLANLF
                     EKEIKEALDSIAAIESDNLGTRDDLYGTALHFKILRQHGYKVSQDIFGRFMDEKGTLE
                     NHHFAHLKGMLELFEASNLGFEGEDILDEAKASLTLALRDSGHICYPDSNLSRDVVHS
                     LELPSHRRVQWFDVKWQINAYEKDICRVNATLLELAKLNFNVVQAQLQKNLREASRWW
                     ANLGIADNLKFARDRLVECFACAVGVAFEPEHSSFRICLTKVINLVLIIDDVYDIYGS
                     EEELKHFTNAVDRWDSRETEQLPECMKMCFQVLYNTTCEIAREIEEENGWNQVLPQLT
                     KVWADFCKALLVEAEWYNKSHIPTLEEYLRNGCISSSVSVLLVHSFFSITHEGTKEMA
                     DFLHKNEDLLYNISLIVRLNNDLGTSAAEQERGDSPSSIVCYMREVNASEETARKNIK
                     GMIDNAWKKVNGKCFTTNQVPFLSSFMNNATNMARVAHSLYKDGDGFGDQEKGPRTHI
                     LSLLFQPLVN"
Coding sequence (CDS): start (atg) at position 54, stop (tag) ending at position 1784
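The CDS location `54..1784` is enough to work out how long the encoded protein should be. The sketch below does that arithmetic; it assumes a simple `start..end` location (no `join(...)` or `complement(...)`) and a complete CDS whose last codon is the stop codon. The helper names are hypothetical.

```python
def parse_location(location: str) -> tuple[int, int]:
    """Parse a simple 'start..end' feature location (1-based, inclusive)."""
    start, end = location.split("..")
    return int(start), int(end)


def cds_summary(location: str) -> dict:
    """Nucleotide, codon, and amino-acid counts for a complete CDS."""
    start, end = parse_location(location)
    nucleotides = end - start + 1  # both endpoints are included
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = nucleotides // 3
    return {
        "nucleotides": nucleotides,
        "codons": codons,
        "amino_acids": codons - 1,  # the stop codon adds no residue
    }


print(cds_summary("54..1784"))
# {'nucleotides': 1731, 'codons': 577, 'amino_acids': 576}
```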
The Sequence:
What do you do with it?
ORIGIN
        1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat
       61 tcagagttca cttgcaagct gataatgagc agaaaatttt tcaaaaccag atgaaacccg
      121 aacctgaagc ctcttacttg attaatcaaa gacggtctgc aaattacaag ccaaatattt
      181 ggaagaacga tttcctagat caatctctta tcagcaaata cgatggagat gagtatcgga
      ...
     1741 ggacccacat cctgtcttta ctattccaac ctcttgtaaa ctagtactca tatagtttga
     1801 aataaatagc agcaaaagtt tgcggttcag ttcgtcatgg ataaattaat ctttacagtt
     1861 tgtaacgttg ttgccaaaga ttatgaataa aaagttgtag tttgtcgttt aaaaaaaaaa
     1921 aaaaaaaaaa a
//
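One thing to do with the sequence is to check it against the feature table. The sketch below reads two ORIGIN rows from this record, indexes the bases by their 1-based positions, and looks up the first and last codons of the CDS annotated at 54..1784. The helper names are hypothetical, and only the two rows needed are included.

```python
def parse_origin_rows(rows: list[str]) -> dict[int, str]:
    """Map 1-based sequence positions to bases, given ORIGIN rows."""
    bases: dict[int, str] = {}
    for row in rows:
        first, *blocks = row.split()
        position = int(first)  # position of the first base in this row
        for base in "".join(blocks):
            bases[position] = base
            position += 1
    return bases


def subsequence(bases: dict[int, str], start: int, end: int) -> str:
    """Bases from start to end, inclusive, using 1-based coordinates."""
    return "".join(bases[p] for p in range(start, end + 1))


ROWS = [
    "1 ttcttgtatc ccaaacatct cgagcttctt gtacaccaaa ttaggtattc actatggaat",
    "1741 ggacccacat cctgtcttta ctattccaac ctcttgtaaa ctagtactca tatagtttga",
]

bases = parse_origin_rows(ROWS)
print(subsequence(bases, 54, 56))      # atg, the start codon
print(subsequence(bases, 1782, 1784))  # tag, the stop codon
```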
BLAST:
Query a database for sequences similar to an
input sequence.
GATGCCATAGAGCTGTAGTCGTACCCT <—
—>
CTAGAGAGC-GTAGTCAGAGTGTCTTTGAGTTCC




• Compare new genes to old ones
• Compare genes from different species or hosts
• Investigate the transcriptome (cDNAs)
• Identify possible functions based on similarities to known sequences
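To make "query a database for similar sequences" concrete, the sketch below scores a query against each entry of a toy database and ranks the entries. It is not BLAST: it uses an exhaustive Smith-Waterman local alignment as a stand-in, whereas BLAST relies on a much faster word-based heuristic and reports statistical significance. The scoring values, the function names, and the three-entry database are assumptions made for illustration.

```python
def local_alignment_score(query: str, subject: str,
                          match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    previous = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = previous[j - 1] + (match if q == s else mismatch)
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


def rank_database(query: str, database: dict[str, str]) -> list[tuple[str, int]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scores = {name: local_alignment_score(query, seq) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


QUERY = "GATGCCATAGAGCTGTAGTCGTACCCT"
DATABASE = {  # hypothetical toy database
    "identical": "GATGCCATAGAGCTGTAGTCGTACCCT",
    # second sequence from the slide, with its gap character removed
    "slide_example": "CTAGAGAGCGTAGTCAGAGTGTCTTTGAGTTCC",
    "unrelated": "TTTTTTTTTTTTTTTT",
}

for name, score in rank_database(QUERY, DATABASE):
    print(name, score)
```

The identical entry comes out on top with a score equal to the query length, and the unrelated entry comes last.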
What are the broad goals of this lab?
• To provide an introduction to bioinformatics with a focus on NCBI
• To introduce you to searching for articles, sequences, and scientists (perhaps yourself ;))
• To use the most powerful and reliable method to determine evolutionary relationships between genes
• To combine your Wolbachia research with computational biology
What are the specific goals of this lab?
• To look for brand-new Wolbachia (W) strains
• To make a phylogenetic tree of W
• To ultimately compare the W tree to an insect phylogeny to infer lateral vs. vertical transmission of your W strains
• To contribute to a national sequence database on the genetic diversity of the W 16S rRNA gene
Outcomes: A New Wolbachia Species?
Insect Phylogeny
Top 5 Wolbachia BLAST matches
GATGCCATAGAGCTGTAGTCGTACCCT <- 100%
GATGCCATAGAGCTGTAGTCGTACCCT <- 100%
GATGCCATAGAGCTGTAGTCGTACCCT <- 100%
GATGCCATAGAGCTGTAGTCGTACCCT <- 100%
GATGCCATAGAGCTGTAGTCGTACCCT <- 100%
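The 100% figures are percent identity: the share of aligned positions where the two sequences carry the same base. The sketch below computes it for two sequences that are already aligned to equal length, counting any column with a gap as a mismatch. That gap convention and the function name are assumptions; BLAST computes identity over the aligned region it reports.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical bases (gaps count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


query = "GATGCCATAGAGCTGTAGTCGTACCCT"
print(percent_identity(query, query))      # 100.0
print(percent_identity("ACGT", "ACGA"))    # 75.0
```

A new strain would show up here as a top match below 100%, which is the cue to look more closely at where the differences fall.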
Let’s Begin Our Bioinformatic Exercise
Lab 5