Power point presentation - the laboratory for genomics and

Bioinformatics
Edgar Scott
What is Bioinformatics?


An interdisciplinary field that combines concepts from biology,
probability and statistics, and computer science to create and test
new hypotheses based on sequence data.
Resources



Databases (e.g., protein sequence databases, nucleotide sequence databases,
tertiary structure databases)
Computational Tools (e.g., sequence similarity searching, phylogenetic
analysis, tertiary structure prediction tools)
Bioinformatics Applied



Pharmacogenomics
Genome Sequence and Annotation
Forensic Entomology
EMBL


European Molecular Biology Laboratory (EMBL)
Part of the International Nucleotide Sequence Database Collaboration,
whose members are:
EMBL
National Center for Biotechnology Information (NCBI)
DNA Data Bank of Japan (DDBJ)
Provides numerous databases and computational tools
Databases and Database Records


Database – a collection of data or records stored in a
computer system
Database record – a data file that contains a sequence
and annotations

Sequence


…TAGCCTCCTTATTCGAGCCGAGCTGGGCCAGCCAGGCAA
CCTTCTAGGTAACGACCACATCTACAACGTT…
Annotation




Mitochondrial genome
DNA
Homo sapiens
AM948965

Accession number – a unique identifier for the database record
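
A database record as described above (a sequence plus its annotations, looked up by accession number) can be sketched as a small data structure. This is only an illustration: the class name and field names are assumptions made here, not the format EMBL actually uses, and the sequence is just the fragment shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class DatabaseRecord:
    """Hypothetical in-memory model of one database record."""
    accession: str       # unique identifier for the record
    organism: str        # annotation: source organism
    molecule_type: str   # annotation: kind of molecule
    description: str     # annotation: free-text description
    sequence: str        # the sequence itself

    def length(self) -> int:
        return len(self.sequence)


# Values taken from the slide; only the displayed fragment of the
# sequence is used, since the slide elides the rest.
record = DatabaseRecord(
    accession="AM948965",
    organism="Homo sapiens",
    molecule_type="DNA",
    description="Mitochondrial genome",
    sequence="TAGCCTCCTTATTCGAGCCGAGCTGGGCCAGCCAGGCAA"
             "CCTTCTAGGTAACGACCACATCTACAACGTT",
)

# A "database" is then a collection of records indexed by accession number.
database = {record.accession: record}
print(database["AM948965"].organism, database["AM948965"].length())
```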
Example Database Record




Point your browser to http://www.ebi.ac.uk/
Type the accession number AM948965 into the text box
Click on Nucleotide Sequences
Click on AM948965
Sequence Alignments

Sequence alignment – a comparison between two or more sequences to identify a
series of characters or character patterns in the same order in each sequence.

Types of alignment:
Pairwise Global Alignment
Pairwise Local Alignment
Multiple Sequence Alignment

Basic Local Alignment Search Tool (BLAST) – sequence similarity search
tool.
Compares a query sequence to a sequence database using the local alignment
method.
Returns a list of sequences that are significantly similar to the query.
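
To make "pairwise local alignment" concrete, here is a minimal sketch of the classic Smith-Waterman dynamic-programming score. It is not how BLAST is implemented (BLAST uses a faster heuristic search); it only shows what a local alignment score is. The scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (Smith-Waterman).

    The scoring parameters are illustrative assumptions, not BLAST defaults.
    """
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared stretch "ACG" is found even though the flanking regions differ.
print(local_alignment_score("TTACGTTT", "GGACGGG"))
```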
Types of BLAST programs





blastp: compares a protein sequence to a protein database
blastn: compares a DNA sequence to a DNA database
blastx: compares a translated DNA sequence to a protein database
tblastn: compares a protein sequence to a translated DNA database
tblastx: compares a translated DNA sequence to a translated DNA database
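
The five program descriptions above amount to a small decision table: the program depends on the query type, the database type, and whether DNA is translated before comparison. A sketch of that table follows; the function name and its arguments are hypothetical and are not part of any BLAST software.

```python
def choose_blast_program(query_type, database_type, compare_as_protein=False):
    """Pick a BLAST program name from the query and database types.

    query_type and database_type are "dna" or "protein".
    compare_as_protein only matters for DNA vs. DNA searches, where it
    selects a translated comparison (tblastx) instead of blastn.
    """
    key = (query_type.lower(), database_type.lower())
    if key == ("dna", "dna"):
        return "tblastx" if compare_as_protein else "blastn"
    programs = {
        ("protein", "protein"): "blastp",
        ("dna", "protein"): "blastx",    # query is translated
        ("protein", "dna"): "tblastn",   # database is translated
    }
    if key not in programs:
        raise ValueError(f"unsupported combination: {key}")
    return programs[key]


print(choose_blast_program("dna", "protein"))
```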
Sequence Alignments

Alignment features





Identical matches
Conservative matches (conservative substitutions)
Mismatches
Gaps
Alignment scoring




Percent identity = (identical matches / alignment length) × 100
Percent similarity = ((identical matches + conservative matches) /
alignment length) × 100
Alignment score = a score that measures the similarity between the
two sequences being compared, taking into account all identical
matches, conservative matches, mismatches, and gaps.
Expectation value (E-value) = an estimate of the number of alignments
with a score at least this high that would be expected by random
chance in a database search.
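
The percent identity and percent similarity formulas can be worked through on a small example. The sketch below counts the four alignment features for two aligned protein rows. The grouping of amino acids treated as "conservative" is an assumption made for illustration; real tools derive this from a substitution matrix.

```python
# Illustrative (assumed) groups of amino acids with similar properties.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                       set("DE"), set("ST"), set("NQ")]


def is_conservative(x, y):
    return x != y and any(x in g and y in g for g in CONSERVATIVE_GROUPS)


def alignment_stats(row_a, row_b):
    """Count alignment features for two aligned rows ('-' marks a gap)."""
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("aligned rows must be non-empty and of equal length")
    counts = {"identical": 0, "conservative": 0, "mismatches": 0, "gaps": 0}
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            counts["gaps"] += 1
        elif x == y:
            counts["identical"] += 1
        elif is_conservative(x, y):
            counts["conservative"] += 1
        else:
            counts["mismatches"] += 1
    length = len(row_a)
    counts["percent_identity"] = 100 * counts["identical"] / length
    counts["percent_similarity"] = (
        100 * (counts["identical"] + counts["conservative"]) / length
    )
    return counts


# 10 columns: 7 identical, 1 conservative (K/R), 1 gap, 1 mismatch (E/A).
stats = alignment_stats("MKVL-DEFGH", "MRVLADAFGH")
print(stats["percent_identity"], stats["percent_similarity"])
```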
BLAST example



Point your browser to
http://www.ebi.ac.uk/Tools/blast2/nucleotide.html
From the Lab Home Page, copy and paste the
BLAST input sequence into the input text
box.
Press the Run BLAST button.
Molecular Phylogenetics


The analysis of molecular sequences to infer
evolutionary relationships between a group of
sequences or a group of organisms.
ClustalW2 – bioinformatics program that creates
multiple sequence alignments and phylogenetic trees


Multiple sequence alignment – an alignment with three or
more sequences.
Phylogenetic tree – a diagram of nodes and branching lines
depicting close and distant relationships between sequences
or organisms.
Molecular Phylogenetics
Example MSA



Point your browser to
http://www.ebi.ac.uk/Tools/clustalw2/
From the Lab home page, copy and paste the
ClustalW input sequences into the text box.
Press the Run button.
Example Tree





Open a second browser window and point it to
http://www.ebi.ac.uk/Tools/clustalw2/
From the first window, copy and paste the entire
multiple sequence alignment into the text box.
Change the “TREE TYPE” designation from “none”
to “nj”.
Change the “IGNORE GAPS” designation from “off”
to “on”.
Press the Run button.
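
Tree-building methods such as neighbor joining ("nj") start from pairwise distances between the aligned sequences. The sketch below shows one simple way such distances could be computed once gapped columns are ignored: drop every alignment column that contains a gap, then take the fraction of differing positions. This is an illustration under stated assumptions, not ClustalW2's actual code; the function names and the toy alignment are made up, and ClustalW2 may compute or correct distances differently.

```python
def drop_gap_columns(alignment):
    """Remove every column in which any sequence has a gap ('-')."""
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    kept = [col for col in zip(*rows) if "-" not in col]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(names)}


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(alignment):
    cleaned = drop_gap_columns(alignment)
    return {(p, q): p_distance(cleaned[p], cleaned[q])
            for p in cleaned for q in cleaned}


# Toy alignment (made up); columns 3 and 5 contain gaps and are ignored.
msa = {
    "seq1": "ACGT-ACGT",
    "seq2": "ACGTTACGA",
    "seq3": "AC-TTACTA",
}
distances = distance_matrix(msa)
print(distances[("seq1", "seq3")])
```

A tree-building method would then repeatedly join the closest pair of sequences or clusters based on these distances.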