Homology assessment and molecular sequence alignment
Chris Stewart and Ka Yi Ling
Genetics 677
Homology
Classical Phylogenetics
Molecular Phylogenetics
Big picture
Evolution
Divergent
Convergent
Homology
Analogy
Orthologs
Paralogs
Systematics
Homology
1. Equal in position and details in structure
2. Equal in developmental origin (i.e. cellular/tissue structure)
3. Logical and continual series of character state transformations
Figure from http://images-eu.amazon.com/images/P/0895262002.01.LZZZZZZZ.jpg
Homology
[Diagram: speciation and duplication events relating A, B and C]
Character
Trait from a group of organisms that has two or more independent states that can be evaluated
Figure from http://www.choose-life.org/Map_states_color.jpg
Parsimony
Working principle that prefers the least complex explanation for an observation
Figure from http://www.cartoonchurch.com/cartoons/large/simple-living-cartoon.gif
Classical phylogenetics
Method of parsimony analysis used to develop cladograms explaining evolutionary relationships.
Fig 1. Hypothetical cladogram
Constructing cladograms
[Figure: cladogram with tips labeled A to G]
Matrix
            Segmented  Jawed  Hair  Placenta  Multi-cell  Limbs
Cat         1          1      1     1         1           1
Kangaroo    1          1      1     0         1           1
Lizard      1          1      0     0         1           1
Salmon      1          1      0     0         1           0
Earthworm   1          0      0     0         1           0
Sponge      0          0      0     0         1           0
Amoeba      0          0      0     0         0           0
Possible cladograms
... which one do you pick?
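One way to choose among candidate cladograms is to count the character-state changes each tree requires on the character matrix above and prefer the tree with the fewest. The sketch below does this with Fitch's small-parsimony count on rooted binary trees written as nested tuples. The two candidate trees (`ladder`, `shuffled`) and the function names are illustrative assumptions, not taken from the slides.

```python
# Character matrix from the slide. Column order:
# segmented, jawed, hair, placenta, multi-cell, limbs
MATRIX = {
    "Cat":       (1, 1, 1, 1, 1, 1),
    "Kangaroo":  (1, 1, 1, 0, 1, 1),
    "Lizard":    (1, 1, 0, 0, 1, 1),
    "Salmon":    (1, 1, 0, 0, 1, 0),
    "Earthworm": (1, 0, 0, 0, 1, 0),
    "Sponge":    (0, 0, 0, 0, 1, 0),
    "Amoeba":    (0, 0, 0, 0, 0, 0),
}


def fitch(tree, column):
    """Return (possible_states, changes) for one character on a rooted binary tree.

    A leaf is a taxon name (str); an internal node is a (left, right) tuple.
    """
    if isinstance(tree, str):
        return {MATRIX[tree][column]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, column)
    right_states, right_changes = fitch(right, column)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: one change must occur on a branch below this node.
    return left_states | right_states, left_changes + right_changes + 1


def tree_length(tree):
    """Total number of state changes the tree requires over all characters."""
    n_chars = len(next(iter(MATRIX.values())))
    return sum(fitch(tree, c)[1] for c in range(n_chars))


# Two hypothetical candidate cladograms (assumptions for illustration).
ladder = ("Amoeba", ("Sponge", ("Earthworm", ("Salmon",
          ("Lizard", ("Kangaroo", "Cat"))))))
shuffled = ("Amoeba", ("Cat", ("Earthworm", ("Salmon",
            ("Lizard", ("Kangaroo", "Sponge"))))))

for name, tree in (("ladder", ladder), ("shuffled", shuffled)):
    print(name, tree_length(tree))
```

The ladder-shaped tree needs one change per character (6 in total), so every character is explained without homoplasy. The shuffled tree needs more changes, and parsimony would reject it.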
Homoplasy & subjective characters
A faulty assignment of primary homology
Figure from: http://www.blackwellpublishing.com/ridley/images/analogies.jpg
Things to consider
• Auxiliary Principle
• Congruence Test
More things to consider
• Weighting
– Needs to respond to homoplasy
• Independent characters
Figure from http://ksuoncampus.com/2008/01/29/evolution-of-mario/
Molecular Phylogenetics
• Goal:
– to infer process from pattern
• Why
– Not just the observables
– Alternative method to derive evolutionary relationships
[Flow chart: protein or gene of interest feeding into sequence alignment programs]
Figure modified from http://bioinfo.ochoa.fib.es/docus/courses/Ali2005Filogenias/seq_analysis/images/SeqAnalFloChart.gif
Molecular characters
• What can be used?
– Nucleotide sequences
– Protein sequences
– DNA
– RNA
– Protein
• NO single universally accepted recipe
Figure from http://www.ittc.ku.edu/bioinfo_seminar/images/wheel.gif
List of alignment software
Sequence Alignment
• Types
– Pairwise alignment
– Multiple sequence alignment
Figure from http://en.wikipedia.org/wiki/Sequence_alignment
Human Molecular Genetics, 2006, Vol. 15, Review Issue 1, R54
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Database Search: BLAST
• Basic Local Alignment Search Tool
• Input: protein and nucleotide sequences
• Default scoring matrix: BLOSUM62
• Other scoring matrices: PAM family, BLOSUM family
• Sites that use BLAST: NCBI, EBI, GenomeNet, PIR, DDBJ
Figure: NCBI alignment result site.
How does BLAST score an alignment?
Default matrix in BLAST 2.0: BLOSUM62
BLOSUM = BLOcks SUbstitution Matrix
Based on local alignments
How does BLAST score an alignment?
BLOSUM62: contributions from proteins more than 62% identical are weighted to sum to one.
Scores: integer values (rounded log-odds)
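Each BLOSUM entry is a log-odds score: it compares how often two residues are observed aligned with how often they would pair by chance, then scales and rounds the result to an integer (BLOSUM62 uses half-bit units). The sketch below shows only that arithmetic. The function name and the frequencies are made-up values for illustration, not real BLOSUM data.

```python
import math


def log_odds_score(q_ij, p_i, p_j, scale=2.0):
    """Rounded log-odds score in units of 1/scale bits.

    q_ij : observed frequency of residues i and j aligned together
    p_i, p_j : background frequencies of residues i and j
    scale=2.0 gives half-bit units, as used for BLOSUM62.
    """
    return round(scale * math.log2(q_ij / (p_i * p_j)))


# Hypothetical frequencies (assumptions, not taken from a real matrix).
# Pair seen 4x more often than chance: positive score.
print(log_odds_score(0.04, 0.1, 0.1))    # 4
# Pair seen 4x less often than chance: negative score.
print(log_odds_score(0.0025, 0.1, 0.1))  # -4
```

An alignment score is then the sum of these per-position values, minus any gap penalties.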
BLAST
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Homologene
Aligning Genes
Homologene scoring
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
Pairwise alignment
• Algorithm: Needleman-Wunsch dynamic programming
• Global alignment
• DNA, protein
• Find positional primary homology
• Sites that use N-W: EBI server
Figure from http://ww2.cs.fsu.edu/~hui/research/scanalyze_tutorial/pics/registered_group.jpg
Needleman-Wunsch algorithm
Figure from Journal of Medical Physics 39 (2006) pg 29
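A minimal sketch of Needleman-Wunsch global alignment follows, to make the dynamic-programming idea concrete. The scoring scheme (match +1, mismatch -1, linear gap -1) is an assumption chosen for readability. Production tools typically use a substitution matrix and affine gap penalties instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "ACT"))  # (2, 'ACGT', 'AC-T')
```

Each aligned column is a hypothesis of positional (primary) homology between the two residues it pairs.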
Needleman-Wunsch algorithm
BLAST align
Multiple sequence alignment
• Example: ClustalW
• Progressive alignment
• Nucleotide and Protein sequences
• Local or Global alignment
• Sites that use MSA: EBI, DDBJ, PBIL, EMBNet, GenomeNet
Figure from www.cs.umbc.edu
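Progressive alignment builds a multiple alignment by aligning the most similar sequences first and adding more distant ones later, following a guide tree computed from pairwise distances. The sketch below shows only that ordering step. It uses average-linkage (UPGMA-style) clustering to stay short, whereas ClustalW builds its guide tree by neighbour joining. The sequence names and distances are hypothetical.

```python
from itertools import combinations


def guide_order(names, dist):
    """Return the cluster merges in the order a progressive aligner would use.

    dist maps frozenset({a, b}) to the distance between sequences a and b.
    Each merge is a pair of tuples of sequence names.
    """
    clusters = [(name,) for name in names]
    merges = []

    def avg_dist(c1, c2):
        # Average-linkage distance: mean over all cross-cluster pairs.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges


# Hypothetical pairwise distances (assumptions for illustration only).
names = ["seq1", "seq2", "seq3", "seq4"]
dist = {
    frozenset(("seq1", "seq2")): 0.10,
    frozenset(("seq1", "seq3")): 0.30,
    frozenset(("seq2", "seq3")): 0.32,
    frozenset(("seq1", "seq4")): 0.50,
    frozenset(("seq2", "seq4")): 0.52,
    frozenset(("seq3", "seq4")): 0.48,
}

for step, (left, right) in enumerate(guide_order(names, dist), start=1):
    print(step, left, "+", right)
```

With these distances, seq1 and seq2 are aligned first, seq3 is added to that pair next, and seq4 is added last.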
ClustalW @ EBI
The nuts and bolts
1. Gene/protein of interest
2. Homolog search
3. Sequence alignment
4. Tree building
Figure from http://www.usingneuralnetworks.com/images/Face_But_Not_The_Name_Cartoon.jpg
SRD5A2 @ TreeFAM
Discussion
• Does sequence orthology relate to functional equivalence?
• Can paralogs be functionally related?
• Do unsequenced genomic regions affect the understanding of orthology and paralogy?
Figures from:
http://www.ndpgenderequality.ie/images/cartoons/cartoon_large_intro.gif and
http://www.faithmouse.com/cartoon567.jpg
Pros and cons
Pros
• Evolutionary history
• Find animal models
• Relation between structure and function
• Biological processes
Cons
• Best guess
• Parsimony
• Algorithms
• Speed vs accuracy
• Evolution vs religion
Figure from: http://www.pbrainprojects.com/images/angel_devil.jpg
Assumptions, assumptions, assumptions
• If Xs is true then the tree is true…
Figure from: http://www.gdargaud.net/Humor/Pics/string_theory.png