IntroComputationalGenomics_shortx
Introduction to computational
evolutionary genomics
Yong E. Zhang
Institute of Zoology, CAS
2015/6/26
http://zhanglab.ioz.ac.cn
Outline
1. Concept and topic
2. Cases or new trends
1. Concept and topic
1.1 Speciation and tree
1.2 Ortholog and paralog
1.3 Mutation
1.4 Polymorphism and divergence
1.5 Selection
1.1 Speciation and tree
Darwin, C. (1837)
From tree of life to web of life
Adapted from Eugene V. Koonin (2009) Nucleic Acids Res.
Is it time to redefine evolutionary biology?
HGT drives adaptation
Nancy A. Moran and Tyler Jarvik (2010) Science
Software to build tree and infer divergence time
1.2 Ortholog and paralog
Gene duplication: α hemoglobin and β hemoglobin
Speciation: mouse and rat
Paralogs: mouse α Hb and mouse β Hb; rat α Hb and rat β Hb
Orthologs: mouse α Hb and rat α Hb; mouse β Hb and rat β Hb
By David Pollock
Orthologs may not be functionally more
similar to each other than paralogs are
“It is widely assumed that orthologs share similar
functions, whereas paralogs are expected to diverge
more from each other. But does this assumption hold up
on further examination? We present evidence that
orthologs and paralogs are not so different in either their
evolutionary rates or their mechanisms of divergence.”
Studer, R. et al. (2009) Trends in Genet.
Orthologs may not be functionally more
similar to each other than paralogs are (continued)
Gabaldón, T. & Koonin, E.V. (2013) Nature Rev. Genet.
Detection of orthologs
We can perform an all-against-all BLAST search and pull out one-to-one best
hits. A more convenient way is to download the pre-computed annotation
from Ensembl.
http://www.ensembl.org
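One common way to read "one-to-one best hits" is as reciprocal best hits: gene A's top hit in the other species is gene B, and gene B's top hit back is gene A. The sketch below assumes the BLAST results have already been parsed into (query, subject, bit score) tuples. The gene IDs and scores are made up for illustration, and the function names are ours, not part of BLAST or Ensembl.

```python
from typing import Dict, Iterable, Set, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, bit_score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first hit wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if backward.get(b) == a}


# Toy input with hypothetical gene IDs and scores (not real BLAST output).
mouse_vs_rat = [
    ("mouse_Hba", "rat_Hba", 280.0),
    ("mouse_Hba", "rat_Hbb", 95.0),
    ("mouse_Hbb", "rat_Hbb", 270.0),
    ("mouse_Hbb", "rat_Hba", 90.0),
]
rat_vs_mouse = [
    ("rat_Hba", "mouse_Hba", 281.0),
    ("rat_Hba", "mouse_Hbb", 92.0),
    ("rat_Hbb", "mouse_Hbb", 268.0),
]

pairs = reciprocal_best_hits(mouse_vs_rat, rat_vs_mouse)
print(sorted(pairs))
```

Reciprocal best hits only approximate orthology. Lineage-specific duplications or gene losses can mislead them, which is one reason the pre-computed, tree-based annotation is the more convenient route.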
Detection of orthologs (continued)
http://genome.ucsc.edu
Detection of orthologs (continued)
Meyer et al. (2013) Nucleic Acids Res
1.3 Mutation: single nucleotide
polymorphism (SNP)
From Wikipedia
Types of SNP
Purines: A, G
Pyrimidines: C, T
Transitions: within a class (A ↔ G, C ↔ T)
Transversions: between classes (purine ↔ pyrimidine)
David Pollock (2011)
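The transition/transversion distinction is easy to express in code: a substitution is a transition when both bases are purines or both are pyrimidines, and a transversion otherwise. A minimal sketch, with a function name of our own choosing:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref == alt:
        raise ValueError("ref and alt are identical, so there is no substitution")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(ref, "->", alt, classify_substitution(ref, alt))
```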
SNP in coding regions
Original: UGU/AGA/AAG (Cys Arg Lys)
Silent: UGU/CGA/AAG (Cys Arg Lys)
Missense: UGU/GGA/AAG (Cys Gly Lys)
Nonsense: UGU/UGA/AAG (Cys STOP Lys)
First position: 4% of all changes silent
Second position: no changes silent
Third position: 70% of all changes silent (wobble position)
David Pollock (2011)
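The per-position percentages can be checked by enumerating every single-base change in the standard genetic code. The sketch below assumes the standard code (NCBI translation table 1) and counts changes only from the 61 sense codons, with a change to a stop codon counted as non-silent. Other counting conventions would give slightly different numbers.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def silent_changes_by_position():
    """Return {position: (silent, total)} over all single-base changes in sense codons."""
    counts = {0: [0, 0], 1: [0, 0], 2: [0, 0]}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons as starting points
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                counts[pos][1] += 1
                if CODON_TABLE[mutant] == aa:
                    counts[pos][0] += 1
    return {pos: tuple(pair) for pos, pair in counts.items()}


result = silent_changes_by_position()
for pos, (silent, total) in result.items():
    print(f"position {pos + 1}: {silent}/{total} silent")
```

Under these assumptions the counts are 8 of 183 at the first position, 0 of 183 at the second, and 126 of 183 (about 69%) at the third, close to the rounded figures above.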
Count synonymous and non-synonymous mutations
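A simple way to count synonymous and non-synonymous differences is to walk two aligned coding sequences codon by codon and translate each pair. The sketch below assumes the standard genetic code, gap-free sequences of equal length in frame, and at most one change of interest per codon. A codon with several differences is still counted once. The function names are ours, and the UGA mutant in the usage loop is an illustrative nonsense change.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def normalise(seq):
    """Upper-case the sequence and convert RNA (U) to DNA (T) letters."""
    return seq.upper().replace("U", "T")


def classify_codon_change(ref_codon, alt_codon):
    """Return 'identical', 'silent', 'missense' or 'nonsense' for a codon pair."""
    ref, alt = normalise(ref_codon), normalise(alt_codon)
    if ref == alt:
        return "identical"
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    # Note: a stop codon changing to a sense codon also lands here in this sketch.
    return "missense"


def count_changes(ref_seq, alt_seq):
    """Count changed codons by type between two aligned, in-frame sequences."""
    if len(ref_seq) != len(alt_seq) or len(ref_seq) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")
    counts = {"silent": 0, "missense": 0, "nonsense": 0}
    for start in range(0, len(ref_seq), 3):
        kind = classify_codon_change(ref_seq[start:start + 3], alt_seq[start:start + 3])
        if kind != "identical":
            counts[kind] += 1
    return counts


reference = "UGUAGAAAG"  # Cys Arg Lys, the coding-SNP example codons
for mutant in ("UGUCGAAAG", "UGUGGAAAG", "UGUUGAAAG"):
    print(mutant, count_changes(reference, mutant))
```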
Indels
Original: …TGTACAAAG…
Insertion: …TGTTACAAAG…
Deletion: …TGTAAAAG…
Adapted from David Pollock (2011)
Indels may increase the local substitution rate
Tian, D. et al. (2008) Nature
Structural variation
Sharp, A., Cheng, Z. & Eichler, E.E. (2006) Annu. Rev. Genomics Hum. Genet.
1.4 Polymorphism and divergence
Graur, D. & Li, WH (2002) Fundamentals of molecular evolution
Polymorphism and divergence (continued)
Innan, H & Kondrashov, F (2010) Nature Rev. Genet.
Population genetics and molecular evolution are intrinsically
interconnected. Conventionally, the key task for evolutionary
biologists is to infer the evolutionary history of DNA sequences and the
underlying evolutionary forces.
1.5 Selection: positive or adaptive selection
From Wikipedia
Negative or purifying selection
In natural selection, negative selection or purifying
selection is the selective removal of alleles that are
deleterious.
From Wikipedia
Ka/Ks
The Ka/Ks ratio (also written ω or dN/dS) is the ratio of the
number of non-synonymous substitutions per non-synonymous site (Ka) to the
number of synonymous substitutions per synonymous site (Ks).
Ka/Ks > 1, positive selection
Ka/Ks = 1, neutral evolution
Ka/Ks < 1, negative selection
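As a rough illustration of the ratio, the sketch below turns counts of differences and sites into Ka, Ks and ω. It assumes the counts are already available and applies a Jukes-Cantor correction for multiple hits, as in simple counting methods. This is not how maximum-likelihood programs estimate ω, and the input numbers are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction to apply")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """Return Ka/Ks from counts of differences and sites (hypothetical inputs)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0:
        raise ValueError("Ks is zero, so the ratio is undefined")
    return ka / ks


def interpret(omega: float) -> str:
    """Apply the three-way reading of Ka/Ks."""
    if omega > 1:
        return "positive selection"
    if omega < 1:
        return "negative selection"
    return "neutral evolution"


# Made-up example: 5 differences over 300 non-synonymous sites,
# 10 differences over 100 synonymous sites.
omega = ka_ks(5, 300, 10, 100)
print(f"Ka/Ks = {omega:.3f}: {interpret(omega)}")
```

In practice an estimate is never exactly 1, so deciding between the three cases calls for a statistical test rather than a direct comparison.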
http://abacus.gene.ucl.ac.uk/software/paml.html
PAMLX
Xu & Yang (2013) Mol Biol Evol
Hyphy
Delport, W. et al. (2010) Bioinformatics
Outline
1. Concept and topic
2. Cases or new trends
“We have learned nothing from the genome”
Venter, C.
HGP is dead. Long live HGP!
Our journey is to the ocean of stars.
Rapid decrease in cost enabled by
next-generation sequencing (NGS) techniques
Explosion of NCBI data and services
http://www.nlm.nih.gov/about/2015CJ.html
Golden age
“It is an exciting time to be studying the evolutionary forces that shape genomic
variation in natural populations. After decades as a theory-rich and data-poor
discipline, rapidly advancing genomic technology is turning the intellectual
dynamic in population genetics on end.”
By Langley, C.
“Within years, tens of thousands of complete genome sequences will be available
from humans and from extinct hominids, as well as from thousands of other
species. Given the human mutation rate, we will soon know of variation among
individuals at almost all sites in the genome. For population genetics, this ushers in
a previously unimaginable opportunity to reconstruct the entire genealogical and
mutational history of humans and pushes us against the limits of what we will be
able to infer about the evolutionary and genetic forces that affected every region
of the genome.”
Przeworski, M.(2011) Science
Golden age (continued)
Things previously impossible become
possible
1. Deep population survey
2. Cancer evolution
3. Experimental evolution
4. In search of loci under recent adaptation
5. Paleogenomics
6. Expression evolution
7. Metagenome
8. Large-scale comparative genomics
Micro Evolution
Macro Evolution
1. Deep population survey
Arabidopsis 1001 project
Drosophila African survey
Hunting down my son's killer
http://matt.might.net/articles/my-sons-killer/
Understand yourself
Chen et al. (2012) Cell
Kitzman et al. (2012) Sci. Transl. Med.
2. On the timing of cancer progression
“A quantitative analysis of the timing of the genetic
evolution of pancreatic cancer was performed, indicating
at least a decade between the occurrence of the initiating
mutation and the birth of the parental, non-metastatic
founder cell.”
Yachida, S. et al. (2010) Nature
Tumor heterogeneity
3. Understand virus evolution in human cells
Experimental evolution of fruit flies
4. Tibetan adaptation
Yi, X., et al. (2010) Science
Insect adaptation
Zhen, et al. (2012) Science
Adaptation stories of wild animals
5. Human Paleogenomics
“We show that Neandertals shared
more genetic variants with present-day humans in Eurasia than with
present-day humans in sub-Saharan
Africa, suggesting that gene flow
from Neandertals into the ancestors
of non-Africans occurred before the
divergence of Eurasian groups from
each other.”
6. Primate transcriptome evolution
Brawand, D., et al. (2011) Nature
Expression variation within humans
Pickrell, J., et al. (2011) Nature
7. Metagenomics
Coghlan, M.L. et al. (2012) PLoS Genet.
8. Animal survey
Insects are "Little Creatures Who Run the World"
Wilson, E.O.
“Therefore, we, the undersigned, are pleased to announce the launch of the
“i5k” initiative to sequence the genomes of 5000 species of insects and other
arthropods during the next 5 years (8). This project is aimed at sequencing and
analyzing the genomes of all species known to be important to worldwide
agriculture and food safety, medicine, and energy production; all species used
as models in biology; the most abundant insects in world ecosystems; and, to
achieve a deep understanding of arthropod evolution, representatives of
insect relatives in every major branch of arthropod phylogeny.”
Robinson, G.E. et al. (2011) Science
Animal survey (continued)
Misof, B. et al. (2014) Science
Jarvis, E.D. et al. (2014) Science
Animal survey (continued)
Animal survey (continued)
It is time to play.
Many thanks. Questions are welcome!
Email: [email protected]