Evolutionary questions at DUSEL
Overview
• Background: Recent progress in microbial
evolution, comparative genomics, community
genomics
• Some examples of broad questions that might be
addressed at DUSEL
– General indications of approaches
• Linkages
– Databases, other initiatives
Evolution of approaches to prokaryotic diversity
• Presequencing era: bacterial species characterized on the basis of biochemical
properties (stains/metabolism)
• Early days of RNA sequencing: revolutionized views of bacterial relationships and
gave basis for current view of universal tree of life (Carl Woese, 1980s)
– Recognition of the Archaea as distinct group
– Previously recognized higher taxa completely reorganized
– Small genome bacteria recognized as derived rather than primitive (Mycoplasmas/Mollicutes,
Rickettsia)
– Confirmed symbiotic origin of mitochondria and chloroplasts
• Large-scale small subunit rDNA sequencing using PCR (Ribosomal Database Project,
improved coverage and accuracy of the universal tree)
• Sequence-based approaches to microbial diversity (1990s)
– Sequence-based diversity inventories
– In situ hybridization of rRNA to relate sequence to distribution
• Comparative genomics: finding of massive movement of novel genes into lineages of
organisms over evolutionary time; development of advanced methods for recognizing
functional homology
• Community genomics / “meta-genomics”: merging genomics and inferences about
functional properties with evolutionary and ecological framework
Microbial communities from
diverse environments
• Hyperdiverse
• Most (~99%) cannot be readily cultured
• “great plate count anomaly”
– Molecular sequence surveys reveal much more
diversity among taxa than found from culturing
from the same environmental sources
Full genome sequences
now available for most
major groups
Numbers growing
quickly
Environmental genomics
E.g.: Tyson et al. Nature 2004.
Community structure and
metabolism through reconstruction
of microbial genomes from the
environment.
Large scale sequencing from
pink biofilm in acid mine
drainage: Simple community
Dominated by Leptospirillum (Bacteria, yellow)
and Ferroplasma (Archaea, purple)
Probes specific to
determined DNA sequences
Microbial evolutionary genomics
• 1st bacterial genome sequenced in 1995, hundreds now
complete and public, thousands in the near future.
• Broad questions that are now being addressed:
– How do genomes acquire their contents, as lineages evolve
over long time scales?
– What are the forces that enable complex biological
processes to be maintained?
– How do genes (and their associated functions) originate in
genomes and how are they lost?
– How do biological capabilities move from one organism to
another and how do they persist over space and time?
Mechanisms and consequences of Lateral
Gene Transfer
• Transduction: via a virus (bacteriophage)
• Conjugation: direct contact (plasmid)
• Transformation: integration of free DNA
[Figure: organismal phylogeny compared with gene phylogeny for taxa A, B, C]
Redfield, Nat. Rev. Genet. 2001
Example of genome dynamics over time due to Lateral Gene Transfer
Gene uptake in gamma-Proteobacteria (E. Lerat et al. 2003)
[Figure: phylogeny with tips S. typhimurium, E. coli, Y. pestis KIM, Y. pestis CO92,
W. brevipalpis, B. aphidicola, P. multocida, H. influenzae, V. cholerae,
P. aeruginosa, X. fastidiosa, X. campestris, X. axonopodis]
Most genes in most
genomes arrived via
LGT after the common
ancestor.
Most genes arriving via
LGT come from distant
sources (not in this
group)
Many persist as
vertically transmitted
genes within the
descendant clade.
But many are lost
quickly (many present
only in tips of tree).
Evolutionary and ecological
genomics underground:
Questions first, logistics later
• Imagine obtaining
complete genomes for
whole communities of
organisms in deep
subsurface communities
• And the phage as well
• What questions could be
addressed?
Evolutionary questions
1. Do deep organisms consist of recent twigs on the Tree of Life that is already
well sampled on the surface or do they include ancient branches with clues to
the earliest organisms on Earth?
2. Do subsurface organisms show distinctive genome dynamics, such as lack of
gene uptake due to low densities? Do they comprise a gene reservoir that
exchanges genes with surface organisms?
3. Do subsurface organisms show distinctive mutational profiles, as expected from
their distinct mutagenic environment? Do they show fast evolution of
polypeptide sequences, reflecting small genetic population sizes/spatial
substructuring?
4. What are unusual adaptations, such as mechanisms for responding to unusual
kinds of stress, using different energy sources, etc? Can they provide us with
novel sources of genes that can be harnessed for practical uses?
For each question, answers are likely to depend on the particular niche and may
differ among microbial communities associated with different processes or
sites.
1. Do deep organisms consist of recent twigs on the
Tree of Life that is already well sampled on the
surface or do they include ancient branches with
clues to the earliest organisms on Earth?
• Approaches
– Phylogenetics
• Models of sequence evolution provide a statistical basis for
inferring relationships and ancestral sequences
– Molecular clocks and “coalescent” approaches:
• allow age of divergence to be calculated in units of generation
times and population sizes.
• Based on “Neutral Theory” of molecular evolution
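The clock reasoning above can be sketched numerically. This is a minimal illustration, not a method taken from the talk: it assumes a Jukes-Cantor correction for multiple substitutions at a site and a strict clock with a known neutral substitution rate per site per generation. The function names and the example rate are hypothetical.

```python
import math

def jukes_cantor_distance(p):
    """Convert the observed proportion of differing sites (p) into an
    estimated number of substitutions per site (Jukes-Cantor model)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

def divergence_in_generations(distance, rate_per_site_per_generation):
    """Strict clock: both lineages accumulate changes since the split,
    so distance = 2 * rate * t, and t comes out in generations."""
    return distance / (2.0 * rate_per_site_per_generation)

# Toy numbers (assumed): 10% of aligned sites differ, and the neutral
# substitution rate is 1e-9 per site per generation.
d = jukes_cantor_distance(0.10)
t = divergence_in_generations(d, 1e-9)
print(f"corrected distance = {d:.4f}, divergence = {t:.3e} generations")
```

Converting generations to years would additionally need a generation time, which is poorly known for subsurface organisms.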
2. Do subsurface organisms show distinctive genome
dynamics, such as lack of gene uptake due to low
densities? Do they comprise a gene reservoir that
exchanges genes with surface organisms?
• Approaches: genomic sequencing, comparative
analyses with growing genomic databases,
phylogenetics.
• Comparisons of genomes from organisms isolated
from sites with similar and different geochemical
features.
• Factors enabling smaller genome sizes in
bacteria
– Genetic drift (genes are inactivated through mildly deleterious
mutations and are then lost) (eg symbionts and pathogens with
small effective population sizes)
– Parasitism (genes are not needed because hosts provide gene
products) (eg pathogens such as mycoplasmas)
– Environmental constancy (eg Prochlorococcus, symbionts) no
need for different genes for different conditions, no need for
signalling mechanisms
– Lack of the intense biowarfare found in dense, complex communities: e.g., soil
microbes sometimes have quite large genomes with size
augmented by genes for making bacteriocins, antibiotics (e.g.,
Streptomyces and others).
– Selection for efficiency? (small genomes are more competitive in
replication race) (eg possibly low-light adapted Prochlorococcus
marine photoautotrophs).
What is the role of phage in dynamics of
subsurface genomes?
• Do phage move genes from the deep subsurface to
surface communities?
– Much evidence for phage-mediated gene movement among
diverse lineages and environments in surface communities.
– As a set, phage contain the largest diversity of gene types
– Phage genes can be considered as bacterial genes in transit
• But are phage rare in subsurface communities due to
low host densities?
– These genomes may be uniquely deprived of gene uptake
and recombination more generally
Approaches: can sample phage directly and also search
bacterial chromosomes for phage indicator genes.
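The chromosome search mentioned above could look roughly like the following sketch. It assumes gene annotations are already available as (gene id, position, product description) tuples. The keyword list, the function names, and the toy data are all hypothetical; a real screen would rely on curated phage gene families and not on text matching.

```python
# Hypothetical indicator keywords; not a curated list.
PHAGE_KEYWORDS = ("integrase", "terminase", "capsid", "tail fiber", "portal protein")

def find_phage_indicator_genes(annotations):
    """Return (gene_id, position, matched_keywords) for every annotated gene
    whose product description mentions a phage indicator keyword."""
    hits = []
    for gene_id, position, product in annotations:
        text = product.lower()
        matched = [kw for kw in PHAGE_KEYWORDS if kw in text]
        if matched:
            hits.append((gene_id, position, matched))
    return hits

def cluster_hits(hits, max_gap):
    """Group hits lying within max_gap bases of the previous hit;
    each group is a candidate prophage region."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h[1]):
        if clusters and hit[1] - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters

# Toy annotations (invented for illustration only).
toy_annotations = [
    ("g1", 1_000, "DNA polymerase III subunit alpha"),
    ("g2", 50_000, "Phage integrase"),
    ("g3", 52_000, "terminase large subunit"),
    ("g4", 300_000, "major capsid protein"),
]
for region in cluster_hits(find_phage_indicator_genes(toy_annotations), max_gap=10_000):
    print([gene_id for gene_id, _, _ in region])
```

Few or no clustered hits across many subsurface genomes would be consistent with phage being rare there, though degraded prophage remnants could be missed by any keyword-based screen.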
3. Do subsurface organisms show distinctive
mutational profiles, as expected from their distinct
mutagenic environment? Do they show fast evolution
of polypeptide sequences, reflecting small genetic
population sizes/spatial substructuring?
Gene sequence evolution in the deep
• Deep organisms expected to have unusual mutational patterns and
unusual patterns of fixation of mutations in populations, both affecting
gene evolution
• Expectations for Mutation rates
– Lack of UV irradiation
– Other mutagens present: chemical, heat-dependent
– Long generation time
• some mutational events are dependent on replication and will be proportional
to the number of generations, not absolute time; other mutations result from DNA damage
between replication events, such as during transcription.
– Repair processes
• differ among organisms with some having more complete sets of repair
pathways than others. Organisms with small genomes tend to have fewer
repair pathways and may suffer effectively higher rates of mutation.
• Expectations for fixation of mutations in populations
– at sites under purifying selection (most sites in most genes)
– From Neutral Theory and much empirical data: small genetic population sizes
and/or low rates of genetic recombination impose more rapid rates of protein
evolution, resulting from inability to purge mildly deleterious mutations.
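The population-size effect described above can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This is a sketch under stated assumptions, not an analysis from the talk: a haploid population whose effective size equals its census size n, a single new copy at frequency 1/n, and a constant selection coefficient s. The function name and the example values are hypothetical.

```python
import math

def fixation_probability(s, n):
    """Diffusion approximation for a new mutation in a haploid population:
    u = (1 - exp(-2s)) / (1 - exp(-2ns)), with the neutral limit u = 1/n."""
    if s == 0:
        return 1.0 / n
    exponent = -2.0 * n * s
    if exponent > 700:  # strongly deleterious relative to drift: effectively never fixes
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)

# A mildly deleterious mutation (assumed s = -1e-4) in a small and a large population,
# expressed relative to the neutral expectation 1/n.
for n in (1_000, 1_000_000):
    u = fixation_probability(-1e-4, n)
    print(f"n = {n}: fixation probability relative to neutral = {u * n:.3g}")
```

In the small population the mutation behaves almost neutrally, while in the large population it is efficiently purged, which is the sense in which small populations are expected to show faster protein evolution.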
3. Do subsurface organisms show distinctive
mutational profiles, as expected from their
distinct mutagenic environment? Do they show
fast evolution of polypeptide sequences, reflecting
small genetic population sizes/spatial
substructuring?
• Approaches: genomic sequencing, comparative analyses
with growing genomic databases, phylogenetics.
• Comparisons of genomes from organisms isolated from
sites with similar and different geochemical features.
• Many kinds of analyses are designed to distinguish
changes in mutation rates from changes in substitution
patterns
Do gene sequences of subsurface organisms evolve at faster or slower
rates than those of surface organisms?
[Figure: tree with Deep and Surface branches of unknown relative length;
axis: amount of sequence evolution]
Address by using sequences obtained from modern organisms to
reconstruct phylogenetic relationships,
ancestral sequences, and relative lengths of branches
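The branch-length comparison above can be sketched for the simplest case of three taxa: one deep lineage, its surface relative, and an outgroup. The sketch assumes that pairwise distances (substitutions per site) are additive along the tree; the function name and the distances are hypothetical.

```python
def three_taxon_branch_lengths(d_deep_surface, d_deep_out, d_surface_out):
    """Solve the three-taxon tree for the branch leading to each tip,
    given additive pairwise distances."""
    deep = (d_deep_surface + d_deep_out - d_surface_out) / 2.0
    surface = (d_deep_surface + d_surface_out - d_deep_out) / 2.0
    outgroup = (d_deep_out + d_surface_out - d_deep_surface) / 2.0
    return {"deep": deep, "surface": surface, "outgroup": outgroup}

# Toy distances (invented): the deep lineage is farther from the outgroup
# than its surface relative is.
branches = three_taxon_branch_lengths(0.30, 0.50, 0.40)
print(branches)
print("deep/surface rate ratio:", branches["deep"] / branches["surface"])
```

Because both lineages have had the same absolute time since their common ancestor, unequal branch lengths point to unequal rates; separating mutation-rate effects from fixation effects needs further analysis, as noted in the approaches above.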
4. What are unusual adaptations, such as mechanisms for
responding to unusual kinds of stress, using different
energy sources, etc? Can they provide us with novel
sources of genes that can be harnessed for practical uses?
• Approaches: “meta-genomics” analyses of
organisms -- can be associated with geochemical
processes measured in situ
• Systems biology: how do individual processes
mediated by different organisms lead to net effects
of microbial communities on their surroundings?
• Processes and products useful in bioremediation,
industrial applications, pharmaceuticals
“meta-genomics”
• Obtaining gene sequences directly from
environmental samples of DNA, no cultivation or
characterization of individual species
• Goals
– Discovery of novel biocatalyst genes for applications
– Understanding of microbial community diversity
– Understanding of basic cell processes (“systems
biology” approaches in particular)
Two basic metagenomic approaches
1. Extract DNA from environmental sample
2. Construct library
– conventional small insert (<10 kb) library
– large insert (cosmid or BAC) library (up to 200 kb), allows
sampling of whole operons
3. Screen
– Sequence DNA or RNA, look for genes with functions of interest
• Limitations: limits search to genes with detectable (evolutionary)
homology to functionally characterized genes (sequence or structural homology)
– Perform functional screens: directly test for some biochemical
property in the cloning host
• Limitations: possible problems with efficient transcription of the cloned
fragment, translation, secretion of the product, correct chaperones
for folding of the product
Metabolic inferences
Example from Tyson
et al study (acid mine
community)
E.g., infer N fixation in Leptospirillum Group III;
Ferroplasma must use N fixed by Leptospirillum Group III
Example from marine bacteria
PCR-based approaches have revealed that 99% of marine microbes
cannot be readily cultured in the lab.
Gene sequencing from marine DNA has revealed undiscovered genes
that imply new metabolic strategies:
Proteorhodopsin = rhodopsin-based phototrophy in bacteria
located on a 130 kb BAC
recognized based on similarity to archaeal rhodopsins
(same basic structure as animal eye opsins)
Bacterial Rhodopsin: Evidence for a New Type of Phototrophy in the Sea
Beja et al. Science 2000
[Figure: proteorhodopsin expressed in E. coli (red); structure]
Linkages of Evolutionary questions at DUSEL
with other large scale initiatives
• Well developed tools and databases for genomics data of
microbes (NIH-NCBI, DOE, others).
• Cyber Infrastructure for Tree of Life Initiative (NSF)
– network and research project to develop software and databases for
phylogenetic information
– $11.6 million over 5 years
– Central aim is to establish Universal Tree of all Life using
biological data, esp. DNA sequence data
– Large component concerns whole genome evolution
– PI Bernard Moret (Computer Science, U New Mexico), plus large
team at 13 institutions, hosted by SD supercomputing network.
• Evolutionary Synthesis Center (NSF)
– NSF sponsored center, recently awarded to Research Triangle
institutions, emphasis on interdisciplinary research on evolution,
analysis of large datasets, education and outreach
• Others