Transcript Slide 1

Topic 16. Lecture 25. Origin of Life and Evolution of
Functional Phenotypes
We have no direct data, and only rather speculative ideas, on how life emerged.
The context of the origin of life:
In the beginning, the surface of the Earth was extremely hot, because the Earth is the
product of the collision that also created the Moon. As it cooled, the Earth’s surface passed
through every temperature regime from silicate vapor to liquid water, and perhaps even
to ice, eventually reaching an equilibrium with sunlight.
The Moon was formed ~4,500Mya; the oldest Earth crust is ~4,300My old, and the first
definite fossils of advanced cellular life are ~3,500My old. Thus, life originated on Earth
(almost certainly) between 4,500Mya and 3,500Mya.
There is no sharp boundary between simple inorganic and complex organic compounds.
The possibility of prebiotic synthesis of many organic molecules, including amino acids,
was demonstrated in the Miller-Urey experiment in 1953.
Organic molecules could have spontaneously formed
on early Earth from inorganic precursors.
This is the only uncontroversial part of current
thoughts about the origin of life.
Still, a lot of work, mostly theoretical, has been done
on possible mechanisms of the origin of life.
Aleksandr Oparin proposed the "membranes first" scenario in 1923.
There is even a journal on the origin of life.
Before studying the origin of life, we need to define life.
Replicators that can evolve by natural selection must (i) multiply and (ii) have heredity that
is not perfectly accurate (variability). These qualities are necessary but not sufficient
for "life": viruses and even computer codes are also replicators.
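#include <stdio.h>

/* Compute 4 to the power k by repeated multiplication. */
int main(void) {
    long i, k = 10, k4 = 1;
    for (i = 0; i != k; i++) k4 *= 4;   /* after the loop, k4 = 4^k */
    printf("%ld\n", k4);                /* prints 1048576 */
    return 0;
}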
A living being must also be a self-sufficient physical entity. Minimal living systems must
comprise at least a metabolic subsystem, a hereditary subsystem and a boundary. Such
things can be called reproducers.
A reproducer must have at least a minimum developmental
capacity required for further multiplication. There is not only an
informational link but also material overlap between
generations of reproducers.
Three essential components of cells (reproducers) - boundary (membrane), template
(responsible for inheritance), and metabolism (which keeps a reproducer going) - may have
originated independently, to be combined by primordial "symbioses".
Molecularly, the boundary subsystem probably
always consisted of amphiphilic
phospholipids that can spontaneously form
bilayers and vesicles.
Similarly, the heredity subsystem probably
was represented by nucleic acids (DNA or,
more likely, RNA) from the very beginning.
It is less clear what molecules were
responsible for primitive metabolism.
Ribosomes are so immensely complex that
genetic code, translation, and proteins
probably appeared rather late.
The RNA world scenario
RNA molecules may serve not only as templates for self-replication but also as catalysts of
many other reactions. In modern life, some RNAs, known as ribozymes, act as
enzymes, and some mRNAs catalyze removal of their own introns (autocatalytic splicing).
The origin of life could have begun from "naked" self-replicating and chemically active RNA
molecules ("RNA World"). This hypothesis was proposed by Walter Gilbert in 1986 and
implies that heredity with variability appeared very early. A possible problem with this
scenario is "error catastrophe".
An obvious problem is that primitive self-catalytic replication (without helpful proteins) was
probably very imprecise. This can lead to too high a genetic load.
According to the Haldane-Muller principle ("one mutation - one genetic death"), L = U, where L
is the genetic load and U is the genomic deleterious mutation rate. This is true, however, only
when mutations are eliminated one by one. If an eliminated individual may carry many mutations,
the Haldane-Muller principle should be generalized (only without sex):
L = 1 - C,
where C is the probability of not acquiring even a single new mutation.
Indeed, without sex, all offspring that acquired at least one mutation must be eliminated. If
mutations are individually rare and independent, C = exp(-U), and L = 1 - exp(-U).
If U = 5, L ~99%. Thus, U must be < 1-2. So, a self-replicating RNA cannot be much longer
than 1/m, where m is the per nucleotide mutation rate.
Pre-biotic self-replicating RNAs must be no longer than 100-1000 nucleotides. It is not clear
if this was a serious limitation.
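The arithmetic above can be checked with a short sketch. It assumes, as in the text, asexual reproduction and individually rare, independent mutations, so that U = m × (genome length) and L = 1 - exp(-U). The function names and the two example values of m are illustrative assumptions, not measured error rates.

```python
import math

def genetic_load(U):
    """Generalized Haldane-Muller load without sex: L = 1 - exp(-U)."""
    return 1.0 - math.exp(-U)

def max_length(m, U_max=1.0):
    """Longest genome (in nucleotides) that keeps U = m * length at or below U_max."""
    return round(U_max / m)

for U in (0.1, 1.0, 2.0, 5.0):
    print(f"U = {U:>4}: L = {genetic_load(U):.3f}")

# Hypothetical per-nucleotide error rates for protein-free replication.
for m in (1e-2, 1e-3):
    print(f"m = {m:g}: length limit ~ {max_length(m)} nucleotides")
```

With U_max = 1 the limit is simply 1/m, which is how error rates of roughly 1 in 100 to 1 in 1000 translate into the 100-1000 nucleotide range.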
The lipid world scenario
Alternatively, the origin of life could have started from its boundary. Amphiphilic compounds tend to
form bilayers, micelles and vesicles. They can grow by incorporating molecules from the
environment. This scenario was proposed by Oparin in 1923.
Perhaps, the lipid world can have some heredity and variation, creating a possibility for
natural selection. Micelles containing more efficient gangs of lipids will take over.
We do not know if RNA-world and/or lipid-world stages existed during the origin of life.
Current diversity of life can shed light on the features of LUCA.
LUCA, which lived before 3,500Mya, was already a fairly advanced cell:
- it had modern genetic code, translation on ribosomes, and, thus, proteins,
- it was almost certainly anaerobic,
- it apparently was mesophilic,
- it probably had DNA, although this is uncertain.
We have only rather vague ideas on how life originated. Worse, it is not clear if this will ever
be known. It might be feasible to experimentally reconstruct all stages in the origin of life,
but this is not going to happen soon.
Evolution of complex phenotypes: molecules, cells, organisms
Back to post-LUCA evolution! Its most mysterious and important facet is the origin of
complex, functional phenotypes.
"To suppose that the eye ... could have been formed by natural selection, seems ... absurd in
the highest possible degree" (Darwin, "The Origin of Species", 1859, Chapter 6).
Data on past evolution of life demonstrate that gradual origin of complex phenotypes is
possible, and reveal some interesting general patterns in the process - but this is not a
substitute for a deep understanding.
Generalizations concerned with adaptation and complexity:
1. Genetical aspects of adaptive evolution
a) Evolution of both coding and non-coding sequences is important for adaptation
b) The target for strong positive selection is narrow at each moment
c) Tightly related genes can perform rather different functions
2. Phenotypic aspects of adaptive evolution
a) Adaptations can be very general and very specific
b) Evolution is irreversible
c) Perhaps, all adaptations are imperfect
3. Origin of novelties
a) New non-coding regulatory sites, but not new genes, often appear from scratch
b) Origin of phenotypic novelties is usually opportunistic and can happen fast
4. Dynamics of complexity
a) Complex phenotypes evolve through adaptive intermediate stages
b) Complexity is rapidly lost, if selection stops maintaining it
c) The overall trend is for complexity to increase
The key issue: why are complex phenotypes designable, so that greedy evolution can arrive
at them from simple phenotypes?
[Figure panels: "Evolution impossible" and "Evolution possible".]
Of course, the reality is immensely more complex. Remember, we are dealing with fitness
landscapes in spaces of:
- molecular structures (mostly determined by their sequences),
- cellular structures (mostly determined by networks of interaction),
- organismal structures.
Even the space of sequences is hard to imagine. There are 20^1000 possible amino acid
sequences of length 1000. Still, it is easy to think of all possible sequences. In contrast, it is
not clear how to list "all possible networks" or "all possible organisms". Such spaces are
essentially infinite-dimensional.
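To get a feeling for the size of sequence space, the count of sequences can be computed exactly, since Python integers have arbitrary precision. This is only an illustration; the function name is an assumption made for the example.

```python
import math

def sequence_space_size(alphabet, length):
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet ** length

proteins = sequence_space_size(20, 1000)   # amino acid sequences of length 1000
rnas = sequence_space_size(4, 10)          # RNA sequences of length 10, for comparison

print(rnas)                          # 1048576
print(len(str(proteins)))            # number of decimal digits in 20^1000
print(1000 * math.log10(20))         # log10 of the same number
```

The protein count has more than 1,300 decimal digits, so enumerating the space is out of the question even though it is trivial to describe.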
If only we knew the fitness of each possible
molecule, cell, and organism, we would understand
their macroevolution. Of course, this is hopeless.
Real fitness landscapes must be very rugged - that is why evolution is prone to producing
suboptimal phenotypes. Perhaps, occasional shakings of the fitness landscape, due to a
changed environment, are essential for evolution.
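Why ruggedness leads to suboptimal outcomes can be seen in a toy model. The sketch below assumes the most rugged landscape imaginable: each binary genotype of n sites gets an independent random fitness. A "greedy" walk always moves to the fittest one-mutant neighbor and stops when no neighbor is fitter. The genotype encoding, the value of n, and the random seed are all assumptions made for illustration.

```python
import random

def make_landscape(n, seed=0):
    """Assign an independent random fitness to each of the 2**n binary genotypes."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(2 ** n)]

def neighbors(g, n):
    """Genotypes that differ from g at exactly one of the n sites."""
    return [g ^ (1 << i) for i in range(n)]

def greedy_walk(fitness, start, n):
    """Move to the fittest one-mutant neighbor until no neighbor is fitter."""
    g = start
    while True:
        best = max(neighbors(g, n), key=lambda h: fitness[h])
        if fitness[best] <= fitness[g]:
            return g          # g is a local peak
        g = best

n = 10
fitness = make_landscape(n)
global_peak = max(range(2 ** n), key=lambda g: fitness[g])
endpoints = [greedy_walk(fitness, s, n) for s in range(2 ** n)]

print("distinct local peaks reached:", len(set(endpoints)))
print("fraction of walks ending at the global peak:",
      endpoints.count(global_peak) / len(endpoints))
```

On such a landscape most walks stop at a local peak that is not the global one. Redrawing the fitness values (a "shaking" of the landscape) releases the population from its current peak.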
Are natural adaptations designable? There are useful photoreceptors less complex than the
human eye.
The protist Euglena possesses a
distinct eyespot, or stigma, where
photosensitive pigments are located.
Planaria (genus Dugesia) has two ocelli, or
eyespots, composed of cells full of
photosensitive pigments.
Data on living and past organisms tell us about stepping-stones that led to the human brain.
However, in many other cases, known living beings, both present and past, do not provide
any data on possible intermediate steps in the evolution of a complex adaptation.
For example, we know nothing about simple, primitive, imperfect ribosomes. Perhaps,
primitive ribosomes were made exclusively of RNA - but even this is only a hypothesis.
Thus, experimental studies of evolutionary phase spaces and fitness landscapes in them
are necessary. The longest path within the space of phenotypes that has been traced
experimentally so far consisted of FIVE steps.
Artificial transformation of a NAD-binding enzyme into a NADP-binding one. A: Five amino
acids of isopropylmalate dehydrogenase that cause it to preferentially bind NAD (ignore
anomalous Arg341). B: Five amino acid replacements that convert the NAD-binding enzyme
into a NADP-binding enzyme. Together, these replacements change the enzyme from a
100-fold preference for NAD to a 200-fold preference for NADP.
In the absence of data from observations and experiments, can we use theory? Only to the
extent to which we can predict functioning and fitness of phenotypes without observing or
creating them. So far, this is mostly impossible even for molecules.
Fisher's Fundamental Theorem only tells us that the population will climb upward on the fitness
landscape, and within-population variation is too tiny to reveal much of this landscape.
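The statement that a population climbs upward can be made concrete in the simplest setting. The sketch below assumes an asexual population with discrete generations, in which selection changes the frequency of genotype i from p_i to p_i w_i / mean(w). In this setting the change in mean fitness over one generation equals the variance in fitness divided by the mean fitness, so it is never negative. The frequencies and fitnesses are hypothetical numbers chosen for illustration.

```python
def mean_fitness(freqs, fitness):
    """Population mean fitness."""
    return sum(p * w for p, w in zip(freqs, fitness))

def fitness_variance(freqs, fitness):
    """Within-population variance in fitness."""
    m = mean_fitness(freqs, fitness)
    return sum(p * (w - m) ** 2 for p, w in zip(freqs, fitness))

def next_generation(freqs, fitness):
    """One round of selection without mutation: p_i' = p_i * w_i / mean(w)."""
    m = mean_fitness(freqs, fitness)
    return [p * w / m for p, w in zip(freqs, fitness)]

# Hypothetical genotype frequencies and fitnesses.
freqs = [0.5, 0.3, 0.2]
fitness = [1.0, 1.1, 0.9]

before = mean_fitness(freqs, fitness)
after = mean_fitness(next_generation(freqs, fitness), fitness)
predicted = fitness_variance(freqs, fitness) / before

print(f"observed change in mean fitness: {after - before:.6f}")
print(f"variance in fitness / mean:      {predicted:.6f}")
```

The calculation uses only the fitnesses of genotypes already present in the population, which is why it reveals so little about the rest of the landscape.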
Still, in some simple cases fitness landscapes can be studied "in silico".
1) Fitness landscapes in the space of RNA sequences.
In the case of RNA, sequence → phenotype maps are relatively simple, and in silico evolution
of short "functional" sequences can bring interesting results.
The key results (Science 280, 1451-1455, 1998):
1) each phenotype, understood as a particular secondary structure, can be obtained from
very many very different sequences.
2) phenotypes are "entangled" in the space of sequences: one substitution can convert a
sequence that folds into phenotype A into sequences that fold into many other phenotypes.
501,572 one-substitution modifications of
2,199 different sequences that all adopt the
same clover-leaf secondary structure of a
tRNA adopt 141,907 different structures.
Only 23% of these modifications retained the
original structure.
The 12 most common structures of the
modifications are mostly similar to the
original tRNA structure:
As a result, almost every phenotype can be reached from any other phenotype through only
a small number of phenotypic steps. This pattern, of course, is ideal for evolution.
It is not clear if it is universal, however. For example, a given 3D protein structure
cannot always be obtained from a wide variety of amino acid sequences.
2) Fitness landscapes in the space of feed-forward circuits.
The structure of the "genotype" → phenotype map of simple feed-forward circuits is very close to
the one found in RNA: a circuit can be completely rewired keeping its input-output function
intact, and a small neighborhood of each circuit contains almost all phenotypes.
A feed-forward network (FFN) consists of a set of inputs (I
units), hidden units, and outputs (O units). Units can connect
strictly to the layers above, thus avoiding cycles, except for
the outputs, which cannot connect directly to the inputs.
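That a circuit can be rewired while keeping its input-output function can be illustrated with a toy example. The sketch below assumes Boolean units that each compute NAND of two earlier nodes. This is a simplification chosen for the example, not necessarily the unit type used in the study described here. The "genotype" is the wiring list and the "phenotype" is the truth table; both wirings are hypothetical.

```python
from itertools import product

def evaluate(circuit, inputs):
    """Evaluate a feed-forward NAND circuit; the last gate is the output."""
    values = list(inputs)
    for a, b in circuit:
        # Each gate reads only from earlier nodes, so no cycles can arise.
        values.append(1 - (values[a] & values[b]))
    return values[-1]

def phenotype(circuit, n_inputs):
    """Input-output function of the circuit, written out as a truth table."""
    return tuple(evaluate(circuit, bits)
                 for bits in product((0, 1), repeat=n_inputs))

# Nodes 0 and 1 are the inputs; gates are numbered from 2 in list order.
wiring_a = [(0, 1), (0, 2), (1, 2), (3, 4)]          # four gates
wiring_b = [(0, 0), (1, 1), (0, 3), (2, 1), (4, 5)]  # five gates, wired differently

print(phenotype(wiring_a, 2))
print(phenotype(wiring_b, 2))
```

Both wirings produce the same truth table (exclusive OR), although they differ in the number of gates and in every connection. In both, the output gate reads only from hidden units, never directly from the inputs.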
The RNA-like structure of the "genotype" → phenotype map of FFNs leads to efficient
evolution. The population finds its evolutionary target in 1749 generations, and the
dynamics shows allele replacements in which population diversity falls abruptly.
However, such an RNA-like structure of the "genotype" → phenotype map is not a universal
property of networks. In particular, two kinds of networks, homogeneous random networks
and scale-free networks, exhibit drastically different evolutionary patterns. Whereas
homogeneous random networks accumulate neutral mutations and evolve by sparse
punctuated steps, scale-free networks evolve rapidly and continuously.
Populations of random (a) and scale-free (b) networks for various average connectivities K
and degree exponents γ. Average fitness of 50 independent evolutionary paths.
Macroevolution of Cells and Organisms
There is currently no theory for macroevolution of cells or multicellular organisms. Still,
there are some intriguing data on in silico evolution.
A simple model of a cell, consisting of an organic nutrient A and an energy carrier X, and
proteins of five types – two transcription factors, two enzymes, and a membrane transporter.
The metabolism of the cell consists of importing A from the environment and utilizing it in
order to produce X and an unspecified end product. The genome may carry an arbitrary
number of genes, each one encoding a protein of one of the five types. Fitness was
determined by the ability of organisms to maintain a constant concentration of X.
Even this very simple model displays complex behavior. In particular, there is a high degree
of unpredictability of the outcome of evolution - different experiments performed under the
same conditions produced locally optimal organisms of very different levels of complexity.
Regulatory interactions within a simple evolved organism.
Regulatory interactions within a simple evolved organism, obtained under the
same conditions as the previous one.
Conclusions:
Finally, it became possible to start addressing the key issue in the evolution of life - "how
can complex phenotypes evolve by natural selection?".
Answering this question will be the main task of evolutionary biology, and probably of all
natural sciences, in the XXI century. However, it is not even clear whether there exists a
general theory describing evolution of all kinds of complex functional phenotypes.
Still, if you are interested in studying evolution, consider this subject seriously - you will not
get a Nobel Prize for yet another paper on phylogenetics, or paleontology, or microevolution.
Quiz:
Suppose that you have an ability to predict 3D structure and even function(s) of a protein
from its amino acid sequence. What evolutionary questions will you address, and how?