Modular proteins I

Level 3 Molecular Evolution and Bioinformatics
Jim Provan
Patthy Sections 8.1.1 – 8.1.3
Protein domains
Folded structures of proteins that are larger than 200-300
residues generally consist of multiple structural domains:
Compact, stable units with a unique three-dimensional structure
Interactions within a domain are more significant than those
between domains
Fold independently i.e. structural domains are also folding domains
If a domain performs a distinct function that remains intact in the
isolated domain, then it is also a functional domain
Many multidomain proteins are homomultimeric i.e. contain
multiple copies of a single type of structural domain:
Arisen through internal duplication of complete domains
Fate of domains determined by similar rules to paralogous genes
Protein domains
Many multidomain proteins are heteromultimeric:
Example is plasminogen activator where a trypsin-like serine
protease is joined to kringle, finger and EGF domains
May occur by fusion of two or more genes (chimeric proteins)
Also known as modular proteins, with domains known as modules
Certain modules occur in a wide variety of hetero- and
homomultimeric proteins:
Suggests mechanisms to facilitate duplication and dispersal
“Building blocks” of different types of multidomain proteins are
known as mobile protein modules
Frequency of transfer and incorporation into new protein reflects
fixation probability
Modular assembly by intronic recombination
Discovery of introns provided potential new mechanisms
for protein evolution:
Gilbert suggested that recombination within introns could assort
exons independently
Idea of rapid construction of novel genes from parts of old ones
led to the formulation of the exon-shuffling hypothesis
According to “introns early” theories, all extant genes
were constructed from a limited number of exon types
Under the “introns late” theory, intronic recombination
and exon shuffling could not have played a major role in
the assembly of the earliest genes
Original theory was that exons corresponded directly to
modules and/or structural motifs
Problems with the “introns early” hypothesis
In the case of many genes, no obvious correspondence was
observed between protein structure and intron location
It is now known that introns can also be inserted into genes
i.e. the present exon-intron structure of a gene may not be its
original structure
Introns suitable for exon shuffling did not originate until a
relatively late stage of eukaryotic evolution
Exon shuffling has only been conclusively demonstrated in
“young” proteins unique to higher eukaryotes
Only a special group of exons, the “symmetrical” modules,
are really valuable for exon shuffling. Intron phase
distribution is also a crucial factor.
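The role of intron phase can be made concrete with a short sketch. It uses the standard definitions, which the slide does not spell out: a phase 0 intron lies between two codons, a phase 1 intron after the first nucleotide of a codon, and a phase 2 intron after the second; an exon is "symmetrical" when both flanking introns have the same phase. The function names and exon lengths are hypothetical, and the sketch assumes the first exon begins at the start codon with no untranslated region.

```python
def intron_phases(exon_lengths):
    """Return the phase of each intron from coding-exon lengths (in order).

    Phase = number of coding nucleotides upstream of the intron, modulo 3.
    Assumes exon 1 starts at the first nucleotide of the start codon.
    """
    phases = []
    coding_so_far = 0
    for length in exon_lengths[:-1]:  # there is no intron after the last exon
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


def exon_classes(exon_lengths):
    """Classify each internal exon by the phases of its flanking introns.

    Returns (class_label, is_symmetrical) pairs, e.g. ("1-1", True).
    """
    phases = intron_phases(exon_lengths)
    classes = []
    for i in range(1, len(exon_lengths) - 1):
        upstream, downstream = phases[i - 1], phases[i]
        classes.append((f"{upstream}-{downstream}", upstream == downstream))
    return classes


# Hypothetical four-exon gene (lengths in nucleotides)
lengths = [100, 120, 50, 91]
print(intron_phases(lengths))  # [1, 1, 0]
print(exon_classes(lengths))   # [('1-1', True), ('1-0', False)]
```

In this example the second exon is a symmetrical class 1-1 exon; its length (120) is a multiple of three, so inserting, duplicating or deleting it leaves the downstream reading frame unchanged.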
Self-splicing introns
Group I introns:
Reaction requires only a guanine nucleotide cofactor:
Provides a free 3’-OH group that attacks the 5’ splice site
3’-OH generated at the end of the upstream exon
Second transesterification joins the two exons
Crucially depends on folded structure of the intron itself
Group II introns:
Do not require an external cofactor: the 2’-OH of an adenosine
within the intron attacks the 5’ splice site
2’-5’ phosphodiester bond at the branch site forms the lariat structure
Although folding is still crucial, chemistry, sequence of events and
lariat formation are similar to nuclear spliceosomal introns
Spliceosomal intron splicing mechanism
Spliceosomal introns
Spliceosomal introns are only spliced in the presence of a
complex of specific proteins and RNA known as a
spliceosome
Majority of intron is unimportant: as long as the 5’ and 3’
splice sites and the branch site are conserved, splicing
can take place:
Large insertions into, or deletions from, spliceosomal introns do
not affect splicing efficiency
Chimeric introns, containing the 5’ end of one intron and the 3’
end of another, are also properly spliced
Mutations (directed or otherwise) in the splice sites or branch
site lead to aberrant splicing
Spliceosomal intron plays a minor role in its own splicing:
the actual spliceosome complex is more important
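A toy check illustrates why the bulk of a spliceosomal intron is dispensable. This is a deliberately crude sketch: it tests only the canonical GT...AG boundary dinucleotides (written as DNA), does not model the branch site at all, and uses made-up sequences and a hypothetical function name.

```python
def has_core_splice_signals(intron):
    """Return True if the intron starts with GT and ends with AG.

    Simplification: only the boundary dinucleotides are checked; the
    branch site and the wider splice-site consensus are ignored.
    """
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Made-up intron sequences for illustration only
intron_a = "GTAAGTCCTTCTAACTTTTCCAG"
intron_b = "GTGAGTAAATACTGACCCTTGCAG"

# Large insertion in the middle of the intron: signals untouched
with_insertion = intron_a[:10] + "ACGT" * 25 + intron_a[10:]

# Chimeric intron: 5' end of one intron joined to the 3' end of another
chimera = intron_a[:10] + intron_b[-10:]

# Point mutation in the 5' splice site
mutant = "GA" + intron_a[2:]

print(has_core_splice_signals(with_insertion))  # True
print(has_core_splice_signals(chimera))         # True
print(has_core_splice_signals(mutant))          # False
```

Only the change that touches a conserved signal alters the result, mirroring the observations listed above.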
Evolution of spliceosomal introns
Both group I and group II self-splicing mechanisms
resemble spliceosome-catalysed splicing:
Initial step is attack by ribose hydroxyl group on 5’ splice site
In each case, reactions are transesterifications where phosphate
moieties are retained in products
In group II and spliceosomal introns, intron is released as a lariat
Accepted that spliceosome-catalysed splicing evolved from
group II self-splicing introns:
Key step was transfer of catalytic role from intron to other
molecules
Formation of the spliceosome gave spliceosomal introns structural
freedom as they no longer had to fulfil the catalytic function
Spliceosomal introns are generally found only in the nuclear
genomes of higher eukaryotes (plants, animals and fungi)
Insertion and spread of spliceosomal introns
Intron loss
Plays a significant role in changing exon-intron structure
of genes
Introns may be eliminated through mechanism that gives
rise to processed genes (retroposition)
Reverse transcription can also lead to loss of only some
introns:
Reverse transcription of perfectly spliced mRNA and
recombination with the functional gene: mutates original gene
Partially processed pre-mRNA could give rise to a semi-processed
gene: generates a new paralogue
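The two outcomes of reverse transcription can be sketched with a gene modelled as an ordered list of segments. The segment names and the helper function are hypothetical; the sketch only shows which introns survive in the DNA copy, not the recombination or integration step.

```python
def reverse_transcribe(gene, spliced_out):
    """Return a DNA copy of `gene` lacking the introns named in `spliced_out`.

    `gene` is an ordered list of segment names; exons are always kept.
    """
    return [
        segment
        for segment in gene
        if not (segment.startswith("intron") and segment in spliced_out)
    ]


# Hypothetical three-exon gene
gene = ["exon1", "intronA", "exon2", "intronB", "exon3"]
all_introns = {s for s in gene if s.startswith("intron")}

# Perfectly spliced mRNA: every intron is absent from the copy
fully_processed = reverse_transcribe(gene, all_introns)

# Partially processed pre-mRNA: only intronA had been removed
semi_processed = reverse_transcribe(gene, {"intronA"})

print(fully_processed)  # ['exon1', 'exon2', 'exon3']
print(semi_processed)   # ['exon1', 'exon2', 'intronB', 'exon3']
```

If the fully processed copy recombines with the functional gene, the original locus loses introns; if the semi-processed copy inserts elsewhere, it forms a new paralogue that retains intronB.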
Gene duplication / deletion due to intronic recombination
Exon shuffling via recombination in introns
Believed that insertion of exons may occur by the
same mechanism as insertion of introns:
Exon shuffling may be a consequence of the occasional
inclusion of exon sequences in the insertion cycle of introns
Alternative splicing (exon skipping during splicing) may yield
exons with flanking introns
If such a composite is inserted into the genome by the same
mechanism that inserts single introns (reverse splicing) we
have exon shuffling
Key difference between intronic recombination model
and retrotransposition model:
In the first case, insertion occurs into a pre-existing intron of the
same phase as the introns flanking the inserted exon
Retrotransposition model does not have this requirement
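The phase requirement of the intronic recombination model reduces to a simple rule, sketched below. The function name is hypothetical, and phases are encoded as 0, 1 or 2 following the usual convention (number of codon nucleotides lying upstream of the intron).

```python
def can_insert_in_frame(recipient_intron_phase, module_5p_phase, module_3p_phase):
    """Intronic recombination model: phase compatibility of an insertion.

    The module is read in its own frame, and the recipient gene stays in
    frame, only if the module is symmetrical (both flanking introns share
    a phase) and that phase equals the phase of the recipient intron.
    """
    return module_5p_phase == module_3p_phase == recipient_intron_phase


print(can_insert_in_frame(1, 1, 1))  # True: class 1-1 module into a phase 1 intron
print(can_insert_in_frame(0, 1, 1))  # False: symmetrical module, wrong intron phase
print(can_insert_in_frame(1, 1, 2))  # False: asymmetrical (class 1-2) module
```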
Evolution of urokinase
[Figure: stepwise module assembly, N- to C-terminus]
S-P; insertion of G gives S-G-P
S-G-P; insertion of K gives S-G-K-P
Evolution of tissue plasminogen activator
[Figure: stepwise module assembly, N- to C-terminus]
S-G-K-P; insertion of F gives S-F-G-K-P
S-F-G-K-P; K module duplication gives S-F-G-K-K-P
Evolution of Factor XII
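[Figure: stepwise module assembly, N- to C-terminus]
S-G-K-P; duplication of G module gives S-G-G-K-P
S-G-G-K-P; insertion of F gives S-G-F-G-K-P
S-G-F-G-K-P; insertion of FN2 gives S-FN2-G-F-G-K-P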
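The three pathways can be replayed as list operations on module architectures. This is a sketch only: the module letters are taken from the diagrams, while the order of steps, the insertion positions and the function names are read from the flattened figures and should be treated as assumptions.

```python
def insert_module(architecture, module, position):
    """Return a new architecture with `module` inserted at `position`."""
    new = list(architecture)
    new.insert(position, module)
    return new


def duplicate_module(architecture, position):
    """Return a new architecture with the module at `position` duplicated in tandem."""
    new = list(architecture)
    new.insert(position, architecture[position])
    return new


# Urokinase: S-P -> S-G-P -> S-G-K-P
ancestor = ["S", "P"]
urokinase = insert_module(insert_module(ancestor, "G", 1), "K", 2)

# Tissue plasminogen activator: insert F, then duplicate K
tpa = duplicate_module(insert_module(urokinase, "F", 1), 3)

# Factor XII: duplicate G, insert F between the G copies, then insert FN2
factor_xii = insert_module(
    insert_module(duplicate_module(urokinase, 1), "F", 2), "FN2", 1
)

print("-".join(urokinase))   # S-G-K-P
print("-".join(tpa))         # S-F-G-K-K-P
print("-".join(factor_xii))  # S-FN2-G-F-G-K-P
```

All three architectures derive from the same S-G-K-P arrangement using only two operations, module insertion and tandem duplication.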
