Aliens? - Johns Hopkins Bloomberg School of Public Health
Aliens? Oddities? Or misunderstood?
Transposons and miRNAs
Genome sizes (haploid)
Wheat: 16 Gb (? ploid), ~7 chromosomes
Human: 3.3 Gb, 23 chromosomes
Mouse: 2.5 Gb, 19 chromosomes
Dog: 2.4 Gb, 38 chromosomes
Chicken: 1.2 Gb, 38 chromosomes plus microchr.
Drosophila: ~180 Mb, 4-5 chromosomes
C. elegans: 100 Mb, 5 chromosomes
E. coli: 5.2 Mb, 1 chromosome
Carsonella ruddii: 160 kb, 1 chromosome, 182 ORFs
Number of genes in different organisms
[Bar chart: gene count, axis from 0 to 25,000, for Human, Rice, Mouse, Arabidopsis, Chicken, C. elegans, Dog, Drosophila, E. coli]
What is a transposon?
• Contiguous piece of DNA of varying length (300 bp to
6.5 kb or so)
• Repeated with minor variations throughout the host
genome
• Can replicate itself by cut and paste or copy and
paste mechanisms (can move around!)
• No known function — most synthetic genome
projects aim to remove them
• Structural and functional analogies to viruses
– Much of the terminology reflects this
Barbara McClintock, 1940s
Discovered transposons and characterized
their effects on their hosts
She was ostracized for her ideas but won the
Nobel Prize in 1983.
Types of transposons
• Cut and paste
– DNA transposons
• Copy and paste
– Autonomous retrotransposons
• ERVs *possibly active in human genome
• L1 & relatives *active in human genome
– Nonautonomous retrotransposons
• SINEs (Alu) *active in human genome
• SVA *active in human genome
– Composite element (SINE, VNTR, Alu)
• Processed pseudogenes
Transposons comprise ~45%
of the human genome
• DNA transposons 3%
• Autonomous retrotransposons
– ERVs (LTR retrotransposons)
– L1 18% (500,000 copies)
– L2 3%
– L3 & relatives 1%
• Nonautonomous retrotransposons
– SINEs (Alu) 15% (1 million+ copies)
– SVA (3000 copies)
– Processed pseudogenes (>8000)
(Simple repeats occupy almost another 10%)
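As a rough arithmetic check on the figures above, a genome fraction and a copy number together imply an average length per copy. The sketch below uses only numbers quoted in these slides (3.3 Gb haploid human genome, L1 at 18% with 500,000 copies, Alu at 15% with 1 million copies); the function name is made up for illustration, and since the slide says "1 million+" the Alu value is an upper bound.

```python
# Figures quoted in the slides; nothing here is measured independently.
GENOME_BP = 3.3e9  # haploid human genome size from the genome-size slide

# (fraction of genome, approximate copy number) as quoted on this slide.
elements = {
    "L1": (0.18, 500_000),
    "Alu": (0.15, 1_000_000),
}


def mean_copy_length(fraction, copies, genome_bp=GENOME_BP):
    """Average bp per copy implied by a genome fraction and a copy count."""
    return fraction * genome_bp / copies


implied = {name: mean_copy_length(f, n) for name, (f, n) in elements.items()}
for name, bp in implied.items():
    print(f"{name}: ~{bp:.0f} bp per copy")
```

With these inputs the implied L1 average comes out far below the 6.5 kb upper length quoted earlier, which suggests that many of the counted copies are shorter than a full-length element.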
“Junk DNA”?
• What do transposons do?
– Make more of themselves
– Move genes around
– Serve as reservoirs of new sequence
– Cause genetic instability (repeats stimulate translocation; L1 causes chromosome breakage)
• Can contribute to genes and gene expression
– 5% of alternatively spliced internal human exons come from
Alus
– 80% of genes have some L1 sequence in noncoding portion
– 1-4% of coding sequence is L1-derived
– Act as methylation centers
Importance in genomics
• Transposons are a source of human
variability
– Roughly 5% of people have a transposon not
found in either parent (not due to nonpaternity!)
– Overall polymorphism variable but remarkable
(40-50% of youngest elements are polymorphic)
• Transposons can be useful in medicine
– Occasionally cause disease (de novo insertion in
factor VIII clotting gene led to L1 discovery in
1980s)
– May often be linked to disease loci
Importance in genomics
• Transposons in introns may disrupt gene
expression
– Mechanism depends on whether they are on the
sense or antisense strand
– (+) strand orientation — transcription stalling
– (-) strand orientation — premature
polyadenylation, gene splitting
Importance in genomics
• Can have huge effects, through chromosomal
translocation, inversion, breakage
Transposon domestication
• Overly active transposons will kill a cell (and
then the organism)
• Transposons have tempered their activity
– active almost exclusively in germ line
– also in cancer cells and neuronal cell precursors
Transposon domestication
• Host cells use many mechanisms to control
transposons
– Methylation (original role?)
– miRNA defense
– Sequestered in stress granules
– Nucleic acid editing
• APOBEC family of proteins edits cytosines to
uracils
• ADARs edit dsRNA adenosine to inosine
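The two editing reactions above can be pictured as simple base substitutions on a sequence string. This is only a toy model: the function names and the example sequences are made up, positions are chosen by hand, and no enzyme sequence preference is modeled. The last helper relies on the fact that inosine is read as guanosine when an edited sequence is decoded or copied.

```python
def apobec_edit(seq, positions):
    """Toy APOBEC-style edit: cytosine to uracil at 0-based positions."""
    out = list(seq)
    for i in positions:
        if out[i] != "C":
            raise ValueError(f"position {i} is {out[i]}, not C")
        out[i] = "U"
    return "".join(out)


def adar_edit(rna, positions):
    """Toy ADAR-style edit: adenosine to inosine (written I) at 0-based positions."""
    out = list(rna)
    for i in positions:
        if out[i] != "A":
            raise ValueError(f"position {i} is {out[i]}, not A")
        out[i] = "I"
    return "".join(out)


def read_inosine_as_g(rna):
    """Inosine pairs like guanosine, so an edited A is effectively read as G."""
    return rna.replace("I", "G")


print(apobec_edit("ACCT", [1]))                         # C at index 1 becomes U
print(read_inosine_as_g(adar_edit("GAUA", [1, 3])))     # both A's end up read as G
```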
What to do with transposons?
• Study them
• Work around them (be aware)
– RepeatMasker (Smit & Jurka)
– Problem: each element is at least in part unique,
and RepeatMasker will mask that too
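One way to stay aware of masking is to measure how much of a sequence it removes. The sketch below assumes a soft-masked sequence in which repeat bases are lowercase and everything else is uppercase (a common convention, assumed here rather than taken from the slides); the helper names are hypothetical and the code does not call RepeatMasker itself.

```python
def masked_fraction(seq):
    """Fraction of bases that are lowercase, i.e. flagged as repeat."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


def hard_mask(seq):
    """Replace every soft-masked (lowercase) base with N."""
    return "".join("N" if base.islower() else base for base in seq)


example = "ACGTacgtAC"  # made-up sequence, 4 of 10 bases soft-masked
print(masked_fraction(example))
print(hard_mask(example))
```

Keeping the soft-masked form around, rather than only the hard-masked one, leaves the unique part of each element recoverable if it turns out to matter.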
Another old element, new to science:
microRNAs
RNA world hypothesis:
First “organism” was a strand of RNA that
could somehow replicate itself.
Eventually RNA used DNA as a more stable
storage for genetic material.
1982: Tom Cech reported self-splicing RNAs
microRNA
• 21-25 nucleotide small RNAs
• Discovered in a C. elegans screen
• Alter gene expression at the posttranscriptional level (precise mechanism
unknown)
• Tend to be high-level regulators (>100 targets
each)
• Percentage of human genes under miRNA
control is unknown but possibly 20-30%
• Often specific to a developmental stage or cell state
miRNA
Two mechanisms:
Perfect match to target
leads to mRNA cleavage
or
Imperfect match leads to
translational repression
Neither is well understood, but both likely
involve the dsRNA
recognition system
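The perfect versus imperfect distinction can be sketched as a comparison between a target site and the reverse complement of the miRNA. This is a deliberately crude model: it assumes a site of the same length as the miRNA, counts only plain mismatches (no G:U wobble, no bulges), and uses a made-up 21-nt sequence; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def predicted_outcome(mirna, site):
    """Classify a target site (5' to 3', same length as the miRNA)."""
    if len(site) != len(mirna):
        raise ValueError("site and miRNA must be the same length")
    expected = reverse_complement(mirna)
    mismatches = sum(a != b for a, b in zip(expected, site))
    return "mRNA cleavage" if mismatches == 0 else "translational repression"


mirna = "AUGGCUAACGUUCGAUCCGUA"  # arbitrary 21-nt example, not a real miRNA
perfect_site = reverse_complement(mirna)
print(predicted_outcome(mirna, perfect_site))
```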
Another role?
• Under conditions of cell stress, a miRNA may
be activating instead, as responding
regulatory proteins interpret the signal
differently
Seems odd . . .
• Why would a cell use this sort of mechanism?
It’s making an mRNA and then degrading it.
Should be easier to just not make it . . .
• But what if the cell is not in control of that
RNA, for example if it’s coming from an
invasive nucleic acid species under its own
promoter?
– Transposon control!!!
– piRNA (piwi RNA) are a whole class of small
RNAs that control transposons
– Invasive RNA was a big problem in the RNA world!
Occam’s razor
All other things being equal, the simplest
solution is the best
My alternative: If a biological principle is simple,
it’s probably wrong.
Evolution tends to higher complexity, as old
mechanisms are reused and there’s little
incentive to clean up.
Looking for new miRNAs
• Often found within stem-loop precursor structures
(hairpins)
• Associated (in the cell) with polysomes and other
structures
• Bioinformatics: unexpected sequence conservation in
noncoding region, or homology to miRNA in a closely
related species (works less often than you would
think)
• Identify candidate miRNA targets (TargetScan, by
Chris Burge’s group)
– A target mRNA usually has multiple target sites
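The idea of multiple target sites per target can be illustrated by scanning a sequence for repeated matches to a short stretch of the miRNA. The sketch below assumes a 7-nt seed at miRNA positions 2 to 8 and exact complementarity; both are simplifying assumptions, the sequences are made up, and this is not TargetScan's algorithm.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, utr, start=1, end=8):
    """0-based start positions in utr that exactly match the seed's reverse complement.

    start/end are Python slice bounds; the defaults select miRNA
    nucleotides 2-8 (1-based), an assumed seed definition.
    """
    site = reverse_complement(mirna[start:end])
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # step by 1 so overlapping sites are found
    return positions


mirna = "AUGGCUAACGUUCGAUCCGUA"        # arbitrary example, not a real miRNA
utr = "GGGUUAGCCACCCCUUAGCCAAA"        # made-up sequence carrying two seed matches
print(seed_sites(mirna, utr))
```

A real search would also have to tolerate the mismatches and bulges that make miRNA binding degenerate, which exact string matching cannot do.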
Problems with miRNAs
• Small! Unstable, hard to get large quantities
• Binding is degenerate, noncontiguous, and
includes not only mismatches but bulges
• Actual sequence recognition only 15 or so
nucleotides (noncontiguous), varies by target
• Essential “seed” element not well
characterized
• Sequences not well conserved across
species
• miRNA microarrays: statistics problematic