Transcript lecture 3
Repetitive elements
Significance
Evolutionary ‘signposts’
Passive markers for mutation assays
Actively reshape the genome by creating
new genes and shuffling or modifying
existing ones
Chromosome structure and dynamics
Provide tools for medical, forensic,
genetic analysis
Repetitive sequences
AAA, ATATATAT, CGTCGTCGT etc.
5 main classes
1) Tandem repeats
2) Transposon-derived repeats
3) Segmental duplications
4) Processed pseudogenes
1) Tandem repeats
Blocks of tandem repeats at
subtelomeres
pericentromeres
Short arms of acrocentric
chromosomes
Ribosomal gene clusters
Tandem / clustered repeats
Broadly divided into 4 types based on size
class             Size of repeat   Repeat block   Major chromosomal location
Satellite         5-171 bp         > 100kb        centromeric heterochromatin
minisatellite     9-64 bp          0.1-20kb       Telomeres
microsatellites   1-13 bp          < 150 bp       Dispersed
HMG3 by Strachan and Read pp 265-268
Satellites
Large arrays of repeats
Some examples
Satellite 1,2 & 3
- found in all
chromosomes
α satellite (alphoid DNA)
β satellite
HMG3 by Strachan and Read pp 265-268
Minisatellites
Moderate sized arrays of repeats
Some examples
Hypervariable minisatellite DNA
- core of GGGCAGGAXG
- found in telomeric regions
- used in original DNA
fingerprinting technique by Alec
Jeffreys
HMG3 by Strachan and Read pp 265-268
Microsatellites
VNTRs - variable number of tandem repeats, SSR - simple sequence repeats
1-13 bp repeats e.g. (A)n ; (AC)n
2% of genome (dinucleotides - 0.5%)
Used as genetic markers (especially for disease mapping)
Individual genotype
HMG3 by Strachan and Read pp 265-268
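To make the (A)n / (AC)n notation concrete, here is a minimal sketch that scans a sequence for tandem runs of a 1-13 bp unit. The function name and the threshold of 4 copies are assumptions chosen for illustration, not values from the lecture.

```python
import re

def find_microsatellites(seq, max_unit=13, min_copies=4):
    """Return (start, unit, copies) for each tandem run of a 1..max_unit bp unit.

    min_copies is an illustrative threshold, not a value from the lecture.
    """
    # The lazy quantifier tries the shortest unit first, so AAAA is
    # reported as (A)4 rather than (AA)2.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits
```

For example, find_microsatellites("GGTT" + "AC" * 6 + "TTGA") reports a single (AC)6 run starting at position 4.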
Microsatellite genotyping
The most common way to detect microsatellites is to design PCR
primers that are unique to one locus in the genome and that anneal
on either side of the repeated portion.
A single pair of PCR primers will therefore work for every individual
in the species and produce a different sized product for each
microsatellite allele length.
Fig 7.7 HMG3 by Strachan and Read pp 190
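The link between repeat number and PCR product size can be sketched as below. The flanking sequences and function names are hypothetical and chosen only for illustration. The point is that the flanks are constant, so product size differs between alleles only by the repeat tract.

```python
# Hypothetical unique flanking sequences containing the primer sites.
LEFT = "GATTACAGGCTTAGC"
RIGHT = "TTGCAAGCTGACCTA"

def pcr_product_size(left_flank, right_flank, unit, copies):
    """Amplicon length in bp: both flanks plus the repeat tract."""
    return len(left_flank) + len(unit) * copies + len(right_flank)

def genotype_sizes(allele_copies, unit="CA"):
    """Product sizes for an individual's two alleles, smallest first."""
    return sorted(pcr_product_size(LEFT, RIGHT, unit, n) for n in allele_copies)
```

With a CA repeat, two alleles that differ by three repeat units give products that differ by 6 bp, which is the size difference resolved on the gel.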
Microsatellite genotyping
CA repeat genotyping
Marker D17S800
Lanes A, B, C, D, E
Allele types
A (3,6)
B (1,5)
C (3,5)
D (2,5)
E (3,6)
N.B. ‘stutters’ or shadow bands
Caused by strand slippage
Fig 7.8 HMG3
strand slippage during replication
Fig 11.5 HMG3 by Strachan and Read pp 330
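As a toy illustration of how slippage changes repeat number, the sketch below treats each round of replication as a small chance of gaining or losing one repeat unit. The slippage probability, the equal chance of gain and loss, and the function names are all assumptions, not measured values.

```python
import random

def replicate(copies, slip_prob, rng):
    """One round of replication of a tract with `copies` repeat units."""
    if rng.random() < slip_prob:
        # Backward slippage inserts a unit, forward slippage deletes one.
        # Equal odds for each direction are assumed here.
        step = rng.choice((1, -1))
        return max(1, copies + step)
    return copies

def simulate(copies, rounds, slip_prob=0.01, seed=0):
    """Repeat number after each round, starting with the initial value."""
    rng = random.Random(seed)
    history = [copies]
    for _ in range(rounds):
        copies = replicate(copies, slip_prob, rng)
        history.append(copies)
    return history
```

The same one-unit steps, occurring during PCR rather than in vivo, are what produce the stutter bands one repeat unit away from the true allele.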
Repetitive elements…
2) Transposon-derived repeats
A.k.a. interspersed repeats
45% of genome
Arise mainly as a result of
transposition through either
a DNA or an RNA intermediate
4 main types
LINEs, SINEs, LTRs and DNA transposons
Transposon-derived repeats…
LINEs (long interspersed elements)
Among the most ancient elements of eukaryotic genomes
Autonomous transposition (reverse transcriptase)
~6-8kb long
Internal polymerase II promoter and 2 ORFs
3 related LINE families in humans
– LINE-1, LINE-2, LINE-3.
Believed to be responsible for retrotransposition
of SINEs and creation of processed pseudogenes
Nature (2001) pp879-880
HMG3 by Strachan & Read pp268-272
Transposon-derived repeats…
SINEs (short interspersed elements)
Non-autonomous (successful freeloaders! ‘borrow’
RT from other sources such as LINEs)
~100-300bp long
Internal polymerase III promoter
No proteins
Share 3’ ends with LINEs
3 related SINE families in humans
– active Alu, inactive MIR and Ther2/MIR3.
Nature (2001) pp879-880
HMG3 by Strachan & Read pp268-272
LINEs and SINEs have preferred insertion sites
• In this example,
yellow represents the
distribution of mys (a
type of LINE) over a
mouse genome where
chromosomes are
orange. There are
more mys inserted in
the sex (X)
chromosomes.
Try the link below to do an online experiment
which shows how an Alu insertion
polymorphism has been used as a tool to
reconstruct the human lineage
http://www.geneticorigins.org/geneticorigins/pv92/intro.html
Transposon-derived repeats…
Long Terminal Repeats (LTR)
Repeats in the same orientation on both sides of the element
e.g. ATATATNNNNNNNATATAT
Autonomous or non-autonomous
Autonomous retroposons contain gag and pol genes,
which encode the protease, reverse
transcriptase, RNase H and integrase
Nature (2001) pp879-880
HMG3 by Strachan & Read pp268-272
Transposon-derived repeats…
DNA transposons (lateral transfer?)
Inverted repeats on both sides of the element
(the right end is the reverse complement of the left end)
e.g. ATGCNNNNNNNNNNNGCAT
From Genes VII by Lewin
Nature (2001) pp879-880
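The difference between the direct repeats flanking LTR elements and the inverted repeats flanking DNA transposons can be checked mechanically. This is a minimal sketch, assuming "inverted" means the right end is the reverse complement of the left end. The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_repeat_type(element, k):
    """Classify the k-bp ends of an element as 'direct', 'inverted' or 'none'."""
    left, right = element[:k], element[-k:]
    if left == right:
        return "direct"      # same sequence, same orientation (LTR-like)
    if right == reverse_complement(left):
        return "inverted"    # same sequence on the opposite strand
    return "none"
```

For instance, an element beginning GATTC and ending GATTC is classified as direct, while one beginning ATGC and ending GCAT is classified as inverted.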
Transposon derived repeats - major types
class            family                size      Copies*      % genome*
LINE             LINE-1 (Kpn family)   ~6.4kb    0.8 x 10^6   15.4
SINE             Alu                   ~0.3kb    1.3 x 10^6   10.7
LTR              e.g. HERV             ~1.3kb    0.7 x 10^6   7.9
DNA transposon   mariner               ~0.25kb   0.4 x 10^6   2.7
* Updated from HGP publications
HMG3 by Strachan & Read pp268-272
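A quick arithmetic check on the table: dividing the genome share of each family by its copy number gives a rough mean copy length. The haploid genome size of 3.2 x 10^9 bp is an assumption (it is not stated in the table), and the dictionary layout is illustrative only.

```python
GENOME_BP = 3.2e9  # assumed haploid genome size, not given in the table

# Values copied from the table: copy number, % of genome, full-length size.
REPEATS = {
    "LINE-1":  {"copies": 0.8e6, "pct_genome": 15.4, "full_length_bp": 6400},
    "Alu":     {"copies": 1.3e6, "pct_genome": 10.7, "full_length_bp": 300},
    "HERV":    {"copies": 0.7e6, "pct_genome": 7.9,  "full_length_bp": 1300},
    "mariner": {"copies": 0.4e6, "pct_genome": 2.7,  "full_length_bp": 250},
}

def mean_copy_length(entry, genome_bp=GENOME_BP):
    """Average bp per copy implied by the genome share and the copy number."""
    return entry["pct_genome"] / 100 * genome_bp / entry["copies"]

def total_pct(repeats=REPEATS):
    """Combined genome share of the listed families."""
    return sum(e["pct_genome"] for e in repeats.values())
```

Under these assumptions the implied mean LINE-1 copy is far shorter than the ~6.4kb full-length element, consistent with most LINE-1 copies being truncated, whereas the implied mean Alu copy is close to full length.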
3) Segmental duplications
Closely related sequence blocks at different
genomic loci
Transfer of 1-200kb blocks of genomic
sequence
Segmental duplications can occur within the same
chromosome (intrachromosomal) or between non-homologous
chromosomes (interchromosomal)
Not always tandemly arranged
Relatively recent
Segmental duplications
Interchromosomal
segments duplicated
among non-homologous
chromosomes
Intrachromosomal
duplications occur
within a chromosome / arm
Nature Reviews Genetics 2, 791-800 (2001);
Segmental duplications in chromosome 22
Segmental duplications - chromosome 7.
Nature Reviews Genetics 2, 791-800 (2001)
4) Processed pseudogenes
Repetitive sequences
AAA, ATATATAT, CGTCGTCGT etc.
5 main classes
1) Tandem repeats
2) Transposon-derived repeats
3) Segmental duplications
4) Processed pseudogenes
Insights from the HGP………
7) Repeat content
a) Age distribution
b) Comparison with other genomes
c) Variation in distribution of repeats
d) Distribution by GC content
e) Y chromosome
Nature (2001) 409: pp 879-891
Repeat content…….
a) Age distribution
Most interspersed repeats predate eutherian
radiation (confirms the slow rate of clearance
of nonfunctional sequence from vertebrate
genomes)
LINEs and SINEs have extremely long lives
2 major peaks of transposon activity
No DNA transposition in the past 50MYr
LTR retroposons teetering on the brink of
extinction
a) Age distribution
Overall decline in interspersed repeat activity in the
hominid lineage over the past 35-40MYr,
in contrast to the mouse genome, which is
younger and more dynamic
b) Comparison with other genomes
Higher density of
transposable elements
in euchromatic portion
of genome
Higher abundance of
ancient transposons
60% of interspersed repeats (IR)
are made up of LINE1 and Alu repeats,
whereas DNA transposons represent
only 6%
(a few human genes
appear likely to have
resulted from
horizontal transfer
from bacteria!!)
c) Variation in distribution of repeats
Some regions show either
High repeat density
e.g. chromosome Xp11 – a 525kb region shows
89% repeat density
Low repeat density
e.g. HOX homeobox gene cluster (<2% repeats)
(indicative of regulatory elements which have low
tolerance for insertions)
d) Distribution by GC content
High GC – gene rich ; High AT – gene poor
LINEs abundant in AT-rich regions
SINEs lower in AT-rich regions
Alu repeats in particular are retained in actively transcribed GC-rich
regions, e.g. chromosome 19 has 5% Alus compared to Y chromosome
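GC content, the quantity these comparisons rest on, is simply the fraction of G and C bases in a stretch of sequence. A minimal sketch, with illustrative function names and a window size left to the caller:

```python
def gc_fraction(seq):
    """Fraction of G+C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt

def gc_windows(seq, window):
    """GC fraction of consecutive non-overlapping windows along seq."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]
```

Plotting repeat density against the output of gc_windows for a chromosome is one way to see the LINE versus SINE trend described above.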
Repeat content…….
e) The Y chromosome !
Unusually young genome (high tolerance
to gaining insertions)
Mutation rate is 2.1X higher in male
germline
Possibly due to cell division rates or
different repair mechanisms
• Working draft published – Feb 2001
• Finished sequence – April 2003
• Annotation of genes going on
References
Text:
1) Human Molecular Genetics 3 by Strachan and
Read – Chapter 9 pp 265-268
Optional Reading
1) Batzer MA, Deininger PL. Alu repeats and human genomic diversity.
Nature Rev Genet 3 (5): 370-379, May 2002
2) Emanuel BS, Shaikh TH. Segmental duplications: an 'expanding'
role in genomic instability and disease. Nature Reviews Genetics 2,
791-800 (2001)
3) Nature (2001) 409: pp 879-891