Genome structure and organization
The genomes of living organisms vary enormously in size
Four classes of DNA polymorphisms
Single nucleotide polymorphism (SNP)
Single base-pair substitutions
Arise from mutagenic chemicals or mistakes in replication
Biallelic – only two alleles
2001 – over 5 million human SNPs identified
Most occur at anonymous loci
Useful as DNA markers
Fig. 11.2
Microsatellites
1 every 30,000 bp
Repeated units 2 – 5 bp in length
Mutate by replication error
Useful as highly polymorphic DNA markers
Fig. 11.3
Minisatellites
Repeating units 20-100 bp long
Total length of 0.5 – 20 kb
1 per 100,000 bp, or about 30,000 in whole genome
Fig. 11.4
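The density figures above can be checked with a line of arithmetic. This is a minimal sketch that assumes a haploid human genome of about 3 billion bp, a value the slides do not state; the function name is illustrative.

```python
# Assumed haploid human genome size; not given on the slide.
GENOME_BP = 3_000_000_000


def expected_loci(genome_bp: int, bp_per_locus: int) -> int:
    """Expected number of loci if one occurs, on average, every bp_per_locus bp."""
    return genome_bp // bp_per_locus


# Minisatellites: 1 per 100,000 bp
minisatellites = expected_loci(GENOME_BP, 100_000)
# Microsatellites: 1 every 30,000 bp
microsatellites = expected_loci(GENOME_BP, 30_000)

print(minisatellites, microsatellites)
```

Under that assumption the minisatellite count agrees with the "about 30,000 in whole genome" figure on the slide.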
Deletions, duplications, and insertions
Expand or contract the length of nonrepetitive DNA
Small deletions and duplications arise by unequal crossing over
Small insertions can also be caused by transposable elements
Much less common than other polymorphisms
Figure 11.5
Formation of haplotypes over time
SNP detection using Southern blots
Restriction fragment length polymorphisms (RFLPs) are size changes in fragments due to the loss or gain of a restriction site
Fig. 11.6
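How a single base change alters fragment sizes can be sketched with an in silico digest. The sequences below are made up for illustration, the EcoRI site (GAATTC) is just a convenient example, and the cut is placed at the start of the recognition site for simplicity.

```python
def digest(seq: str, site: str) -> list[int]:
    """Return fragment lengths after cutting seq at every occurrence of site.

    Simplification: the cut is placed at the first base of the site.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


SITE = "GAATTC"  # EcoRI recognition sequence

# Hypothetical alleles: allele 2 carries a SNP (A -> G) that destroys the first site.
allele_1 = "A" * 10 + "GAATTC" + "C" * 14 + "GAATTC" + "T" * 8
allele_2 = "A" * 10 + "GAGTTC" + "C" * 14 + "GAATTC" + "T" * 8

fragments_1 = digest(allele_1, SITE)
fragments_2 = digest(allele_2, SITE)
print(fragments_1, fragments_2)
```

Losing the site merges two fragments into one longer fragment, which is the size change an RFLP assay reads out.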
SNP detection by PCR
Must know sequence on either side of polymorphism
Amplify fragment
Expose to restriction enzyme
Gel electrophoresis
e.g., sickle-cell genotyping with a PCR-based protocol
Fig. 11.7
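The amplify, digest, and read-the-gel steps can be sketched for the sickle-cell case. This sketch assumes the commonly described assay in which the sickle mutation (GAG to GTG) removes an MstII site (CCTNAGG); the slide names neither the enzyme nor the primers, and the short amplicons below are illustrative stand-ins for a real PCR product.

```python
import re

# MstII recognition pattern CCTNAGG, where N is any base (assumed enzyme).
MSTII_SITE = re.compile(r"CCT[ACGT]AGG")

# Illustrative amplicons spanning the polymorphic codon (hypothetical product).
NORMAL = "ATGGTGCATCTGACTCCTGAGGAGAAGTCT"
SICKLE = "ATGGTGCATCTGACTCCTGTGGAGAAGTCT"


def is_cut(amplicon: str) -> bool:
    """True if the enzyme finds its site in the amplified fragment."""
    return MSTII_SITE.search(amplicon) is not None


def call_genotype(allele_a: str, allele_b: str) -> str:
    """Infer genotype from how many of the two amplified alleles are cut."""
    n_cut = int(is_cut(allele_a)) + int(is_cut(allele_b))
    return {2: "normal/normal", 1: "normal/sickle", 0: "sickle/sickle"}[n_cut]


print(call_genotype(NORMAL, SICKLE))
```

On a gel, a cut allele shows up as two shorter bands and an uncut allele as one full-length band, so a heterozygote shows all three.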
SNP detection by ASO
Very short probes (<21 bp) that hybridize to one allele or the other
Such probes are allele-specific oligonucleotides (ASOs)
Fig. 11.8
ASOs can determine genotype at any SNP locus
Fig. 11.9 a-c
Hybridized and labeled with ASO for allele 1
Hybridized and labeled with ASO for allele 2
Fig. 11.9 d, e
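The two-probe readout can be sketched as follows. The model is deliberately simple: a short probe is treated as hybridizing only when it is perfectly complementary to the target, which approximates stringent conditions. All sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(aso: str, target_strand: str) -> bool:
    """Simplified model: the probe binds only a perfectly complementary site."""
    return reverse_complement(aso) in target_strand


def aso_signals(strands: list[str], aso_1: str, aso_2: str) -> tuple[bool, bool]:
    """Signal (allele 1 probe, allele 2 probe) for one individual's two alleles."""
    return (
        any(hybridizes(aso_1, s) for s in strands),
        any(hybridizes(aso_2, s) for s in strands),
    )


# Hypothetical alleles differing at a single base (index 8: A vs T).
allele_1 = "TTGACCTGAGGAGAAGTC"
allele_2 = "TTGACCTGTGGAGAAGTC"
# 15-nt probes complementary to a window that spans the SNP.
aso_1 = reverse_complement(allele_1[2:17])
aso_2 = reverse_complement(allele_2[2:17])

print(aso_signals([allele_1, allele_2], aso_1, aso_2))
```

A homozygote lights up with one probe only, while a heterozygote gives a signal with both.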
Preimplantation embryo diagnosis of CF using ASO analysis
Fig. 11.1
High-throughput instruments, e.g., microarrays
Fig. 10.24
Large-scale multiplex ASO analysis with microarrays can detect BRCA1 mutations
Each column contains four ASOs that differ only at the nucleotide position under analysis
BRCA1 DNA from any one allele can hybridize to only one of the four ASOs in a column
Heterozygotes are easily detected
Fig. 11.10
Primer extension to detect SNPs
Mass spectrometer
Fig. 10.27
Microsatellite allele detection
Analysis of size differences
Fig. 11.12
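Because microsatellite alleles differ in repeat number, PCR products from different alleles differ in length by multiples of the repeat unit. A minimal sketch, with made-up flanking sequences and a CA repeat chosen only for illustration:

```python
import re


def longest_repeat_run(seq: str, unit: str) -> int:
    """Largest number of consecutive copies of unit found in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical flanks (where PCR primers would anneal) around the repeat.
LEFT, RIGHT = "GGTTGG", "TTGGAT"
allele_a = LEFT + "CA" * 12 + RIGHT
allele_b = LEFT + "CA" * 15 + RIGHT

repeats_a = longest_repeat_run(allele_a, "CA")
repeats_b = longest_repeat_run(allele_b, "CA")
size_difference = len(allele_b) - len(allele_a)
print(repeats_a, repeats_b, size_difference)
```

Three extra copies of a 2 bp unit give a 6 bp difference in product size, which is what gel electrophoresis resolves.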
Huntington’s disease is an example of a microsatellite triplet repeat in a coding region
Fig. 11.13
Minisatellite detection and DNA fingerprinting
1985 – Alec Jeffreys made two key findings
Each minisatellite locus is highly polymorphic
Most minisatellites occur at multiple sites around the genome
DNA fingerprint – pattern of simultaneous genotypes at a group of unlinked loci
Use restriction enzymes and Southern blots to detect length differences at minisatellite loci
Most useful minisatellites have 10 – 20 sites around genome and can be analyzed on one gel
Fig. 11.14
Minisatellite analysis
Fig. 11.15
DNA fingerprints can identify individuals and determine parentage
E.g., DNA fingerprints confirmed Dolly the sheep was cloned from an adult udder cell
Donor udder (U), cell culture from udder (C), Dolly’s blood cell DNA (D), and control sheep 1-12
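The two uses of a fingerprint, identity and parentage, reduce to simple comparisons of band patterns. In this sketch each fingerprint is a set of band sizes in bp. The values are invented and do not come from the Dolly gel, and a real analysis has to allow for measurement error and new mutations.

```python
def same_individual(a: set[int], b: set[int]) -> bool:
    """Identical band patterns are expected for samples from the same genome."""
    return a == b


def consistent_with_parents(child: set[int], mother: set[int], father: set[int]) -> bool:
    """Every band in the child should be inherited from one parent or the other."""
    return child <= (mother | father)


# Hypothetical band sizes (bp).
donor_udder = {2100, 3400, 5200, 7800, 9100}
clone_blood = {2100, 3400, 5200, 7800, 9100}
control_sheep = {1900, 3400, 4600, 7800, 8800}

mother = {2100, 3400, 6000}
father = {1500, 5200, 7800}
child = {2100, 5200, 7800}

print(same_individual(donor_udder, clone_blood))
print(consistent_with_parents(child, mother, father))
```

A clone should match its donor band for band, while an unrelated control shares only some bands by chance.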
Human Karyotype
(a) Complete set of human chromosomes stained with Giemsa dye shows bands
(b) Ideograms show idealized banding pattern
Fig. 10.5 a
Chromosome 7 at three levels of resolution
Fig. 10.5 b
FISH protocol for top-down approach
DNA hybridization and restriction mapping – a bottom-up approach
Fig. 10.7
Identifying and isolating a set of overlapping fragments from a library
Two approaches
Linkage maps used to derive a physical map
Set of markers less than 1 cM apart
Use markers to retrieve fragments from library by hybridization
Construct contigs – two or more partially overlapping cloned fragments
Chromosome walk by using ends of unconnected contigs to probe library for fragments in unmapped regions
Physical mapping techniques
Direct analysis of DNA
Overlapping clones aligned by restriction mapping
Sequence-tagged sites (STSs)
High-density linkage mapping to build overlapping set of genomic clones
Fig. 10.8
Physical mapping of overlapping genomic clones without linkage information
Fig. 10.10
Physical mapping by analysis of STSs
Fig. 10.11
Each STS represents a unique segment of the genome amplified by PCR.
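Since each STS marks a unique site, two clones that both test positive for the same STS must overlap. A minimal sketch of grouping clones into contigs on that basis follows; the clone and STS names are hypothetical, and the sketch does not try to order clones within a contig.

```python
def build_contigs(clone_sts: dict[str, set[str]]) -> list[set[str]]:
    """Group clones into contigs; clones sharing at least one STS are taken to overlap."""
    contigs = []
    unassigned = set(clone_sts)
    while unassigned:
        # Start a new contig from the alphabetically first unassigned clone.
        frontier = [min(unassigned)]
        contig: set[str] = set()
        while frontier:
            clone = frontier.pop()
            if clone in contig:
                continue
            contig.add(clone)
            frontier.extend(
                other
                for other in unassigned
                if other not in contig and clone_sts[clone] & clone_sts[other]
            )
        unassigned -= contig
        contigs.append(contig)
    return contigs


# Hypothetical PCR results: which STSs each clone contains.
clones = {
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3"},
    "cloneC": {"sts3", "sts4"},
    "cloneD": {"sts9"},
}
contigs = build_contigs(clones)
print(contigs)
```

Clones A, B, and C chain together through shared STSs, while clone D stays unconnected until a chromosome walk or further markers link it.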
Sequence maps show the order of nucleotides in a cloned piece of DNA
Two strategies for sequencing the human genome
Hierarchical shotgun approach
Whole-genome shotgun approach
Shotgun – randomly generated overlapping insert fragments
Fragments from BACs
Fragments from shearing whole genome
Shearing DNA with sonication
Partial digestion with restriction enzymes
Hierarchical shotgun strategy
Used in publicly funded effort to sequence human genome
Shear 200 kb BAC clone into ~2 kb fragments
Sequence ends 10 times
Need about 1700 plasmid inserts per BAC and about 20,000 BACs to cover genome
Data from linkage and physical maps used to assemble sequence maps of chromosomes
Significant work to create libraries of each BAC and physically map BAC clones
Fig. 10.12
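The figures on this slide can be related to one another with a little arithmetic. The sketch below rests on three assumptions that the slide does not state: "10 times" means 10-fold sequence coverage of each BAC, both ends of every plasmid insert are read, and the genome is about 3 billion bp.

```python
BAC_BP = 200_000          # BAC insert size from the slide
COVERAGE = 10             # assumed meaning of "sequence ends 10 times"
INSERTS_PER_BAC = 1_700   # from the slide
READS_PER_INSERT = 2      # assumption: both ends of each insert are read
BACS = 20_000             # from the slide
GENOME_BP = 3_000_000_000 # assumed genome size

bases_needed = BAC_BP * COVERAGE
reads_per_bac = INSERTS_PER_BAC * READS_PER_INSERT
implied_read_length = bases_needed / reads_per_bac

# How many times the BAC set covers the genome (overlap between neighboring BACs).
bac_redundancy = BACS * BAC_BP / GENOME_BP

print(round(implied_read_length), round(bac_redundancy, 2))
```

Under these assumptions each read would be on the order of 600 bases, and the BAC set would span the genome somewhat more than once, which leaves room for the overlaps needed to order the clones.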
Whole-genome shotgun sequencing
Used by the private company Celera to sequence the whole human genome
Whole genome randomly sheared three times
Plasmid library constructed with ~2 kb inserts
Plasmid library with ~10 kb inserts
BAC library with ~200 kb inserts
Computer program assembles sequences into chromosomes
No physical map construction
Only one BAC library
Overcomes problems of repeat sequences
Fig. 10.13
Sequencing of the human genome
Most of draft took place during last year of project
Instrument improvements – 345,600 bp/day per instrument
Automated factory-like production line generated sufficient DNA to supply sequencers on a daily basis
Large sequencing centers with 100-300 instruments – up to 103,680,000 bp/day (about 3.1 billion bp, roughly one-fold coverage of the ~3 Gb human genome, in 30 days)
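The throughput figures can be checked directly. This sketch takes the per-instrument rate from the slide and assumes a genome of about 3 billion bp, which the slide does not state; the function names are illustrative.

```python
BP_PER_INSTRUMENT_PER_DAY = 345_600  # from the slide
GENOME_BP = 3_000_000_000            # assumed human genome size


def center_output(instruments: int) -> int:
    """Raw bases produced per day by a center with the given number of instruments."""
    return instruments * BP_PER_INSTRUMENT_PER_DAY


def days_for_coverage(fold: float, instruments: int, genome_bp: int = GENOME_BP) -> float:
    """Days of sequencing needed to reach the requested fold coverage."""
    return fold * genome_bp / center_output(instruments)


daily = center_output(300)
days_1x = days_for_coverage(1, 300)
days_10x = days_for_coverage(10, 300)
print(daily, round(days_1x, 1), round(days_10x))
```

With 300 instruments the daily output matches the 103,680,000 bp figure, and the time to a given coverage scales linearly with the fold coverage requested.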
High-throughput DNA sequencing
Fig. 10.23
Integration of linkage, physical, and sequence maps
Provides check on the correct order of each map against other two
SSR and SNP DNA linkage markers readily integrated into physical map by PCR analysis across insert clones in physical map
SSR, SNP (linkage maps), and STS markers (physical maps) have unique sequences 20 bp or more allowing placement on sequence map
Cloning human genes
A pedigree of the royal family descended from Queen Victoria, in which hemophilia A is segregating
Fig. 11.16 a
Blood-clotting cascade in which vessel damage causes a cascade of inactive factors to be converted to active factors
Fig. 11.16 b
Blood tests determine if active form of each factor in the cascade is present
Fig. 11.16 c
Techniques used to purify Factor VIII and clone the gene
Fig. 11.16 d
Positional Cloning – Step 1
Find extended families in which disease is segregating
Use panel of polymorphic markers spaced at 10 cM intervals across all chromosomes
About 300 markers total
Determine genotype for all individuals in families for each DNA marker
Look for linkage between a marker and disease phenotype
Once region of chromosome is identified, high-resolution mapping is performed with additional markers to narrow down region where gene may lie
Fig. 11.17
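One common way to quantify linkage between a marker and a disease phenotype is a LOD score. The slide does not name the statistic, so this is an illustrative sketch for the simplest case, where each meiosis can be scored unambiguously as recombinant or nonrecombinant. It compares the likelihood of the data at recombination fraction theta with the likelihood under free recombination (theta = 0.5):

LOD(theta) = log10[ theta^r * (1 - theta)^(n - r) / 0.5^n ]

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known data: r recombinants among n informative meioses."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    nonrecombinants = meioses - recombinants
    return (
        recombinants * math.log10(theta)
        + nonrecombinants * math.log10(1 - theta)
        - meioses * math.log10(0.5)
    )


# Hypothetical family data: 1 recombinant in 10 informative meioses.
score = lod_score(recombinants=1, meioses=10, theta=0.1)
print(round(score, 2))
```

Scores from independent families can be added together, and a total of 3 or more is the conventional threshold for declaring linkage.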
Positional cloning – Step 2: identifying candidate genes
Once region of chromosome has been narrowed down by linkage analysis to 1000 kb or less, all genes within are identified
Candidate genes
Usually about 17 genes per 1000 kb fragment
Identify coding regions
Computational analysis to identify conserved sequences between species
Computational analysis to identify exon-like sequences by looking for codon usage, ORFs, and splice sites
Appearance in one or more EST databases
Computational analysis of genomic sequences to identify candidate genes
Fig. 11.19
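One of the computational signals mentioned for spotting exon-like sequence is an open reading frame (ORF). A minimal ORF scan is sketched below. It searches only the forward strand, ignores introns and codon usage, and uses a toy sequence, so it illustrates the idea rather than serving as a gene finder.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) of forward-strand ORFs: ATG through an in-frame stop.

    end is exclusive and includes the stop codon; min_codons counts the
    codons before the stop, including the ATG.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


toy_sequence = "CCATGGCTGCAAAATAGGG"
orfs = find_orfs(toy_sequence)
print(orfs, [toy_sequence[s:e] for s, e in orfs])
```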
Gene expression patterns can pinpoint candidate genes
Look in public database of EST sequences representing certain tissues
Northern blot
RT-PCR
Northern blot example showing SRY candidate for testis-determining factor is expressed in testes, but not lung, ovary, or kidney
Fig. 11.20
Positional cloning – Step 3
Find the gene responsible for the phenotype
Expression patterns in affected individuals
Sequence differences
RNA expression assayed by Northern blot or RT-PCR with primers specific to candidate transcript
Look for misexpression (no expression, underexpression, overexpression)
Missense mutations identified by sequencing coding region of candidate gene from normal and abnormal individuals
Transgenic modification of phenotype
Insert the mutant gene into a model organism
Transgenic analysis can prove candidate gene is disease locus
Fig. 11.21
Example: Positional Cloning of Cystic Fibrosis Gene
Linkage analysis places CF on chromosome 7
Fig. 11.22 a
Northern blot analysis reveals only one of candidate genes is expressed in lungs and pancreas
Fig. 11.22 b
Every CF patient has a mutated allele of the CFTR gene on both chromosome 7 homologs
Location and number of mutations indicated under diagram of chromosome
Fig. 11.22 c
CFTR is a membrane protein. TMD-1 and TMD-2 are transmembrane domains.
Fig. 11.22 d
Proving CFTR is the right gene
Disease phenotype results from elimination of gene function
Cannot use transgenic technology
Instead perform CFTR gene “knockout” in mouse to examine phenotype without CFTR gene
Targeted mutagenesis
Genetic dissection of complex traits
Incomplete penetrance – when a mutant genotype does not always cause a mutant phenotype
No environmental factor associated with likelihood of breast cancer
Positional cloning identified BRCA1 as one gene causing breast cancer.
Only 66% of women who carry BRCA1 mutation develop breast cancer by age 55
Incomplete penetrance hampers linkage mapping and positional cloning
Solution – exclude all nondisease individuals from analysis
Requires many more families for study
Phenocopy
Disease phenotype is not caused by any inherited predisposing mutation
Decreases power to detect correlation between inheritance of disease locus and expression of the disease
Genetic heterogeneity
Mutations at more than one locus cause same phenotype
Multiple families used in most studies
If different families have different gene mutations, power of statistics to detect linkage will drop significantly
Polygenic inheritance
Two or more genes interact in the expression of phenotype
QTLs, or quantitative trait loci
Unlimited number of transmission patterns for QTLs
Discrete traits – penetrance may increase with number of mutant loci
Expressivity may vary with number of loci
Many other factors complicate analysis
Some mutant genes may have large effect
Mutations at some loci may be recessive while others are dominant or codominant