Genome structure and organization


The genomes of living organisms vary enormously in size
Four classes of DNA polymorphisms
Single nucleotide polymorphism (SNP)






Single base-pair substitutions
Arise from mutagenic chemicals or mistakes in replication
Biallelic – only two alleles
2001 – over 5 million human SNPs identified
Most occur at anonymous loci
Useful as DNA markers
Fig. 11.2
Microsatellites




1 every 30,000 bp
Repeated units 2 – 5 bp in length
Mutate by replication error
Useful as highly polymorphic DNA markers
Fig. 11.3
Minisatellites



Repeating units 20-100 bp long
Total length of 0.5 – 20 kb
1 per 100,000 bp, or about 30,000 in whole genome
Fig. 11.4
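The density figures quoted for microsatellites and minisatellites can be checked with simple arithmetic. The sketch below assumes a haploid human genome of about 3 billion bp, a value the slides do not state; the function name is illustrative only.

```python
# Assumption (not from the slides): haploid human genome of ~3,000,000,000 bp.
GENOME_BP = 3_000_000_000

def expected_loci(bp_per_locus: int, genome_bp: int = GENOME_BP) -> int:
    """Expected number of loci if one locus occurs every bp_per_locus bp."""
    return genome_bp // bp_per_locus

minisatellites = expected_loci(100_000)  # "1 per 100,000 bp"
microsatellites = expected_loci(30_000)  # "1 every 30,000 bp"
print(minisatellites, microsatellites)
```

Under that assumption the minisatellite density reproduces the slide's figure of about 30,000 loci in the whole genome.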
Deletions, duplications, and insertions




Expand or contract the length of nonrepetitive DNA
Small deletions and duplications arise by unequal crossing over
Small insertions can also be caused by transposable elements
Much less common than other polymorphisms
Figure 11.5
Formation of haplotypes over time

SNP detection using Southern blots

Restriction fragment length polymorphisms (RFLPs) are size changes in fragments due to the loss or gain of a restriction site
Fig. 11.6
SNP detection by PCR

Must know sequence on either side of polymorphism
Amplify fragment
Expose to restriction enzyme
Gel electrophoresis
e.g., sickle-cell genotyping with a PCR-based protocol
Fig. 11.7
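The amplify, digest, and run-on-a-gel steps can be sketched in code. This is a minimal illustration, not the protocol from the figure: the flanking sequences are invented, and the recognition pattern and cut offset are assumptions chosen so that a single base change removes the site.

```python
import re

# Assumed recognition pattern (any base at the 4th position) and cut offset.
SITE = re.compile(r"CCT[ACGT]AGG")
CUT_OFFSET = 2

# Hypothetical amplicons: invented flanks around the polymorphic site.
LEFT, RIGHT = "GATTACAGATTACA", "TTGACCATTGAC"
ALLELE_WITH_SITE = LEFT + "CCTGAGG" + RIGHT
ALLELE_WITHOUT_SITE = LEFT + "CCTGTGG" + RIGHT  # one base changed, site lost

def digest(amplicon: str) -> list[int]:
    """Fragment lengths after cutting the amplicon at every recognition site."""
    cuts = [m.start() + CUT_OFFSET for m in SITE.finditer(amplicon)]
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

def gel_bands(allele_a: str, allele_b: str) -> list[int]:
    """Distinct band sizes expected on a gel for a diploid sample."""
    return sorted(set(digest(allele_a) + digest(allele_b)), reverse=True)

print(gel_bands(ALLELE_WITH_SITE, ALLELE_WITH_SITE))        # two short bands
print(gel_bands(ALLELE_WITHOUT_SITE, ALLELE_WITHOUT_SITE))  # one uncut band
print(gel_bands(ALLELE_WITH_SITE, ALLELE_WITHOUT_SITE))     # all three bands
```

A heterozygote shows both the uncut band and the two cut fragments, which is what makes the three genotypes distinguishable on one gel.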
SNP detection by ASO


Very short probes (<21 bp) that hybridize to one allele or the other
Such probes are allele-specific oligonucleotides (ASOs)
Fig. 11.8
ASOs can determine genotype at any SNP locus
Fig. 11.9 a-c
Hybridized and labeled with ASO for allele 1
Hybridized and labeled with ASO for allele 2
Fig. 11.9 d, e
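ASO genotyping can be modeled as two yes/no hybridization tests, one per allele-specific probe. The sketch below makes a strong simplifying assumption: under stringent conditions a probe binds only a perfectly matching target, modeled here as an exact substring match on the same strand. The probe and sample sequences are hypothetical.

```python
# Hypothetical 19-base probes that differ only at the central position.
ASO_1 = "ACGTTGCAAGCTAGGTCCA"
ASO_2 = "ACGTTGCAATCTAGGTCCA"

def hybridizes(aso: str, target: str) -> bool:
    """Assumed model: a probe binds only if it matches the target exactly."""
    return aso in target

def genotype(sample_alleles: list[str]) -> str:
    """Call the genotype from the pattern of signals with the two ASOs."""
    signal_1 = any(hybridizes(ASO_1, allele) for allele in sample_alleles)
    signal_2 = any(hybridizes(ASO_2, allele) for allele in sample_alleles)
    calls = {
        (True, False): "homozygous for allele 1",
        (False, True): "homozygous for allele 2",
        (True, True): "heterozygous",
    }
    return calls.get((signal_1, signal_2), "no signal")

# Hypothetical sample DNA: each probe sequence embedded in invented flanks.
allele_1 = "GGA" + ASO_1 + "TTC"
allele_2 = "GGA" + ASO_2 + "TTC"
print(genotype([allele_1, allele_2]))
```

Short probes matter because a single mismatch in about 20 bp destabilizes the duplex enough to distinguish alleles; the exact-match rule stands in for that behavior.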
Preimplantation embryo diagnosis of CF
using ASO analysis
Fig. 11.1
High-throughput instruments, e.g., microarrays
Fig. 10.24
Large-scale multiplex ASO analysis with
microarrays can detect BRCA1 mutations



Each column contains four ASOs differing only at the nucleotide position under analysis
BRCA1 DNA from any one allele can hybridize to only one of the four ASOs in a column
Heterozygotes are easily detected
Fig. 11.10
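Reading one microarray column can be sketched as a thresholding step: one lit spot suggests a homozygote and two lit spots a heterozygote. The intensity scale and the 0.5 threshold below are assumptions for illustration, not values from the figure.

```python
def call_column(intensities: dict[str, float], threshold: float = 0.5) -> str:
    """Call the genotype at one position from the four ASO spot intensities.

    intensities maps each base (A, C, G, T) to a normalized signal.
    The threshold is an assumed cutoff, not an instrument value.
    """
    lit = sorted(base for base, value in intensities.items() if value >= threshold)
    if len(lit) == 1:
        return lit[0] * 2      # one spot lit: homozygous
    if len(lit) == 2:
        return "".join(lit)    # two spots lit: heterozygous
    return "no call"

print(call_column({"A": 0.90, "C": 0.05, "G": 0.10, "T": 0.02}))
print(call_column({"A": 0.60, "C": 0.04, "G": 0.70, "T": 0.03}))
```

Because each allele can light only one spot per column, two lit spots in a column can only come from a sample carrying two different alleles.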
Primer extension to detect SNPs
Mass spectrometer
Fig. 10.27
Microsatellite allele detection: analysis of size differences
Fig. 11.12
Huntington’s disease is an example of a microsatellite triplet repeat in a coding region
Fig. 11.13
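Microsatellite alleles differ in the number of repeat units, so a PCR product spanning the repeat changes length by one unit per repeat. The sketch below counts the longest uninterrupted run of a triplet; the flanking bases and the repeat counts are hypothetical.

```python
import re

def longest_repeat_run(seq: str, unit: str = "CAG") -> int:
    """Number of units in the longest uninterrupted run of `unit` in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)

def amplicon(n_repeats: int, unit: str = "CAG") -> str:
    """Hypothetical PCR product: invented flanks around n_repeats units."""
    return "TTGA" + unit * n_repeats + "CCGT"

short_allele = amplicon(21)
long_allele = amplicon(45)
print(longest_repeat_run(short_allele), longest_repeat_run(long_allele))
print(len(long_allele) - len(short_allele))  # size difference seen on a gel, in bp
```

The size difference between the two products is what gel electrophoresis resolves.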
Minisatellite detection and DNA fingerprinting

1985 – Alec Jeffreys made two key findings
Each minisatellite locus is highly polymorphic
Most minisatellites occur at multiple sites around the genome
DNA fingerprint – pattern of simultaneous genotypes at a group of unlinked loci
Use restriction enzymes and Southern blots to detect length differences at minisatellite loci
Most useful minisatellites have 10 – 20 sites around genome and can be analyzed on one gel


Fig. 11.14
Minisatellite analysis



Fig. 11.15
DNA fingerprints can identify individuals and determine parentage
E.g., DNA fingerprints confirmed Dolly the sheep was cloned from an adult udder cell
Donor udder (U), cell culture from udder (C), Dolly’s blood cell DNA (D), and control sheep 1-12
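Comparing fingerprints reduces to comparing sets of bands. The sketch below assumes idealized data: band sizes are exact, and new mutations and measurement error are ignored. The band sizes (in kb) are invented for illustration.

```python
def same_individual(bands_a: set[float], bands_b: set[float]) -> bool:
    """Identical band patterns, as expected for a clone and its donor."""
    return bands_a == bands_b

def consistent_with_parents(child: set[float], mother: set[float],
                            father: set[float]) -> bool:
    """Every band in the child must be present in at least one parent."""
    return child <= (mother | father)

# Hypothetical band sizes in kb.
donor = {2.1, 3.4, 5.0, 7.8, 9.2}
clone = {2.1, 3.4, 5.0, 7.8, 9.2}
control = {2.1, 4.0, 5.0, 6.6, 9.2}
print(same_individual(donor, clone), same_individual(donor, control))

mother = {2.1, 3.4, 6.0}
father = {4.4, 5.0, 9.2}
child = {2.1, 5.0, 6.0, 9.2}
print(consistent_with_parents(child, mother, father))
```

With many highly polymorphic bands, an exact match between unrelated individuals becomes very unlikely, which is why identical patterns support a clone and donor relationship.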
Human Karyotype


(a) Complete set of human chromosomes stained with Giemsa dye shows bands
(b) Ideograms show idealized banding pattern
Fig. 10.5 a
Chromosome 7 at three levels of resolution
Fig. 10.5 b
FISH protocol for top-down approach
DNA hybridization and restriction mapping – a bottom-up approach
Fig. 10.7
Identifying and isolating a set of overlapping fragments from a library

Two approaches

Linkage maps used to derive a physical map





Set of markers less than 1 cM apart
Use markers to retrieve fragments from library by hybridization
Construct contigs – two or more partially overlapping cloned fragments
Chromosome walk by using ends of unconnected contigs to probe library for fragments in unmapped regions
Physical mapping techniques



Direct analysis of DNA
Overlapping clones aligned by restriction mapping
Sequence tagged sites (STSs)
High density linkage mapping to build overlapping set of genomic clones
Fig. 10.8
Physical mapping of overlapping genomic clones without linkage information
Fig. 10.10
Physical mapping by analysis of STSs
Fig. 10.11
Each STS represents a unique segment of the genome amplified by PCR.
Sequence maps show the order of nucleotides in a cloned piece of DNA

Two strategies for sequencing the human genome
Hierarchical shotgun approach
Whole-genome shotgun approach

Shotgun – randomly generated overlapping insert fragments
Fragments from BACs
Fragments from shearing whole genome

Shearing DNA with sonication
Partial digestion with restriction enzymes

Hierarchical shotgun strategy
Used in publicly funded effort to sequence human genome





Shear 200 kb BAC clone into ~2 kb fragments
Sequence insert ends to about 10-fold coverage
Need about 1700 plasmid inserts per BAC and about 20,000 BACs to cover genome
Data from linkage and physical maps used to assemble sequence maps of chromosomes
Significant work to create libraries of each BAC and physically map BAC clones
Fig. 10.12
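The numbers on this slide can be roughly reproduced with coverage arithmetic. The sketch below assumes about 600 bp per sequencing read, two reads (one from each end) per plasmid insert, and a genome of about 3 billion bp; none of these three values is stated on the slide.

```python
import math

BAC_BP = 200_000           # from the slide: 200 kb BAC
COVERAGE = 10              # from the slide: ends sequenced to 10-fold coverage
BACS = 20_000              # from the slide
READ_BP = 600              # assumption: bases obtained per sequencing read
READS_PER_INSERT = 2       # assumption: one read from each end of an insert
GENOME_BP = 3_000_000_000  # assumption: haploid human genome size

def inserts_per_bac() -> int:
    """Plasmid inserts needed to reach the target coverage of one BAC."""
    bases_needed = COVERAGE * BAC_BP
    return math.ceil(bases_needed / (READS_PER_INSERT * READ_BP))

def bac_redundancy() -> float:
    """Total BAC sequence divided by genome size (overlap between clones)."""
    return BACS * BAC_BP / GENOME_BP

print(inserts_per_bac())           # close to the slide's "about 1700"
print(round(bac_redundancy(), 2))  # > 1 because neighboring BACs overlap
```

Under these assumptions the per-BAC count lands near the slide's figure, and the BAC count implies that neighboring clones overlap by roughly a third.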
Whole-genome shotgun sequencing
Used by the private company Celera to sequence the whole human genome

Whole genome randomly sheared three times







Plasmid library constructed with ~2 kb inserts
Plasmid library with ~10 kb inserts
BAC library with ~200 kb inserts
Computer program assembles sequences into chromosomes
No physical map construction
Only one BAC library
Overcomes problems of repeat sequences
Fig. 10.13
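The assembly step can be illustrated with a toy greedy overlap-merge. This is a deliberately simplified stand-in, not the program used for the human genome: it assumes error-free reads from one strand and no repeats, and it ignores the paired-end and insert-size information that real assemblers use to get across repeat sequences. The sequence is invented.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a equal to a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

# Hypothetical 25-base "genome" sheared into three overlapping reads.
genome = "ATGGCGTACGTTAGCCTAGGATCCA"
reads = [genome[12:25], genome[0:10], genome[6:16]]
print(greedy_assemble(reads))
```

A greedy merge like this fails when the same sequence occurs at several places in the genome, which is the repeat problem the slide refers to.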
Sequencing of the human genome

Most of the draft was produced during the last year of the project
Instrument improvements – 345,600 bp/day
Automated factory-like production line generated sufficient DNA to supply sequencers on a daily basis
Large sequencing centers with 100-300 instruments – 103,680,000 bp/day (10-fold coverage in 30 days)
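The throughput figures can be checked with a few lines of arithmetic. The sketch assumes a genome of about 3 billion bp and a center running 300 instruments, the top of the range quoted; both are assumptions for this calculation only.

```python
BP_PER_INSTRUMENT_PER_DAY = 345_600  # from the slide
INSTRUMENTS = 300                    # assumption: top of the 100-300 range
GENOME_BP = 3_000_000_000            # assumption: haploid human genome size

def center_bp_per_day() -> int:
    """Daily output of one sequencing center."""
    return BP_PER_INSTRUMENT_PER_DAY * INSTRUMENTS

def fold_coverage(days: int, centers: int = 1) -> float:
    """Genome coverage reached after `days` by `centers` such centers."""
    return center_bp_per_day() * days * centers / GENOME_BP

def days_for_coverage(fold: float, centers: int = 1) -> float:
    """Days needed to reach `fold` coverage of the genome."""
    return fold * GENOME_BP / (center_bp_per_day() * centers)

print(center_bp_per_day())
print(round(fold_coverage(30), 2))
print(round(days_for_coverage(10)))
```

Under these assumptions a single 300-instrument center matches the 103,680,000 bp/day figure and reaches roughly 1-fold coverage in 30 days, so the quoted 10-fold coverage in 30 days presumably rests on assumptions not captured in this simple model, such as the combined output of about ten such centers.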

High-throughput DNA sequencing
Fig. 10.23
Integration of linkage, physical, and sequence maps



Provides check on the correct order of each map against other two
SSR and SNP DNA linkage markers readily integrated into physical map by PCR analysis across insert clones in physical map
SSR, SNP (linkage maps), and STS markers (physical maps) have unique sequences 20 bp or more allowing placement on sequence map
Cloning human genes
A pedigree of the royal family descended from Queen Victoria, in which hemophilia A is segregating
Fig. 11.16 a
Blood-clotting cascade in which vessel damage causes a cascade of inactive factors to be converted to active factors
Fig. 11.16 b
Blood tests determine if active form of each factor in the cascade is present
Fig. 11.16 c
Techniques used to purify Factor VIII and clone the gene
Fig. 11.16 d
Positional Cloning – Step 1


Find extended families in which disease is segregating
Use panel of polymorphic markers spaced at 10 cM intervals across all chromosomes
About 300 markers total
Determine genotype for all individuals in families for each DNA marker
Look for linkage between a marker and disease phenotype

Once region of chromosome is identified, high-resolution mapping is performed with additional markers to narrow down region where gene may lie
Fig. 11.17
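One common way to quantify "linkage between a marker and disease phenotype" is a LOD score, a statistic the slides do not name. The sketch below covers only the simplest case, assuming phase-known meioses in which each offspring can be scored as recombinant or nonrecombinant; the counts in the example are invented.

```python
import math

def lod(recombinants: int, meioses: int, theta: float) -> float:
    """log10 odds of linkage at recombination fraction theta vs. no linkage.

    Assumes phase-known, fully informative meioses (a simplification).
    theta must satisfy 0 < theta <= 0.5.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    nonrecombinants = meioses - recombinants
    linked = recombinants * math.log10(theta) + nonrecombinants * math.log10(1 - theta)
    unlinked = meioses * math.log10(0.5)
    return linked - unlinked

# Hypothetical family data: 1 recombinant among 10 informative meioses.
for theta in (0.05, 0.1, 0.2, 0.5):
    print(theta, round(lod(1, 10, theta), 3))
```

Scores from independent families add together, which is one reason extended families and many individuals are collected before a region is declared linked.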
Positional cloning – Step 2: identifying candidate genes

Once region of chromosome has been narrowed down by linkage analysis to 1000 kb or less, all genes within are identified
Candidate genes
Usually about 17 genes per 1000 kb fragment
Identify coding regions
Computational analysis to identify conserved sequences between species
Computational analysis to identify exon-like sequences by looking for codon usage, ORFs, and splice sites
Appearance in one or more EST databases
Computational analysis of genomic sequences to identify candidate genes
Fig. 11.19
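Of the computational signals listed above, the open reading frame (ORF) scan is the simplest to sketch. The code below is a minimal illustration that searches only the three forward reading frames for an ATG followed in frame by a stop codon; it does not model codon usage, splice sites, or the reverse strand, and the test sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) of ORFs in the three forward frames.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts the codons before the stop, including ATG.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# Hypothetical sequence: ATG, four GCT codons, then a TAA stop, in frame 2.
seq = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

In eukaryotic genomic DNA, introns interrupt ORFs, so a scan like this is only one line of evidence alongside conservation and EST matches.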
Gene expression patterns can pinpoint candidate genes

Look in public database of EST sequences representing certain tissues
Northern blot
RT-PCR
Northern blot example showing SRY candidate for testis-determining factor is expressed in testes, but not lung, ovary, or kidney
Fig. 11.20
Positional cloning – Step 3

Find the gene responsible for the phenotype

Expression patterns in affected individuals
RNA expression assayed by Northern blot or RT-PCR with primers specific to candidate transcript
Look for misexpression (no expression, underexpression, overexpression)

Sequence differences
Missense mutations identified by sequencing coding region of candidate gene from normal and abnormal individuals

Transgenic modification of phenotype
Insert the mutant gene into a model organism
Transgenic analysis can prove candidate gene is disease locus
Fig. 11.21
Example: Positional Cloning of Cystic Fibrosis Gene

Linkage analysis places CF on chromosome 7
Fig. 11.22 a
Northern blot analysis reveals only one of candidate genes is expressed in lungs and pancreas
Fig. 11.22 b
Every CF patient has a mutated allele of the CFTR gene on both chromosome 7 homologs
Location and number of mutations indicated under diagram of chromosome
Fig. 11.22 c
CFTR is a membrane protein. TMD-1 and TMD-2 are transmembrane domains.
Fig. 11.22 d
Proving CFTR is the right gene

Disease phenotype results from elimination of gene function


Cannot use transgenic technology
Instead perform CFTR gene “knockout” in mouse to examine phenotype without CFTR gene

Targeted mutagenesis
Genetic dissection of complex traits
Incomplete penetrance – when a mutant genotype does not always cause a mutant phenotype

No environmental factor associated with likelihood of breast cancer
Positional cloning identified BRCA1 as one gene causing breast cancer

Only 66% of women who carry BRCA1 mutation develop breast cancer by age 55
Incomplete penetrance hampers linkage mapping and positional cloning

Solution – exclude all nondisease individuals from analysis
Requires many more families for study

Phenocopy

Disease phenotype is not caused by any inherited predisposing mutation

Decreases power to detect correlation between inheritance of disease locus and expression of the disease
Genetic heterogeneity

Mutations at more than one locus cause same phenotype
Multiple families used in most studies
If different families have different gene mutations, power of statistics to detect linkage will drop significantly


Polygenic inheritance

Two or more genes interact in the expression of phenotype

QTLs, or quantitative trait loci

Unlimited number of transmission patterns for QTLs
Discrete traits – penetrance may increase with number of mutant loci
Expressivity may vary with number of loci
Many other factors complicate analysis
Some mutant genes may have large effect
Mutations at some loci may be recessive while others are dominant or codominant