Ch 18 - Quia

Genomics
Chapter 18
Mapping Genomes
Genome maps can be divided into two types
-Genetic maps
-Abstract maps that place the relative
location of genes on chromosomes
based on recombination frequency
-Physical maps
-Use landmarks within DNA sequences,
ranging from restriction sites to the
actual DNA sequence
Physical Maps
Distances between “landmarks” are measured
in base-pairs
-1000 basepairs (bp) = 1 kilobase (kb)
Knowledge of DNA sequence is not necessary
There are three main types of physical maps
-Restriction maps
-Cytological maps
-Radiation hybrid maps
Physical Maps
Restriction maps
-The first physical maps
-Based on distances between restriction
sites
-Overlap between smaller segments can be
used to assemble them into a contig
-Continuous segment of the genome
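The contig idea can be sketched in a few lines of Python. This is only an illustration: it assumes each cloned segment has already been placed at hypothetical (start, end) coordinates in base pairs, whereas real restriction mapping infers overlap from shared restriction-fragment patterns.

```python
def merge_into_contigs(segments):
    """Merge overlapping (start, end) segments, in bp, into contigs."""
    contigs = []
    for start, end in sorted(segments):
        if contigs and start <= contigs[-1][1]:
            # Segment overlaps (or touches) the current contig: extend it.
            contigs[-1][1] = max(contigs[-1][1], end)
        else:
            # No overlap: this segment starts a new contig.
            contigs.append([start, end])
    return [tuple(c) for c in contigs]


# Hypothetical segment coordinates, for illustration only.
segments = [(0, 4000), (3500, 9000), (8000, 12000), (20000, 26000)]
contigs = merge_into_contigs(segments)
print(contigs)
```

The first three segments overlap and collapse into one contig; the last one has no overlap and stays separate, leaving a gap in the map.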
Physical Maps
Cytological maps
-Employ stains that generate reproducible
patterns of bands on the chromosomes
-Divide chromosomes into subregions
-Provide a map of the whole genome, but
at low resolution
-Cloned DNA is correlated with map using
fluorescent in situ hybridization (FISH)
Physical Maps
Radiation hybrid maps
-Use radiation to fragment chromosomes
randomly
-Fragments are then recovered by fusing
irradiated cell to another cell
-Usually a rodent cell
-Fragments can be identified based on
banding patterns or FISH
Physical Maps
Sequence-tagged sites
-An STS is a small stretch of DNA that is
unique in the genome
-Only 200-500 bp
-Boundary is defined by PCR primers
-Identified using any DNA as a template
-STSs essentially provide a scaffold for
assembling genome sequences
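A rough sketch of how a primer pair defines an STS: the forward primer marks one boundary and the reverse complement of the reverse primer marks the other. The sequences and the function name `find_sts` are made up for illustration, and the toy template is far shorter than a real 200-500 bp STS.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sts(template, forward_primer, reverse_primer):
    """Return (start, end, length) of the region bounded by the primers, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the
    # template strand we look for its reverse complement.
    site = reverse_complement(reverse_primer)
    site_pos = template.find(site, start + len(forward_primer))
    if site_pos == -1:
        return None
    end = site_pos + len(site)
    return start, end, end - start


# Hypothetical template and primers, for illustration only.
template = "TT" + "GACCTAGG" + "CATCGATCGG" + "ATTACAGG" + "CTTAACGT"
sts = find_sts(template, "GACCTAGG", "CCTGTAAT")
print(sts)
```

If the same primer pair gives a product from two different clones, both clones contain that STS and therefore overlap.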
Genetic Maps
Genetic maps are measured in centimorgans
-1 cM = 1% recombination frequency
Linkage mapping can be done without
knowing the DNA sequence of a gene
-Limitations:
1. Genetic distance does not directly
correspond to actual physical distance
2. Not all genes have obvious
phenotypes
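As a worked example of the 1 cM = 1% rule, map distance can be computed from offspring counts. The counts below are invented, and the simple formula is only reasonable for closely linked genes, because double crossovers go undetected over longer distances.

```python
def map_distance_cm(recombinant, total):
    """Map distance in centimorgans = percent recombinant offspring."""
    if total <= 0:
        raise ValueError("total offspring must be positive")
    return 100.0 * recombinant / total


# Hypothetical cross: 18 recombinant offspring out of 200 scored.
distance = map_distance_cm(recombinant=18, total=200)
print(distance, "cM")
```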
Genetic Maps
The most common markers are short repeat
sequences called short tandem repeats,
or STR loci
-Differ in repeat length between individuals
-13 STR loci form the basis of the modern DNA
fingerprinting system developed by the FBI
-Cataloged in the CODIS database to
identify criminal offenders
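A minimal sketch of how individuals differ at an STR locus: count how many times the repeat unit occurs back to back. The motif GATA, the flanking sequence, and the repeat counts are hypothetical, and real genotyping measures PCR product length rather than scanning raw sequence.

```python
def longest_repeat_run(seq, motif):
    """Largest number of consecutive copies of motif found anywhere in seq."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for i in range(len(seq)):
        run, j = 0, i
        # Count back-to-back copies of the motif starting at position i.
        while seq.startswith(motif, j):
            run += 1
            j += len(motif)
        best = max(best, run)
    return best


# Hypothetical alleles at one locus from two individuals.
person_a = "CCTA" + "GATA" * 7 + "TTGC"
person_b = "CCTA" + "GATA" * 10 + "TTGC"
print(longest_repeat_run(person_a, "GATA"), longest_repeat_run(person_b, "GATA"))
```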
Genetic Maps
Genetic and physical maps can be correlated
-Any cloned gene can be placed within the
genome and can also be mapped genetically
Genetic Maps
All of these different kinds of maps are stored
in databases
-The National Center for Biotechnology
Information (NCBI) serves as the US
repository for these data and more
-Similar databases exist in Europe and
Japan
Whole Genome Sequencing
The ultimate physical map is the base-pair
sequence of the entire genome
-Requires the use of high-throughput automated
sequencing and computer analysis
Whole Genome Sequencing
Sequencers provide accurate sequences for
DNA segments up to 800 bp long
-To reduce errors, 5-10 copies of a genome
are sequenced and compared
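To get a feel for the scale, the number of reads needed for a given coverage can be estimated as coverage x genome size / read length. This is back-of-the-envelope arithmetic that assumes every read is a full 800 bp and that reads are spread evenly; the 3.2 Gb genome size is the human figure and is used here only as an example.

```python
def reads_needed(genome_bp, read_bp, coverage):
    """Reads required so that total sequenced bases = coverage x genome size."""
    total_bases = genome_bp * coverage
    return -(-total_bases // read_bp)  # ceiling division on integers


GENOME_BP = 3_200_000_000  # assumed example: human genome, 3.2 Gb
READ_BP = 800              # read length from the slide

for coverage in (5, 10):
    print(coverage, "x coverage:", reads_needed(GENOME_BP, READ_BP, coverage), "reads")
```

Under these assumptions, 5- to 10-fold coverage of a human-sized genome takes tens of millions of reads, which is why automation is required.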
Vectors used to clone large pieces of DNA:
-Yeast artificial chromosomes (YACs)
-Bacterial artificial chromosomes (BACs)
-Human artificial chromosomes (HACs)
-Are circular, at present
Whole Genome Sequencing
Clone-by-clone sequencing
-Overlapping regions between BAC clones
are identified by restriction mapping or STS
analysis
Shotgun sequencing
-DNA is randomly cut into smaller fragments,
cloned and then sequenced
-Computers put together the overlaps
-Sequence is not tied to other information
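How a computer might "put together the overlaps" can be sketched with a greedy merge: repeatedly join the two fragments that share the longest suffix-prefix overlap. This is a toy heuristic on made-up fragments, not how production assemblers work, and it breaks down on repeated sequence.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise by best overlap until no overlaps remain."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining fragments do not overlap: separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical reads from a 15 bp "genome".
reads = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
assembly = greedy_assemble(reads)
print(assembly)
```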
The Human Genome Project
Begun in 1990 by the International Human
Genome Sequencing Consortium
Craig Venter formed a private company, and
entered the “race” in May, 1998
In 2001, both groups published a draft
sequence
-Contained numerous gaps
The Human Genome Project
In 2004, the “finished” sequence was published
as the reference sequence (REF-SEQ) in
databases
-3.2 gigabasepairs
-1 Gb = 1 billion basepairs
-Contains 400-fold fewer gaps than the draft
-99% of euchromatic sequence
-Error rate = 1 per 100,000 bases
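A quick calculation puts the error rate in perspective. It assumes, for illustration only, that the rate of 1 per 100,000 applies uniformly across all 3.2 Gb.

```python
GENOME_BP = 3_200_000_000  # 3.2 Gb, from the slide
BASES_PER_ERROR = 100_000  # error rate of 1 per 100,000 bases

# Expected number of erroneous bases under the uniform-rate assumption.
expected_errors = GENOME_BP // BASES_PER_ERROR
print(expected_errors)
```

Even a "finished" sequence at this accuracy would still contain tens of thousands of incorrect bases.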
Characterizing Genomes
The Human Genome Project found fewer
genes than expected
-Initial estimate was 100,000 genes
-Number now appears to be about 25,000!
In general, eukaryotic genomes are larger and
have more genes than those of prokaryotes
-However, the complexity of an organism is
not necessarily related to its gene number
Finding Genes
Genes are identified by open reading frames (ORFs)
-An ORF begins with a start codon and
contains no stop codon for a distance long
enough to encode a protein
Sequence annotation
-The addition of information, such as ORFs,
to the basic sequence information
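The ORF definition above translates directly into a scan: in each reading frame, look for ATG and read codon by codon until a stop codon. This sketch only checks the forward strand, and the `min_codons` threshold is an arbitrary small value for the toy sequence; real annotation programs use much longer cutoffs and additional evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs; end includes the stop."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


# Hypothetical sequence: 2 bp of flank, a 4-codon ORF plus stop, 2 bp of flank.
seq = "CC" + "ATG" + "GCT" + "TGG" + "AAA" + "TAG" + "CC"
orfs = find_orfs(seq)
print(orfs)
```

Recording the resulting coordinates alongside the sequence is a simple form of annotation.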
Finding Genes
BLAST
-A search algorithm used to search NCBI
databases for homologous sequences
-Permits researchers to infer functions for
isolated molecular clones
Bioinformatics
-Use of computer programs to search for
genes, and to assemble and compare
genomes
Genome Organization
Genomes consist of two main regions
-Coding DNA
-Contains genes that encode proteins
-Noncoding DNA
-Regions that do not encode proteins
Coding DNA in Eukaryotes
Four different classes are found:
-Single-copy genes : Includes most genes
-Segmental duplications : Blocks of genes
copied from one chromosome to another
-Multigene families : Groups of related but
distinctly different genes
-Tandem clusters : Identical copies of genes
occurring together in clusters
-Also include rRNA genes
Noncoding DNA in Eukaryotes
Each cell in our bodies has about 6 feet of
DNA stuffed into it
-However, less than one inch is devoted to
genes!
Six major types of noncoding human DNA
have been described
Noncoding DNA in Eukaryotes
Noncoding DNA within genes
-Protein-encoding exons are embedded
within much larger noncoding introns
Structural DNA
-Called constitutive heterochromatin
-Localized to centromeres and telomeres
Simple sequence repeats (SSRs)
-One- to six-nucleotide sequences repeated
thousands of times
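A simple way to spot SSRs in a sequence is a regular expression with a backreference: capture a 1 to 6 nucleotide unit and require it to repeat immediately. The threshold of five total copies is an assumption chosen so the toy example works; genomic SSRs are far longer.

```python
import re

# A 1-6 nt unit (shortest possible, via the lazy quantifier) followed by
# at least 4 more copies of itself, i.e. 5 or more copies in total.
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def find_ssrs(seq):
    """Return (start, unit, copies) for each simple sequence repeat found."""
    hits = []
    for m in SSR_PATTERN.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Hypothetical sequence containing a (CA)8 repeat.
seq = "GGTC" + "CA" * 8 + "TTGACC"
ssrs = find_ssrs(seq)
print(ssrs)
```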
Noncoding DNA in Eukaryotes
Segmental duplications
-Consist of 10,000 to 300,000 bp that have
duplicated and moved
Pseudogenes
-Inactive genes
Noncoding DNA in Eukaryotes
Transposable elements (transposons)
-Mobile genetic elements
-Four types:
-Long interspersed elements (LINEs)
-Short interspersed elements (SINEs)
-Long terminal repeats (LTRs)
-Dead transposons
Expressed Sequence Tags
ESTs can identify genes that are expressed
-They are generated by sequencing the
ends of randomly selected cDNAs
ESTs have identified 87,000 cDNAs in
different human tissues
-But how can 25,000 human genes encode
three to four times as many proteins?
-Alternative splicing yields different
proteins with different functions
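The effect of alternative splicing can be illustrated by joining different subsets of one gene's exons. The exon names and sequences below are invented, and the sketch ignores how the cell actually chooses splice sites.

```python
def splice(exons, include):
    """Join the named exons, in order, into one mature mRNA sequence (as DNA)."""
    return "".join(exons[name] for name in include)


# Hypothetical gene with four exons.
exons = {"E1": "ATGGCT", "E2": "GAAACC", "E3": "TTTGGA", "E4": "CCTTAA"}

isoform_a = splice(exons, ["E1", "E2", "E4"])  # skips exon 3
isoform_b = splice(exons, ["E1", "E3", "E4"])  # skips exon 2
print(isoform_a)
print(isoform_b)
```

One gene yields two distinct transcripts here, which is how a genome can encode more proteins than it has genes.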
Variation in the Human Genome
Single-nucleotide polymorphisms (SNPs)
are sites where individuals differ by only one
nucleotide
-Must be found in at least 1% of the population
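The 1% rule can be checked from the alleles observed at one site. This sketch interprets the rule as "the less common allele has a frequency of at least 1%", which is an assumption about how the cutoff is applied; the allele counts are made up.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at a site (0.0 if only one)."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    second_count = counts.most_common(2)[1][1]
    return second_count / len(alleles)


def is_snp(alleles, threshold=0.01):
    """True if the variant is common enough to count as a SNP."""
    return minor_allele_frequency(alleles) >= threshold


# Hypothetical samples of 1,000 chromosomes at two sites.
common_variant = ["A"] * 970 + ["G"] * 30  # G at 3%
rare_variant = ["A"] * 998 + ["G"] * 2     # G at 0.2%
print(is_snp(common_variant), is_snp(rare_variant))
```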
Haplotypes are regions of the chromosome
that are not exchanged by recombination
-Tendency for genes not to be randomized is
called linkage disequilibrium
-Can be used to map genes
Genomics
Comparative genomics, the study of whole
genome maps of organisms, has revealed
similarities among them
-For example, over half of Drosophila genes
have human counterparts
Synteny refers to the conserved arrangements
of DNA segments in related genomes
-Allows comparisons of unsequenced
genomes
Genomics
Organellar genomes
-Mitochondria and chloroplasts are
descendants of ancient endosymbiotic
bacterial cells
-Over time, their genomes exchanged
genes with the nuclear genome
-Both organelles contain polypeptides
encoded by the nucleus
Genomics
Functional genomics is the study of the
function of genes and their products
DNA microarrays (“gene chips”) enable
the analysis of gene expression at the
whole-genome level
-DNA fragments are deposited on a slide
-Probed with labeled mRNA from
different sources
-Active/inactive genes are identified
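One common way to read a two-sample microarray is to compare the signals at each spot as a log2 ratio. The gene names, intensities, and the two-fold cutoff below are illustrative assumptions, and the sketch expects positive, background-corrected signals.

```python
import math


def compare_expression(signal_1, signal_2, min_fold=2.0):
    """Classify one spot by the log2 ratio of its two labeled signals."""
    log_ratio = math.log2(signal_1 / signal_2)
    cutoff = math.log2(min_fold)
    if log_ratio >= cutoff:
        return "higher in source 1"
    if log_ratio <= -cutoff:
        return "higher in source 2"
    return "similar"


# Hypothetical spot intensities: (source 1 signal, source 2 signal).
spots = {
    "geneA": (1600.0, 400.0),
    "geneB": (500.0, 500.0),
    "geneC": (100.0, 800.0),
}
calls = {gene: compare_expression(*pair) for gene, pair in spots.items()}
print(calls)
```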
Genomics
Transgenics is the creation of organisms
containing genes from other species
(transgenic organisms)
-Can be used to determine whether:
-A gene identified by an annotation
program is really functional in vivo
-Homologous genes from different
species have the same function
Proteomics
Proteomics is the study of the proteome
-All the proteins encoded by the genome
The transcriptome consists of all the RNA
that is present in a cell or tissue
Proteomics
Proteins are much more difficult to study
than DNA because of:
-Post-translational modifications
-Alternative splicing
However, databases containing the known
protein structural motifs exist
-These can be searched to predict the
structure and function of gene sequences
Proteomics
Protein microarrays are being used to study
large numbers of proteins simultaneously
-Can be probed using:
-Antibodies to specific proteins
-Specific proteins
-Small molecules
The yeast two-hybrid system has generated
large-scale maps of interacting proteins
Applications of Genomics
The genomics revolution will have a lasting
effect on how we think about living systems
The immediate impact of genomics is being
seen in diagnostics
-Identifying genetic abnormalities
-Identifying victims by their remains
-Distinguishing between naturally occurring
and intentional outbreaks of infections
Applications of Genomics
Genomics has also helped in agriculture
-Improvement in the yield and nutritional
quality of rice
-Doubling of world grain production in the last
50 years, with only a 1% cropland increase
Applications of Genomics
Genome science is also a source of ethical
challenges and dilemmas
-Gene patents
-Should the sequence/use of genes be
freely available or can it be patented?
-Privacy concerns
-Could one be discriminated against
because their SNP profile indicates
susceptibility to a disease?