Chapter 12 Genomes
Key Concepts
• 12.1 There Are Powerful Methods for
Sequencing Genomes and Analyzing
Gene Products
• 12.2 Prokaryotic Genomes Are Relatively
Small and Compact
• 12.3 Eukaryotic Genomes Are Large and
Complex
• 12.4 The Human Genome Sequence Has
Many Applications
Chapter 12 Opening Question
What does genome sequencing reveal
about dogs and other animals?
Concept 12.1 There Are Powerful Methods for Sequencing
Genomes and Analyzing Gene Products
The Human Genome Project was
proposed in 1986 to determine the normal
sequence of all human DNA.
The publicly funded effort was aided and
complemented by privately funded groups.
Methods used were first developed to
sequence prokaryotes and simple
eukaryotes.
A key to determining DNA sequences is to
run many sequencing reactions simultaneously
on a given chromosome after first breaking
the DNA into fragments.
The fragment sequences are put together
using larger, overlapping fragments.
Next-generation DNA sequencing uses
DNA replication and the polymerase chain
reaction (PCR).
One approach to next-generation DNA
sequencing:
• DNA is cut into 100 bp fragments.
• DNA is denatured by heat, and each single
strand then acts as a template for synthesis.
• Each fragment is attached to adapter
sequences and then to supports.
• Fragments are then amplified by PCR.
Amplified DNA attached to a solid substrate
is ready for sequencing:
• Fragments are denatured and primers,
DNA polymerase, and fluorescently
labeled nucleotides are added.
• DNA is replicated by adding one
nucleotide at a time.
• Fluorescent color of the particular
nucleotide is detected as it is added,
indicating the sequence of the DNA.
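The detection step above can be sketched in a few lines of Python. This is a toy model, not a description of any real instrument: the dye colors are hypothetical, strand orientation is ignored, and one nucleotide is assumed to be added per cycle without error.

```python
# Toy model of sequencing by synthesis: one labeled nucleotide per cycle.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye colors, chosen only for this illustration.
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE_COLOR.items()}


def synthesize_and_detect(template):
    """Return the color detected in each cycle as the new strand grows."""
    return [DYE_COLOR[COMPLEMENT[base]] for base in template]


def call_bases(colors):
    """Translate the recorded colors back into the newly made strand."""
    return "".join(COLOR_TO_BASE[color] for color in colors)


template = "TACGGATG"
colors = synthesize_and_detect(template)
new_strand = call_bases(colors)
print(colors)
print(new_strand)
```

Because each added nucleotide pairs with the template, the recorded colors give the sequence of the new strand, and from it the sequence of the template.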
The power of this method derives from the
fact that:
• It is fully automated and miniaturized.
• Millions of different fragments are
sequenced at the same time. This is called
massively parallel sequencing.
• It is an inexpensive way to sequence large
genomes.
Figure 12.1 DNA Sequencing (Part 1)
Figure 12.1 DNA Sequencing (Part 2)
Determining sequences is possible
because original DNA fragments are
overlapping.
Example: A 10 bp fragment cut three
different ways yields
TG, ATG, and CCTAC
AT, GCC, and TACTG
CTG, CTA, and ATGC
The correct sequence is ATGCCTACTG.
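The reasoning in this example can be checked with a short brute-force sketch: list every order in which each set of fragments could be joined, and keep only the sequence that all three sets can produce. Trying every order is an assumption made for illustration; it works for three fragments but is not how assembly software handles millions of reads.

```python
from itertools import permutations


def candidate_sequences(fragments):
    """Every sequence obtainable by joining the fragments in some order."""
    return {"".join(order) for order in permutations(fragments)}


def reconstruct(digests):
    """Keep only the sequences consistent with every way the DNA was cut."""
    common = None
    for fragments in digests:
        candidates = candidate_sequences(fragments)
        common = candidates if common is None else common & candidates
    return common


digests = [
    ["TG", "ATG", "CCTAC"],
    ["AT", "GCC", "TACTG"],
    ["CTG", "CTA", "ATGC"],
]
print(reconstruct(digests))  # {'ATGCCTACTG'}
```

Each set of fragments alone allows six possible orders; the overlaps between the different cuts are what leave a single answer.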
For genome sequencing the fragments are
called “reads.”
The field of bioinformatics was developed
to analyze DNA sequences using complex
mathematics and computer programs.
Figure 12.2 Arranging DNA Sequences
In functional genomics, sequences are used to
identify the functions of various parts of
the genome:
• Open reading frames—the coding regions
of the genes, recognized by start and stop
codons for translation, and sequences
indicating location of introns
• Amino acid sequences of proteins
• Regulatory sequences—promoters and
terminators for transcription
• RNA genes, including rRNA, tRNA, small
nuclear RNA, and microRNA genes
• Other noncoding sequences in various
categories
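As a rough illustration of how open reading frames are recognized by start and stop codons, the sketch below scans one strand in its three reading frames. It is a simplification under stated assumptions: it ignores introns and the complementary strand, and the function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ORF found on this strand.

    An ORF here runs from an ATG to the first in-frame stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # stop codon closes it
    return orfs


print(find_orfs("CCATGGCCTACTGATT"))
```

In the example string, only the third reading frame contains a start codon followed by an in-frame stop codon.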
Figure 12.3 The Genomic Book of Life
Comparative genomics compares a newly
sequenced genome with sequences from
other organisms.
It provides information about function of
sequences and can trace evolutionary
relationships.
Genetic determinism is the concept that a
person's phenotype is determined solely by
his or her genotype.
Many genes encode more than one
protein, through alternative splicing and
posttranslational modifications.
The proteome is the sum total of the proteins
produced by an organism; it is more complex
than the genome.
Figure 12.4 Proteomics (Part 1)
Two techniques are used to analyze
proteins and the proteome:
• Two-dimensional gel electrophoresis
separates proteins based on size and
electric charge.
• Mass spectrometry identifies proteins by
their atomic masses.
Proteomics seeks to identify and
characterize all of the expressed proteins.
Figure 12.4 Proteomics (Part 2)
The metabolome is the description of all of
the metabolites of a cell or organism:
• Primary metabolites are involved in normal
processes, such as pathways like
glycolysis. This category also includes
hormones and other signaling molecules.
• Secondary metabolites are often unique to
particular organisms or groups.
Examples: Antibiotics made by microbes,
and chemicals made by plants for defense.
Metabolomics aims to describe the
metabolome of a tissue or organism under
particular environmental conditions.
Analytical instruments can separate
molecules with different chemical
properties, and other techniques can
identify them.
Measurements can be related to
physiological states.
Figure 12.5 Genomics, Proteomics, and Metabolomics
Concept 12.2 Prokaryotic Genomes Are Relatively Small and
Compact
Features of bacterial and archaeal
genomes:
• Relatively small, with single, circular
chromosome
• Compact—mostly protein-coding regions
• Most do not contain introns
• Often carry plasmids, smaller circular DNA
molecules
Functional genomics assigns functions to
the products of genes.
The H. influenzae chromosome has 1,727 open
reading frames. When it was first
sequenced, only 58 percent of them coded for
proteins with known functions.
Since then, the roles of almost all other
proteins have been identified.
More genes are involved in each function in
the larger E. coli.
Table 12.1 Gene Functions in Three Bacteria
Next, the study of the smallest known
genome (M. genitalium) was completed.
Comparative genomics showed that M.
genitalium lacks many enzymes and must
obtain them from its environment.
It also has very few genes for regulatory
proteins—its flexibility is limited by its lack
of control over gene expression.
Transposons (or transposable elements)
are DNA segments that can move from
place to place in the genome.
They can move from one piece of DNA
(such as a chromosome), to another (such
as a plasmid).
If a transposon is inserted into the middle of
a gene, it will be transcribed along with the
gene and result in an abnormal protein.
Figure 12.6 DNA Sequences That Move (Part 1)
Figure 12.6 DNA Sequences That Move (Part 2)
Prokaryotes can be identified by their
growth in culture, but DNA can also be
isolated directly from environmental
samples.
In metagenomics, genetic diversity is
explored without isolating intact
organisms.
DNA can be cloned for “libraries” or
amplified and sequenced to detect known
and unknown organisms.
Figure 12.7 Metagenomics
Comparing genomes of prokaryotes and
eukaryotes:
Certain genes are present in all organisms
(universal genes), and some universal
gene segments are present in many
organisms.
This suggests that a minimal set of DNA
sequences is common to all cells.
Efforts to define a minimal genome involve
computer analysis of genomes, the study
of the smallest known genome (M.
genitalium), and using transposons as
mutagens.
Transposons can insert into genes at
random; the mutated bacteria are tested
for growth and survival, and DNA is
sequenced.
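The logic of the transposon experiment can be sketched as follows, assuming each mutant carries one insertion and is recorded as a (disrupted gene, survived) pair. The gene names and results are made up for illustration; they are not data from the experiment.

```python
def classify_genes(mutants):
    """Split disrupted genes into essential and dispensable sets.

    A gene counts as dispensable if any mutant with that gene disrupted
    still grew; otherwise it is treated as essential.
    """
    mutants = list(mutants)
    all_genes = {gene for gene, _ in mutants}
    dispensable = {gene for gene, survived in mutants if survived}
    return all_genes - dispensable, dispensable


# Hypothetical results: (gene hit by the transposon, did the mutant grow?)
mutants = [
    ("geneA", True),
    ("geneB", False),
    ("geneC", True),
    ("geneB", False),
]
essential, dispensable = classify_genes(mutants)
print("candidate minimal-genome genes:", essential)
```

Genes that were never hit by an insertion cannot be classified this way, which is one reason many mutants must be tested.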
Figure 12.8 Using Transposon Mutagenesis to Determine the Minimal Genome (Part 1)
Concept 12.3 Eukaryotic Genomes Are Large and Complex
There are major differences between
eukaryotic and prokaryotic genomes:
• Eukaryotic genomes are larger and have
more protein-coding genes.
• Eukaryotic genomes have more regulatory
sequences. Greater complexity requires
more regulation.
• Much of eukaryotic DNA is noncoding,
including introns, gene control sequences,
and repeated sequences.
Several model organisms have been
studied extensively.
Model organisms are easy to grow and
study in a laboratory, their genetics are
well studied, and their characteristics
represent a larger group of organisms.
Table 12.2 Representative Sequenced Genomes
The yeast, Saccharomyces cerevisiae:
Yeasts are single-celled eukaryotes.
Yeasts and E. coli appear to use about the
same number of genes to perform basic
functions.
However, the compartmentalization of the
eukaryotic yeast cell requires it to have
many more genes to target proteins to
organelles.
The nematode, Caenorhabditis elegans:
It is a millimeter-long soil roundworm made
up of about 1,000 cells, yet it has complex
organ systems.
Its genome is 8 times larger than that of
yeast, and it has about 3.5 times as many
protein-coding genes as do yeasts.
Other genes are for cell differentiation,
intercellular communication, and forming
tissues from cells.
The fruit fly, Drosophila melanogaster:
The fruit fly has ten times more cells and is
more complex than C. elegans,
undergoing more developmental stages.
It has a larger genome with many genes
encoding transcription factors needed for
development.
Figure 12.9 Functions of the Eukaryotic Genome
The thale cress, Arabidopsis thaliana:
The genomes of some plants are huge, but
A. thaliana has a much smaller genome.
Many of the genes found in fruit flies and
nematodes have orthologs—genes with
very similar sequences—in plants,
suggesting a common ancestor.
Arabidopsis has some genes related to
functions unique to plants:
• Photosynthesis, water transport, assembly
of the cell wall, and making molecules for
defense against microbes and herbivores
The basic plant genome may be determined
by comparing different plant genomes for
common sequences.
Figure 12.10 Plant Genomes
Eukaryotes have closely related genes
called gene families.
These arose over evolutionary time when
different copies of genes underwent
separate mutations.
For example: Genes encoding the globin
proteins in hemoglobin and myoglobin all
arose from a single common ancestral
gene.
During development, different members of
the globin gene family are expressed at
different times and in different tissues.
Hemoglobin of the human fetus contains
γ-globin, which binds O₂ more tightly than
adult hemoglobin.
Hemoglobins with different affinities are
provided at different stages of
development.
Figure 12.11 The Globin Gene Family
Many gene families include nonfunctional
pseudogenes (Ψ), resulting from
mutations that cause a loss of function
rather than a new one.
A pseudogene may simply lack a promoter,
and thus fail to be transcribed, or it may
lack a recognition site needed for the
removal of an intron.
Eukaryotic genomes have repetitive DNA
sequences:
• Highly repetitive sequences—short
sequences (< 100 bp) repeated thousands
of times in tandem; not transcribed
• Short tandem repeats (STRs) of 1–5 bp
are scattered around the genome and can
be used in DNA fingerprinting.
Moderately repetitive sequences are
repeated 10–1,000 times.
• Includes the genes for tRNAs and rRNAs
• Single copies of the tRNA and rRNA
genes are inadequate to supply the large
amounts of these molecules needed by
cells, so the genome has multiple copies
in clusters
Most moderately repeated sequences are
transposons.
Transposons are of two main types in
eukaryotes:
Retrotransposons (Class I) make RNA
copies of themselves, which are copied
into DNA and inserted in the genome.
• LTR retrotransposons have long terminal
repeats of DNA sequences
• Non-LTR retrotransposons do not have
LTR sequences—SINEs and LINEs are
types of non-LTR retrotransposons
DNA transposons (Class II) do not use
RNA intermediates.
They are excised from the original location
and inserted at a new location without
being replicated.
Table 12.3 Types of Sequences in Eukaryotic Genomes
Concept 12.4 The Human Genome Sequence Has Many
Applications
By 2010 the complete haploid genome
sequence had been determined for more than
ten individuals.
Soon, a human genome will be sequenced
for less than $1,000.
Some interesting facts about the human
genome:
• About 24,000 protein-coding genes make
up less than 2 percent of the 3.2 billion
base pair human genome.
• Each gene must code for several proteins,
and posttranscriptional mechanisms (e.g.,
alternative splicing) must account for the
observed number of proteins in humans.
• An average gene has 27,000 base pairs,
but size varies greatly as does the size of
the proteins.
• Virtually all human genes have many introns.
• 3.5 percent of the genome is functional but
noncoding; these sequences have roles in
gene regulation (e.g., microRNAs) or
chromosome structure.
• Over 50 percent of the genome is
transposons and other repetitive
sequences.
• Most of the genome (97 percent) is the
same in all people.
• Chimpanzees share 95 percent of the
human genome.
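A back-of-the-envelope check relates several of these numbers. It assumes that the 27,000 base pair average gene size includes introns and that the "less than 2 percent" figure refers to protein-coding sequence only; both readings are assumptions, and the variable names are chosen for illustration.

```python
GENOME_BP = 3_200_000_000    # haploid human genome, from the list above
N_GENES = 24_000             # protein-coding genes
AVG_GENE_BP = 27_000         # average gene size (assumed to include introns)
CODING_FRACTION_MAX = 0.02   # "less than 2 percent", taken as an upper bound

gene_span_bp = N_GENES * AVG_GENE_BP
gene_span_fraction = gene_span_bp / GENOME_BP
coding_bp_max = CODING_FRACTION_MAX * GENOME_BP
avg_coding_per_gene_max = coding_bp_max / N_GENES

print(f"genes span about {gene_span_fraction:.0%} of the genome")
print(f"coding sequence per gene is under {avg_coding_per_gene_max:,.0f} bp")
```

Under these assumptions genes span roughly a fifth of the genome, while the coding portion of an average gene is only about a tenth of its length; the rest would be mostly introns.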
Figure 12.12 Evolution of the Genome
Rapid genotyping technologies are being
used to understand the complex genetic
basis of diseases such as diabetes, heart
disease, and Alzheimer’s disease.
“Haplotype maps” are based on single
nucleotide polymorphisms (SNPs)—
DNA sequence variations that involve
single nucleotides.
SNPs are point mutations in a DNA
sequence.
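A minimal sketch of what finding SNPs means in practice: compare two sequences position by position and report where single nucleotides differ. It assumes the sequences are already aligned and of equal length; the example sequences are made up.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_in_a, base_in_b) for each single-base difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


person_1 = "ATGCCTACTG"
person_2 = "ATGCTTACTA"
print(find_snps(person_1, person_2))  # [(4, 'C', 'T'), (9, 'G', 'A')]
```

Positions are counted from zero, so the two differences fall at the fifth and tenth nucleotides.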
SNPs that differ are not all inherited as
independent alleles.
A set of SNPs that are close together on a
chromosome are inherited as a linked unit.
A piece of chromosome with a set of linked
SNPs is called a haplotype.
Analyses of human haplotypes have shown
that there are, at most, 500,000 common
variations.
Technologies to analyze SNPs in an
individual genome include next-generation
sequencing methods and DNA
microarrays.
A DNA microarray detects DNA or RNA
sequences that are complementary to and
hybridize with an oligonucleotide probe.
The aim is to find out which SNPs are
associated with specific diseases and
identify alleles that contribute to disease.
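The hybridization idea behind a microarray can be sketched as a test for complementarity: a probe binds only where the sample contains its reverse complement. This sketch assumes perfect matching with no tolerance for mismatches, and the probe and sample sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that would pair with seq."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe, sample):
    """True if the sample has a stretch perfectly complementary to the probe."""
    return reverse_complement(probe) in sample


probe = "GTAGGC"                    # complementary to GCCTAC
sample_allele_1 = "GGATGCCTACTGAA"  # carries GCCTAC
sample_allele_2 = "GGATGCTTACTGAA"  # single-nucleotide difference

print(hybridizes(probe, sample_allele_1))  # True
print(hybridizes(probe, sample_allele_2))  # False
```

A single-nucleotide difference is enough to prevent the match here, which is how a probe can distinguish two SNP alleles.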
Figure 12.13 SNP Genotyping and Disease
Genetic variation can affect an individual’s
response to a particular drug.
A variation could make a drug more or less
active in an individual.
Pharmacogenomics studies how the
genome affects the response to drugs.
This makes it possible to predict whether a
drug will be effective, with the objective of
personalizing drug treatments.
Figure 12.14 Pharmacogenomics
Comparisons of the proteomes of humans
and other eukaryotes have revealed
categories of proteins.
The human proteome includes a set of
1,300 proteins—also present in yeasts,
nematodes, and fruit flies—that carry out
the basic metabolic functions of the cell.
Proteomics can be useful in the diagnosis of
diseases by studying the pattern of
proteins made in a particular tissue at a
particular time.
Metabolomics may also be able to aid in
diagnostics when patterns of metabolites
can be associated with physiology.
DNA fingerprinting refers to a group of
techniques used to identify individuals by
their DNA.
Short tandem repeat (STR) analysis is most
common.
When several different STR loci are
analyzed, a unique pattern becomes
apparent.
It can be used to settle questions of
paternity and in criminal investigations.
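STR analysis can be illustrated by counting how many times a repeat unit occurs back to back at each locus and combining the counts into a profile. This is a simplified sketch: the locus names, sequences, and repeat units are invented, and each person is given one sequence per locus, although real profiles record two alleles per locus.

```python
def longest_tandem_run(dna, unit):
    """Largest number of back-to-back copies of unit found anywhere in dna."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(dna)):
        count = 0
        pos = start
        while dna.startswith(unit, pos):
            count += 1
            pos += len(unit)
        best = max(best, count)
    return best


def str_profile(dna_by_locus, unit_by_locus):
    """Map each locus to its repeat count."""
    return {
        locus: longest_tandem_run(dna, unit_by_locus[locus])
        for locus, dna in dna_by_locus.items()
    }


# Hypothetical loci and sequences, for illustration only.
units = {"locus1": "GATA", "locus2": "CA"}
person_a = {"locus1": "TTGATAGATAGATACC", "locus2": "GGCACACACATT"}
person_b = {"locus1": "TTGATAGATACC", "locus2": "GGCACACACATT"}

print(str_profile(person_a, units))  # {'locus1': 3, 'locus2': 4}
print(str_profile(person_b, units))  # {'locus1': 2, 'locus2': 4}
```

The two people match at one locus but differ at the other, which shows why several loci are analyzed together.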
Figure 12.15 DNA Fingerprinting (Part 1)
Figure 12.15 DNA Fingerprinting (Part 2)
Answer to Opening Question
• Genome sequencing in dogs led to the
identification of an SNP in the IGF-1 gene that
is important in determining size.
• Large and small breeds have different alleles
of the gene.
• A mutation in another gene is associated
with differences in the musculature of dogs
and cattle.
Figure 12.16 Muscular Gene (Part 1)
Figure 12.16 Muscular Gene (Part 2)