Transcript Lecture

Central Dogma
Information storage in biological molecules
DNA (copied by replication) -> transcription -> RNA -> translation -> Protein
DNA: deoxyribonucleic acid
phosphate
sugar (deoxyribose)
backbone
4 nitrogen bases
Purines
Pyrimidines
Blackburn and Gait, Nucleic Acids in Chemistry and Biology, Oxford University Press, New York, 1996.
T-A base pair
2 H bonds
C-G base pair
3 H bonds
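The pairing rules above (T with A, 2 H bonds; C with G, 3 H bonds) can be sketched in a few lines of Python. The function names and the example sequence are illustrative assumptions, not part of the lecture.

```python
# Watson-Crick pairing rules from the slide: A-T (2 H bonds), C-G (3 H bonds).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def duplex_h_bonds(seq):
    """Count the hydrogen bonds holding the duplex of seq and its complement together."""
    return sum(H_BONDS[base] for base in seq.upper())

strand = "ATGC"  # made-up example sequence
print(reverse_complement(strand))  # GCAT
print(duplex_h_bonds(strand))      # 10
```

A GC-rich duplex carries more hydrogen bonds per base pair than an AT-rich one of the same length.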
RNA: ribonucleic acid
phosphate
sugar (ribose)
backbone
Ribose has an OH at the 2' carbon where deoxyribose has an H!
4 bases: A, G, C, but U instead of T
Single stranded
Types of RNA:
mRNA: holds the Message transcribed from DNA, which will be translated into a protein
rRNA: a component of the Ribosome
tRNA: helps Transfer the message from the base sequence to protein
Note that rRNA and tRNA function in the cell as RNA
molecules and are never themselves translated into proteins
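The flow DNA -> RNA -> protein can be sketched as below. This is a minimal illustration that assumes the input is the coding (sense) strand, so the mRNA is the same sequence with U in place of T, and that uses the standard genetic code. The example sequence is made up.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon).replace("T", "U"): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand):
    """DNA coding strand -> mRNA (assumption: input is the sense strand)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = transcribe("ATGGCCAAATAA")  # made-up example gene
print(mrna)             # AUGGCCAAAUAA
print(translate(mrna))  # MAK
```

Only mRNA takes the second step. rRNA and tRNA would stop after `transcribe`.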
RNA secondary structure
especially
important for:
rRNA
tRNA
Chastain, M. and Tinoco Jr., I. (1991) Prog. Nucleic Acid Res. Mol. Biol. 41, 131-177.
How do you sequence an entire genome?
genomic DNA
-> sheared to 3 kb
-> clone library
-> insert ends sequenced to 8X coverage
-> computer assembly of sequence reads
-> finishing and closure, using PCR to close gaps and verify assembly
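The "computer assembly of sequence reads" step can be illustrated with a toy greedy overlap merger. This is only a sketch of the idea, not the assembler used for any genome mentioned here: it assumes error-free reads from a single strand and no repeats, and the reads are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Made-up reads tiling the toy "genome" ATGCGTACGTTAGC
print(greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"]))  # ['ATGCGTACGTTAGC']
```

When the loop stops with more than one contig, the gaps between them are what the finishing and closure step addresses.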
First complete genome sequence of a
free-living organism:
1995 Haemophilus influenzae
1,830,137 base pairs (1.8 Mbp), 1743 genes
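To get a feel for what 8X coverage means for a genome of this size, here is a back-of-the-envelope sketch. The 500 bp read length is an assumption for illustration, and the gap estimate uses the idealized Lander-Waterman model (reads placed uniformly at random), which real libraries only approximate.

```python
import math

def reads_needed(genome_bp, coverage, read_len_bp):
    """Number of reads so that reads * read length = coverage * genome size."""
    return math.ceil(coverage * genome_bp / read_len_bp)

def fraction_unsequenced(coverage):
    """Expected fraction of bases hit by no read (Poisson approximation)."""
    return math.exp(-coverage)

GENOME_BP = 1_830_137   # H. influenzae, from the slide
READ_LEN_BP = 500       # assumed read length, not from the lecture

n_reads = reads_needed(GENOME_BP, 8, READ_LEN_BP)
missing_bp = fraction_unsequenced(8) * GENOME_BP
print(n_reads)            # 29283
print(round(missing_bp))  # roughly 600 bases expected to be missed
```

Even at 8X a few hundred bases are expected to be missed.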
Since 1995 there has been an explosion in the
number of completed genomes
http://www.genomesonline.org/
Genome projects, with 2004 counts in parentheses:
Bacteria: 405 completed, 994 ongoing (2004: 147, 463)
Archaea: 31 completed, 64 ongoing (2004: 18, 26)
Eukaryotes: 44 completed, 631 ongoing (2004: 27, 414)
Metagenome projects: 62
Why?
Advances in sequencing technology—major
sequencing centers have enough capacity to
complete a bacterial genome in a day!
http://www.genomesonline.org/
Environmental Genomics
Idea: to look at DNA directly from the environment
One way: clone really large pieces
100s of liters of water -> concentrate on filter -> extract high-molecular-weight (HMW) DNA -> clone into BAC or fosmid
Large insert vectors
YAC: yeast artificial chromosome
Can clone DNA fragments up to 1000 kb insert size (average, 150 kb) in yeast cells. Issues with insert stability, high rates of chimerism, and difficulty in purifying vector DNA.
BAC: bacterial artificial chromosome
Can clone DNA fragments of 100- to 300-kb insert size (average, 150 kb) in Escherichia coli cells. Based on the naturally occurring F-factor plasmid found in the bacterium E. coli.
Fosmid/Cosmid: artificially constructed cloning vector containing the cos site of phage lambda. Cosmids can be packaged in lambda phage particles for infection into E. coli; this permits cloning of larger DNA fragments (up to 45 kb) than can be introduced into bacterial hosts in plasmid vectors.
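Insert size determines how many clones a library needs. A common estimate is the Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where f is insert size divided by genome size and P is the desired probability that any given sequence is in the library. The sketch below applies it to the insert sizes quoted above. The 1.8 Mbp genome and P = 0.99 are illustrative assumptions.

```python
import math

def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clarke-Carbon estimate of library size for a single genome."""
    f = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return math.ceil(math.log(1 - probability) / math.log(1 - f))

GENOME_BP = 1_830_137  # assumed: a genome the size of H. influenzae

for name, insert_bp in [("fosmid/cosmid (45 kb)", 45_000), ("BAC (150 kb)", 150_000)]:
    print(name, clones_needed(GENOME_BP, insert_bp))
```

Larger inserts need fewer clones for the same probability. An environmental sample contains many genomes at unequal abundance, so a real environmental library needs far more clones than this single-genome estimate.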
Fosmid Library Construction
CopyControl™ System (Epicentre Technologies)
Can be used for any vector type: plasmid, BAC, fosmid
Allows maintenance of cell stock at low vector copy number, and inducibility to high copy number when needed
Products of an environmental BAC library from California coastal waters
Beja et al. 2000, Environmental Microbiology 2: 516-529
Can screen BAC/fosmid libraries multiple ways:
Sequence ends of each BAC/fosmid
Probe with gene of interest (rRNA or functional gene)
  Sequence entire fosmid to see what else is there
PCR pooled library with primers for gene of interest
  Narrow down which fosmid gave positive band
  Sequence entire fosmid to see what else is there
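The "PCR pooled library, then narrow down" strategy can be sketched as follows. The lecture does not say how pools are built. This example assumes one common scheme, row and column pools of a 96-well plate, and `pcr_positive` is a hypothetical stand-in for running a real PCR.

```python
from itertools import product

ROWS = "ABCDEFGH"    # assumed 96-well plate layout: 8 rows
COLS = range(1, 13)  # by 12 columns

def pcr_positive(pool, positives):
    """Hypothetical PCR result: True if any clone in the pool carries the gene."""
    return any(well in positives for well in pool)

def screen_plate(positives):
    """Find candidate wells from 8 row pools + 12 column pools (20 PCRs, not 96)."""
    row_pools = {r: [f"{r}{c}" for c in COLS] for r in ROWS}
    col_pools = {c: [f"{r}{c}" for r in ROWS] for c in COLS}
    hit_rows = [r for r, pool in row_pools.items() if pcr_positive(pool, positives)]
    hit_cols = [c for c, pool in col_pools.items() if pcr_positive(pool, positives)]
    # Candidates are the intersections of positive rows and positive columns.
    return [f"{r}{c}" for r, c in product(hit_rows, hit_cols)]

print(screen_plate({"C7"}))        # ['C7']
print(screen_plate({"A1", "B2"}))  # ['A1', 'A2', 'B1', 'B2']
```

With more than one positive clone per plate the intersections are ambiguous, so the candidate wells need individual PCRs to confirm.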
Expression and activity of rhodopsin from environmental BAC
Beja et al 2000 Science
Comparison of environmental BACs to genomes of cultured organisms
Beja et al 2002 Nature 415: 630-633
Genomics in the Environment: a shotgun approach
Science, April 2, 2004
http://www.sorcerer2expedition.org/main.htm
Genomics in the Environment
Applied whole genome shotgun sequencing technique to 200 L of surface seawater
- 1.045 billion bases sequenced
- 1800 microbial species estimated to exist in sample, including 148 novel phylotypes
- 1.2 million previously unknown genes
- 12 microbial genomes partially assembled
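A rough calculation suggests why only a handful of genomes could be assembled from so much sequence. The sketch below spreads the 1.045 billion bases over the community. The 2 Mbp average genome size and the abundance fractions are assumptions for illustration, not figures from the study.

```python
def mean_depth(total_bases, fraction_of_community, genome_bp):
    """Average sequencing depth for an organism contributing the given share of the DNA."""
    return total_bases * fraction_of_community / genome_bp

TOTAL_BASES = 1.045e9  # from the slide
N_SPECIES = 1800       # from the slide
GENOME_BP = 2e6        # assumed average genome size

# If all 1800 species were equally abundant:
print(round(mean_depth(TOTAL_BASES, 1 / N_SPECIES, GENOME_BP), 2))  # 0.29

# A hypothetical dominant organism making up 5% of the DNA:
print(round(mean_depth(TOTAL_BASES, 0.05, GENOME_BP), 1))           # 26.1
```

Under these assumptions an evenly abundant species gets well below 1X depth, far short of the 8X used for a single-genome project. Only organisms that dominate the sample reach a depth where assembly is realistic.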
Whole genome sequencing
genomic DNA
-> sheared to 3 kb
-> clone library
-> insert ends sequenced to 8X coverage
-> computer assembly of sequence reads
-> finishing and closure, using PCR to close gaps and verify assembly
[Figure: phylogenetic tree of Prochlorococcus and marine Synechococcus. Scale bar: 0.1.
Low B/A, high-light-adapted Prochlorococcus, clade II: GP2, MIT9302, 75M 09, 75M 08, 75M 15, 75M 18, MIT9201, MIT9312, MIT9321, 75M 06, MIT9107, NS_000023, SB, MIT9314, AS9601, MIT9301, 175M 16, MIT9215, RS810, 75M 02, 75M 20, 75M 19, MB11E08, MB11F02
Low B/A, high-light-adapted Prochlorococcus, clade I: MED4, MIT9515
High B/A, low-light-adapted Prochlorococcus: NATL2A, PAC1, NATL1A, MIT9211, MIT9303, SS120, MIT9313
Marine Synechococcus: WH6501, WH8102, WH7805, WH8101]
Comparison of MED4 with environmental scaffolds
Venter et al 2004, Science
High degree of synteny between MED4 and
environmental Prochlorococcus scaffolds
MED4
Pro. SAR-1
Variation at the nt and aa level between MED4 and environmental Prochlorococcus scaffolds
% identity
gene    nt    aa
rbcL    87    100
glnA    83    96
idiA    91    91
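The gap between nt and aa identity (for example rbcL: 87% nt, 100% aa) is what you expect when most differences are synonymous codon changes. The sketch below computes both values for a pair of made-up, pre-aligned sequences. They are not real rbcL data, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a coding DNA sequence codon by codon (no stop handling needed here)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def percent_identity(a, b):
    """Percent of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Made-up sequences differing only at third codon positions
cultured = "ATGGCTAAAGGT"
environmental = "ATGGCCAAGGGA"

print(percent_identity(cultured, environmental))                        # 75.0
print(percent_identity(translate(cultured), translate(environmental)))  # 100.0
```

A gene like idiA, where nt and aa identity are both 91%, would instead carry a larger share of changes that alter the encoded amino acid.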