Primary Metabolism
Microbial life strategies
“Nomads”
~3 cm
“Settlers”
~1 mm
~10 cm
E. coli
Salmonella
Streptococcus
Actinomycetes
Cyanobacteria
Filamentous Fungi
Settlers have to protect themselves from the Nomads
Antibiotics
Siderophores
Signaling molecules
Specialized cell walls/lipids
Secondary metabolites, or natural products
Chemical Genetics
DNA sequence
Genes (Open Reading Frames)
Gene Cluster
[Figure: gene cluster broken down into domains and modules. Domains shown: KS, AT, mAT, ACP, KR, DH, ER, TE. Modules shown: Loading, Module 2, Module 3, Module 4, Module 5, Module 6, Module 7. Product: secondary metabolite (chemical structure drawing).]
The Genus
Salinispora
Jensen, Fenical
Lycopene/Iterative Type I PKS
nrps2
pks4
5-module NRPS
FAS/PKS-related
Cyclomarin
Enediyne PKS
FAS/PKS-related
S. arenicola
Rifamycin
5.68 Mb
Lym
Type I PKS
FAS/PKS-related
4-module NRPS
FAS/PKS-related
Udwary, Ziegler, Lapidus, Moore, Jensen
pks2
sid1
NRPS-related
Calicheamicin – part1
sid2
Calicheamicin – part2
Staurosporine
2-module PKS/NRPS
A Yersiniabactin Cluster?
unknown
transport
27698 bp
Yersiniabactin cluster from Yersinia pestis
The quick brown fox jumped over the lazy dog
Yoda speak: Over the lazy dog the quick brown fox jumped.
A fast tan dingo leaped above some sleepy mutt
Low sequence identity, similar “product”
Yersiniabactin cluster from Yersinia pestis
High “sequence identity”, but different “product”
The quick brown fox jumped over the lazy dog
The quick brown fox jumped over the lazy dog
[Figure: chemical structure drawing]
BLAST can’t really tell the difference
[Figure: chemical structure drawing]
Would lead to different, but similar, chemical products that may have different activity
Project #1: Operon/Cluster identification tool
Rationale: I need to locate conserved or related operons or gene clusters across multiple species. This is important for understanding genome evolution and for functional analysis.
Goal: Develop a rapid, automated method to identify conserved gene clusters or operons across genomes.
Personnel needs:
Programmer/scripter (Java or Perl is ideal)
Biologist with some knowledge of gene structure or genomics
Identifying Gene Clusters
Automation?
PKS: KS, AT, DH, ER, KR, ACP, Methyl transferase (MT), TE
NRPS: C, A, T (PCP), Epimerization, C-cyclization, N-methyltransferase, TE
Other: Prenyl transferase, Cytochrome P450, Type III PKS, KAS III
BLAST vs contig sequences
Evaluate proximity of hits
NP Chemistry
Enzymology
The hard part
Biochemistry
Examine predicted function of genes around “clusters” of hits.
Determine boundaries for putative clusters
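The "evaluate proximity of hits" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the `group_hits_by_proximity` function, the `max_gap` and `min_hits` parameters, and the example coordinates are all hypothetical and are not part of the slides. It assumes the BLAST hits have already been parsed into contig name, start, end, and query label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One parsed BLAST hit (hypothetical record layout)."""
    contig: str
    start: int
    end: int
    query: str  # label of the query gene/domain, e.g. "KS"


def group_hits_by_proximity(hits, max_gap=10_000, min_hits=2):
    """Group hits on the same contig that lie within max_gap bp of each other.

    Groups with fewer than min_hits members are discarded.
    """
    groups = []
    current = []
    for hit in sorted(hits, key=lambda h: (h.contig, h.start)):
        if current:
            new_contig = hit.contig != current[-1].contig
            too_far = hit.start - max(h.end for h in current) > max_gap
            if new_contig or too_far:
                # Close the running group and start a new one.
                if len(current) >= min_hits:
                    groups.append(current)
                current = []
        current.append(hit)
    if len(current) >= min_hits:
        groups.append(current)
    return groups


# Made-up coordinates, for illustration only.
example_hits = [
    Hit("contig1", 1000, 2200, "KS"),
    Hit("contig1", 2500, 3400, "AT"),
    Hit("contig1", 3600, 4300, "KR"),
    Hit("contig1", 90000, 91000, "KS"),
    Hit("contig2", 500, 1500, "A"),
]
example_groups = group_hits_by_proximity(example_hits, max_gap=5000)
for group in example_groups:
    print(group[0].contig, group[0].start, group[-1].end,
          [h.query for h in group])
```

Grouping by gap distance only proposes candidate regions; examining the surrounding genes and deciding the true cluster boundaries remains the manual part.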
Putative clusters
DNA sequence(s)
High scores
Genome
Genome
Genome(s)
extract
Library of query genes and desirability scores
Gene
Score for each gene
Map across chromosome
Putative clusters
DNA sequence(s)
KS, AT, DH, ER, KR, ACP, Methyl transferase (MT), TE
C, A, T (PCP), Epimerization, C-cyclization, N-methyltransferase, TE
Prenyl transferase, Cytochrome P450, KAS III
Low scores
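One way to picture the "score for each gene, map across chromosome" idea is a sliding window over genes in chromosome order. The sketch below is an assumption-laden illustration, not the project's method: the function names, the window size, the threshold, and every desirability value are invented for the example.

```python
def window_scores(genes, desirability, window=5):
    """Sum desirability scores over each window of consecutive genes.

    genes: list of (gene_id, label) tuples in chromosome order; label is the
    best-matching library entry or None.
    Returns a list of (window_start_index, summed_score).
    """
    per_gene = [desirability.get(label, 0) for _, label in genes]
    n = len(per_gene)
    if n == 0:
        return []
    size = min(window, n)
    total = sum(per_gene[:size])
    scores = [(0, total)]
    for i in range(1, n - size + 1):
        # Slide the window: add the entering gene, drop the leaving one.
        total += per_gene[i + size - 1] - per_gene[i - 1]
        scores.append((i, total))
    return scores


def putative_clusters(genes, desirability, window=5, threshold=4):
    """Merge overlapping high-scoring windows into (first_gene, last_gene)."""
    size = min(window, len(genes))
    ranges = []
    for start, score in window_scores(genes, desirability, window):
        if score < threshold:
            continue
        end = start + size - 1
        if ranges and start <= ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return [(genes[a][0], genes[b][0]) for a, b in ranges]


# Hypothetical desirability scores and gene labels, for illustration only.
example_scores = {"KS": 3, "AT": 2, "ACP": 2, "KR": 1, "P450": 1}
example_genes = [
    ("g1", None), ("g2", None), ("g3", "KS"), ("g4", "AT"), ("g5", "ACP"),
    ("g6", None), ("g7", None), ("g8", None), ("g9", "P450"), ("g10", None),
]
print(putative_clusters(example_genes, example_scores, window=3, threshold=4))
```

Because whole windows are merged, the reported range includes flanking genes on either side of the high-scoring core, so the boundaries are only a starting point for closer inspection.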
Project #2: Synthetic Sequence Generator
Rationale: Synthesis of large DNA sequences is commercially feasible. There is a need for an organism-agnostic design tool.
Goal: Develop a tool that generates, from protein sequences, organism-specific DNA sequences suitable for DNA synthesis.
Personnel needs:
Programmer/scripter
Biologist with some knowledge of molecular biology and genetics
SCO7671 - Type III PKS
SCO7672 - prenyl transferase
SCO7670 - Isoprenylcysteine carboxyl methyltransferase
SCO7673 - lipoprotein
SCO7669 - oxidoreductase
SCO7668 - regulatory
SCO7674 - Cu-binding plastocyanin
SCO7667 - phosphohydrolase
SCO7675 - unknown membrane protein
S. coelicolor SCO7671 region
6901 bp
STRO2878 - UbiG benzoquinol methyltransferase
STRO2877 - UbiA (SCO7672)
STRO2879 - Acyl-CoA dehydrogenase
STRO2876
STRO2880 - Type III PKS
STRO2875 - DNA repair
STRO2881
STRO2874 - DNA-binding
STRO2882 - transcription factor
STRO2873
STRO2883
Cluster 12 - Type III region - pks4
11301 bp
Problem: I want to understand what the four-gene operon from each organism does biochemically. I need all four proteins active and in the same place. Plus, one organism is not available to me.
Old Way
PCR-amplify gene(s)
Purify PCR product(s)
Clone into expression plasmid(s)
Confirm accuracy of sequence(s)
Transform into expression host
Culture organism
Purify proteins
Evaluate activity in vitro
Dan’s Way
Type a DNA sequence
Email sequence to a company to synthesize it
Transform into expression host
Culture organism
Evaluate activity in vivo
Codon usage table of host organism
Protein sequence
Reverse translate
Codon-adjusted DNA sequence
Optional modifications
Re-engineered, optimized DNA sequence
Add expression tag(s)
Remove restriction sites
Remove or alter other undesirable motifs
Adjust for expressible multi-gene transcript
Engineered, optimized expressible DNA operon
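The first two boxes of this flow (reverse translation against a host codon usage table, then removal of a restriction site by synonymous substitution) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names are hypothetical, and the codon frequencies in `CODON_USAGE` are placeholder numbers, not measurements from any organism. A real tool would load the host's full codon usage table. The site removed in the example is the EcoRI recognition sequence, GAATTC.

```python
# Placeholder codon usage (relative frequency per amino acid).
# These values are invented for illustration, not real host data.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.7, "AAA": 0.3},
    "L": {"CTG": 0.6, "CTC": 0.3, "TTG": 0.1},
    "E": {"GAA": 0.6, "GAG": 0.4},
    "F": {"TTC": 0.9, "TTT": 0.1},
    "*": {"TGA": 0.7, "TAA": 0.3},
}


def reverse_translate(protein, usage):
    """Pick the most frequent codon in the table for every residue."""
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)


def count_sites(dna, site):
    """Count occurrences of site in dna, including overlapping ones."""
    return sum(dna.startswith(site, i) for i in range(len(dna) - len(site) + 1))


def remove_site(dna, protein, usage, site):
    """Remove every occurrence of site using synonymous codon swaps."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    while True:
        seq = "".join(codons)
        pos = seq.find(site)
        if pos == -1:
            return seq
        before = count_sites(seq, site)
        first, last = pos // 3, (pos + len(site) - 1) // 3
        for idx in range(first, last + 1):
            ranked = sorted(usage[protein[idx]].items(), key=lambda kv: -kv[1])
            for codon, _freq in ranked:
                if codon == codons[idx]:
                    continue
                trial = codons[:idx] + [codon] + codons[idx + 1:]
                # Accept a swap only if it reduces the number of sites.
                if count_sites("".join(trial), site) < before:
                    codons = trial
                    break
            else:
                continue  # no useful swap at this codon, try the next one
            break  # a swap was made, rescan the sequence
        else:
            raise ValueError(f"cannot remove {site} with synonymous codons")


example_protein = "MKLEF*"
example_dna = reverse_translate(example_protein, CODON_USAGE)
example_clean = remove_site(example_dna, example_protein, CODON_USAGE, "GAATTC")
print(example_dna)
print(example_clean)
```

Always taking the single most frequent codon is the simplest possible choice. Adding expression tags, screening other undesirable motifs, and arranging several genes into one expressible transcript would be further steps layered on top of this.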