Greedy Algorithms in the Libraries of Biology

17-Apr-2008 3:30-3:45 PM
Avogadro-Scale Computing MIT Bartos E15
Thanks to:
PGP
Is biology optimal?
Human           Past            Present
Locomotion      50 km/h         26720 km/h
Ocean depth     75 m            4500 m
Visible λ       0.4-0.7 μm      pm-Mm
Cold            0 °C            3 K
Memory          20 yr           2000 yr
3 Exponential technologies
1 to 18 month doubling times
[Figure: log-scale plot, x-axis 1840 to 2020, y-axis 1E-4 to 1E+14. Labels: Synthetic chemistry (Daltons synth; urea, B12, tRNA); Analytic (Seq bp/$; tRNA, human); Computation & Communication (Bits/sec; telegraph, Gb chips).]
Shendure J, Mitra R, Varma C, Church GM, 2004.
Carlson 2003; Kurzweil 2002; Moore 1965.
Avogadro scale, >>Yottaflops
(from CMOS to sea moss)
Ultra-parallel 10^38 units (lab libraries: 10^8 to 10^15 25mers)
Adaptable: Evolution (years), Immune (days), Neural (seconds)
Thermodynamic limit 2x10^19 op/J (irreversible)
3x10^20 for polymerase (10^10 for current computers)
Memory density:
Neural: (10^12 op/s & 10^6 bits)/mm^3,
DNA: (10^3 op/s & 1 bit)/nm^3
Error rate: DNA: 10^-9; RNA/protein: 10^-4
Adleman 1994
Biofuel: 4x10^7 J/kg (~=$)
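The two memory-density figures on this slide use different volume units (mm^3 for neural tissue, nm^3 for DNA), so comparing them directly needs a unit conversion. A minimal sketch, assuming the slide's flattened exponents read as 10^12 op/s and 10^6 bits per mm^3 (neural) and 10^3 op/s and 1 bit per nm^3 (DNA); the variable names are illustrative only:

```python
# Compare the slide's neural and DNA densities in the same unit (per mm^3).
# Assumption: the flattened exponents read as 10^12, 10^6 and 10^3.
NM3_PER_MM3 = 10**18  # 1 mm = 10^6 nm, so 1 mm^3 = (10^6)^3 nm^3

neural_ops_per_mm3 = 10**12   # op/s per mm^3
neural_bits_per_mm3 = 10**6   # bits per mm^3

dna_ops_per_nm3 = 10**3       # op/s per nm^3
dna_bits_per_nm3 = 1          # bits per nm^3

# Convert the DNA figures from per-nm^3 to per-mm^3.
dna_ops_per_mm3 = dna_ops_per_nm3 * NM3_PER_MM3
dna_bits_per_mm3 = dna_bits_per_nm3 * NM3_PER_MM3

ops_ratio = dna_ops_per_mm3 // neural_ops_per_mm3
bits_ratio = dna_bits_per_mm3 // neural_bits_per_mm3

print(f"DNA: {dna_ops_per_mm3:.0e} op/s and {dna_bits_per_mm3:.0e} bits per mm^3")
print(f"DNA/neural ratio: {ops_ratio:.0e} (op/s), {bits_ratio:.0e} (bits)")
```

Under that reading, the DNA figures exceed the neural ones by roughly nine orders of magnitude in operations and twelve in storage per unit volume.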
DNA error rates
DNA Replication Fork
1. Incorporation 5' to 3'
2. Proofreading exonuclease 3' to 5'
3. Mismatch repair
Ellis et al. PNAS 2001
Costantino & Court. PNAS 2003
Bionano – Inorganic-microfab interfaces
• Metal-oxide-semiconductors
(sponge silicateins for Ti & Ga oxides)
• Magnetic components
(magnetosomes in magnetotactic bacteria)
• Optical fibers & lenses
(e.g. venus basket sponge)
• Bacterial reduction of salts to metals
(e.g. Se, Au, Ag)
• Reading and writing DNA
Reading DNA: Open-source hardware, software, wetware; Polonator G007
~10 to $400/Gbp
1E-6 @ >3X redundancy
Synthetic Biology:
augmentation & combinatorics (not minimization)
1. Synthetic DNA: 1 Mbp per month (Codon Devices)
2. New polymers in vitro: affinity selection (Vanderbilt)
3. Hydrocarbon & other chemical syntheses in E. coli (LS9)
4. Bacterial & stem cell therapies (SynBERC & MGH)
5. New codes: viral-resistant cells & new amino acids (MIT)
6. Synthetic Ecosystems: evolve secretion & signaling
7. Interfaces of Genomics & Society
Hierarchical, modular, evolvable
DNA origami: highly predictable 3D nanostructures
Rothemund, Nature '06
Douglas, et al., PNAS '07: DNA-nanotube-induced alignment of membrane proteins for NMR structure determination
10 Mbp of DNA / $300 chip
Spatially patterned chemistry
8K    Atactic/Xeotron/Invitrogen   Photo-Generated Acid
12K   Combimatrix                  Electrolytic
44K   Agilent                      Ink-jet standard reagents
380K  Nimblegen/GA                 Photolabile 5' protection
Amplify pools of 50mers using flanking universal PCR primers & 3 paths to 10X error correction
Tian et al. Nature. 432:1050
Carr & Jacobson 2004 NAR
Smith & Modrich 1997 PNAS
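The headline figure on this slide (10 Mbp of DNA per $300 chip) and the per-platform feature counts lend themselves to a quick arithmetic check. A minimal sketch, assuming "8K", "12K", "44K" and "380K" mean thousands of features and that every feature carries one 50mer; the slide mentions pools of 50mers but does not state the oligo length per platform, so that part is an assumption:

```python
# Cost per base from the slide's headline figure: 10 Mbp of DNA per $300 chip.
CHIP_COST_USD = 300
CHIP_YIELD_BP = 10_000_000

cost_per_bp = CHIP_COST_USD / CHIP_YIELD_BP
print(f"${cost_per_bp:.0e} per bp")

# Feature counts per chip as listed on the slide (K read as thousands).
features = {
    "Atactic/Xeotron/Invitrogen": 8_000,
    "Combimatrix": 12_000,
    "Agilent": 44_000,
    "Nimblegen/GA": 380_000,
}

# Assumption (not stated per platform on the slide): one 50mer per feature.
OLIGO_LENGTH = 50
raw_bp = {name: n * OLIGO_LENGTH for name, n in features.items()}
for name, bp in raw_bp.items():
    print(f"{name}: {bp / 1e6:.1f} Mbp raw")
```

The raw totals ignore flanking primer sequence and any loss to error correction, so they are upper bounds on usable sequence rather than yields.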
Mirror world :
resistant to enzymes, parasites, predators
Mirror aptamers, ribozymes, etc. require mirror polymerases
352-amino-acid-long Dpo4 (Sulfolobus DNA polymerase IV)
347 peptide bonds done; 4 to go.
L-amino acids + D-nucleotides (current biosphere)
D-amino acids + L-nucleotides (mirror biopolymers)
Why synthesize (minimal) in vitro self-replication?
• Molecular Biology Central Dogma: DNA > RNA > Protein (PCR, T7 RNA pol, in vitro translation).
• Production of devices larger than or toxic to cells.
• Directed evolution of drugs & affinity agents.
• Mirror-image proteins
Duhee Bang (HMS)
Tony Forster (Vanderbilt)
Pure in vitro translating & replicating system, ideal for comprehensive atomic, ODE & stochastic models
Forster & Church, MSB '05; GenomeRes. '06
Shimizu, Ueda et al. '01
113 kbp DNA, 151 genes
Genome engineering CAD
[Figure: genome construction scale diagram. Labels: Chemical Synthesis (1E-2); Error Correction, MutS (1E-4); Polymerase in vitro; Recombination in vivo E. coli; 70b; 15Kb; 5Mb, Bacterial Artificial Chromosomes (BACs); Recombination in human cells; 250 Mb, Human Artificial Chromosomes (HACs); Sequencing (1E-7).]
Isaacs, Carr, Emig, Gong, Tian, Reppas, Jacobson, Church
Native DNA computing : Lab Evolution
About 3 serial additive changes per 30 days vs 2^30 exhaustive search
Reppas/Lin: Trp/Tyr exchange
Tolonen: Ethanol resistance
Lenski: Citrate utilization
Palsson: Glycerol utilization
Edwards: Radiation resistance
Ingram: Lactate production
Marliere: Thermotolerance
J&J: Diarylquinoline resistance (TB)
DuPont: 1,3-propanediol production
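The contrast drawn on this slide, a few serial additive changes versus a 2^30 exhaustive search, can be illustrated with a toy model. A minimal sketch, assuming a purely additive fitness landscape over 30 binary sites with made-up random effect sizes; the landscape, the function names, and the evaluation counts are illustrative assumptions, not data from the talk:

```python
import random

N_SITES = 30  # 2^30 genotypes if every combination were tested

def make_additive_landscape(n_sites, seed=0):
    """Hypothetical landscape: each site contributes an independent effect."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_sites)]

def fitness(genotype, effects):
    """Additive fitness: sum of the effects of the mutated sites."""
    return sum(e for g, e in zip(genotype, effects) if g)

def greedy_search(effects):
    """Fix the single best remaining mutation each round until none helps."""
    genotype = [0] * len(effects)
    evaluations = 0
    while True:
        current = fitness(genotype, effects)
        best_gain, best_site = 0.0, None
        for site in range(len(effects)):
            if genotype[site]:
                continue
            trial = genotype.copy()
            trial[site] = 1
            evaluations += 1
            gain = fitness(trial, effects) - current
            if gain > best_gain:
                best_gain, best_site = gain, site
        if best_site is None:
            return genotype, evaluations
        genotype[best_site] = 1

effects = make_additive_landscape(N_SITES)
best, n_eval = greedy_search(effects)
optimum = sum(e for e in effects if e > 0)  # known optimum in the additive case

print(f"greedy evaluations: {n_eval}")
print(f"exhaustive evaluations: {2 ** N_SITES}")
print(f"greedy reaches optimum: {abs(fitness(best, effects) - optimum) < 1e-9}")
```

The shortcut works here only because the effects are additive. On a landscape with strong interactions between sites, the same greedy walk can stall at a local optimum.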
rE.coli Strategy #3: ss-Oligonucleotide Repair
DNA Replication Fork
Ellis et al. PNAS 2001
Costantino & Court. PNAS 2003
Obtain 25% recombination efficiency in E. coli strains lacking mismatch repair genes (mutH, mutL, mutS, uvrD, dam)
Improved Recombination Frequency: 10^-4 -> 0.25 (> 3 log increase!)
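The "> 3 log" claim follows directly from the two recombination frequencies on this slide. A short check, reading the garbled first value as 10^-4 (an assumption about the lost superscript):

```python
import math

# Recombination frequencies from the slide: baseline vs. strains lacking mismatch repair.
baseline = 1e-4   # assumed reading of the flattened "10-4"
improved = 0.25

fold = improved / baseline
logs = math.log10(fold)
print(f"{fold:.0f}-fold, {logs:.2f} logs")
```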
Multiplex Automated Genome Engineering (MAGE)
[Figure: MAGE cycle schematic. Labels: Wash with water & DNA pool (50); Concentrate; O-ring; membrane; Concentrate, electroporate; Resuspend, bubble, select.]
Wang, Isaacs, Terry
GEMASS Prototype
H. Wang, Church Lab, Harvard, 2008
Recombination-Cycling for Combinatorial Accelerated Evolution
[Figure: two histograms of Frequency vs. # mutations/clone. Panels: "Mutation Distribution: 11 oligos, 15 cycles" and "Mutation Distribution: 54 oligos, 45 cycles".]
Oligo Pool   # cycles   Best Clone (98 %tile)   Fraction of mutated sites   Time*
11           15         7                       7/11                        3 days
54           45         23                      23/54                       9 days
* Continuous cycling
• Scaling & Automation
• Increase Efficiency of Recombination
Wang, Isaacs, Carr, Jacobson, Church
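The recombination-cycling table can be turned into rates and set against the "about 3 serial additive changes per 30 days" figure quoted earlier for lab evolution. A minimal sketch using only the numbers in the transcript; the comparison is rough, since the table reports the best clone (98th percentile) rather than an average:

```python
# Rows from the slide's table: (oligos in pool, cycles, mutations in best clone, days).
runs = [
    (11, 15, 7, 3),
    (54, 45, 23, 9),
]

# "About 3 serial additive changes per 30 days" from the lab-evolution slide.
LAB_EVOLUTION_PER_DAY = 3 / 30

results = []
for oligos, cycles, mutations, days in runs:
    fraction = mutations / oligos        # fraction of targeted sites mutated
    per_day = mutations / days           # mutations accumulated per day
    per_cycle = mutations / cycles       # mutations accumulated per cycle
    speedup = per_day / LAB_EVOLUTION_PER_DAY
    results.append((fraction, per_day, per_cycle, speedup))
    print(f"{oligos} oligos: {fraction:.2f} of sites, {per_day:.2f}/day, "
          f"{per_cycle:.2f}/cycle, ~{speedup:.0f}x the lab-evolution rate")
```

Both runs come out at roughly 2.3 to 2.6 mutations per day in the best clone, a little over twenty times the quoted lab-evolution pace.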
Multiplex Automated Genome Engineering (MAGE)
[Figure: MAGE device. Labeled parts: syringe pump; computer communication / data acquisition system; electrically actuated valves; OD sensor; electroporation cuvette w/ membrane filter.]
Wang, Isaacs, Terry
Fab vs. Bio-fab
Fab:
+ Plays well with digital computers
- Doesn't get DNA
- Needs us to replicate
- Needs expensive Fab (e.g. ICs)
- Intelligent Design
Bio-fab:
- No habla C++
+ DNA is its native digital media
+ We need them
+ Simple or complex inputs
+ Evolution
Cross-feeding symbiotic systems:
aphids & Buchnera
• obligate mutualism
• nutritional interactions: amino acids & vitamins
• established 200-250 million years ago
• close relative of E. coli with tiny genome (618~641kb)
MILKFTWV
MILKFTWV HR
Aphids
http://buchnera.gsc.riken.go.jp
Pink = enzymes apparently missing in Buchnera
Shigenobu et al. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81-86 (2000).
Synthetic genome pair evolution
[Figure: co-growth panels labeled First Passage and Second Passage.]
trp/tyrA pair of genomes shows best cogrowth
Reppas, Lin et al.; Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome, 2005 Science
Co-evolution of mutual biosensors/biosynthesis
sequenced across time & within each time-point
Independent lines of Trp & Tyr co-culture
5 OmpF: (pore: large, hydrophilic > small)
42 R->G,L,C; 113 D->V; 117 E->A
2 Promoter: (cis-regulator)
-12 A->C, -35 C->A
5 Lrp: (trans-regulator)
1b, 9b, 8b, IS2 insert, R->L in DBD.
Heterogeneity within each time-point.
At late times Tyr- becomes prototroph!
Reppas, Shendure, Porreca
[Figure: sequence positions -12 to -6.]
Reducing costs of open-source
hardware & wetware
Factor
• 30 Equipment speed: from 1 up to 30 Mpixels/sec camera
• 4 Equipment cost: from $500K down to $150K (Danaher Inc)
• 36 Parallelism: 36 flow-cells per camera, 2 billion beads
-----------------
• 75 Flow cell volume: 1.5 mm down to 0.0085 mm thin
• 40 Kit costs: $2000 down to $50 at standard enzyme costs
• 10 Enzymes: $4000/mg down to <$400 (Enzymatics Inc)
• 50 Genomic subset (Exome – 1% genome)
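The listed factors can be checked against the before and after values the slide gives, and multiplied to get an overall figure. A minimal sketch, assuming the factors are independent and combine multiplicatively (the slide does not say so); rows without an explicit before/after pair are left unchecked:

```python
from math import prod

# (listed factor, description, from-value, to-value); None where the slide
# gives no explicit before/after pair.
rows = [
    (30, "Equipment speed (Mpixels/sec)", 1, 30),
    (4,  "Equipment cost ($K)", 500, 150),
    (36, "Parallelism (flow cells per camera)", None, None),
    (75, "Flow cell thickness (mm)", 1.5, 0.0085),
    (40, "Kit costs ($)", 2000, 50),
    (10, "Enzymes ($/mg)", 4000, 400),
    (50, "Genomic subset (exome)", None, None),
]

def implied_ratio(before, after):
    """Larger value over smaller, so the ratio reads as an improvement factor."""
    return max(before, after) / min(before, after)

for listed, label, before, after in rows:
    if before is None:
        print(f"{label}: listed {listed}x")
    else:
        print(f"{label}: listed {listed}x, endpoints imply "
              f"{implied_ratio(before, after):.1f}x")

# Assumption: the listed factors are independent and multiply.
combined = prod(listed for listed, *_ in rows)
print(f"combined listed factor: {combined:.2e}")
```

Where the ratio implied by the endpoints differs from the listed factor, the transcript does not explain the gap, and this sketch does not try to resolve it.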