2 - self-assembly wiki

Download Report

Transcript 2 - self-assembly wiki

Notes on randomized selfassembly
Days 34, 35, 36 and 37 of Comp
Sci 480
Randomized self-assembly
• Also known as “tile concentration programming”
assembly model
– Or just “concentration programming”
– First proposed by Becker, Remila, Rapaport in ~2006
• Model refined by Kao, Schweller (2008), Doty (2009)
• Tiles are assigned relative concentrations
– In one “mole” of solution, tile t0 makes up 50% of the tiles, t1
25%, t2 15% and t3 10%
• Models the situation where two different tile types are
“competing” for the chance to bind at a particular location
– The higher the concentration of a particular tile, the more likely it
is to bind at that location
• Can be used to study “running time” in self-assembly
– We won’t do this
The basic idea
1
Which tile is more likely to bump into the
assembly and bind at this position?
A
t0 2
B
X
X
A
t1 Y
A
t1 Y
B
B
X
A
X
t1 Y
1
B
X
A
A
t1 Y
A
t0 2
B
t1 Y
B
B
X
A
?
X
A
B
t1 Y
B
A
X
A
t1 Y
B
t1 Y
B
X
1
A
t0 2
A
t1 Y
B
B
Tile type t1 is more likely to bind,
because it has a higher concentration.
Non-determinism
• Concentration programming always
involves non-determinism
– Otherwise, what’s the point of the
concentrations of tiles?
• Not interested in tile sets that uniquely
produce an assembly
• Very interested in tile sets that uniquely
produce a shape!
Rules of the model
• The rules are the same as they are for the aTAM
– Tiles bind one at a time to a growing assembly
– Must bind with strength at least a certain temperature value
– Etc.
• Assign to each tile t ϵ T a probability, p(t), such that
Σt ϵ T p(t) = 1
• Things get interesting when there is more than one
location at which a tile may be placed
– Even more interesting if, at one of these locations, more than
one tile type may attach
• How to determine which tile binds next to the growing
assembly A?
– Choose a location
– Choose a tile to go at that location…
Step 1: choose a location
• First, choose a location (there could be many) at which a
tile may bind in A
• Let ∂(A) be the set of all locations in A that can receive a
tile
• Let F(x,y) = “the sum of all the probabilities of all the tiles
that can bind to A at location (x,y)”
– F(x,y) = Σ{ t ϵ T | t can bind at A(x,y) } p(t)
• Let F = “the sum of all the probabilities of all the tiles that
can bind at any location in A”
– F = Σ(x,y) ϵ ∂(A) F(x,y)
• Let P(x,y) = F(x,y) / F
– P(x,y) is known as the “position selection probability”
Step 2: choose a tile
• Once we have a location, we need to
choose a tile that can be placed
• For this, define P(t | (x,y)) = p(t) / F(x,y)
– Tile selection probability
– It’s the conditional probability that t attaches
at location (x,y) in A
• Assuming (x,y) can receive a tile at all!
Put it all together
• Multiply the probabilities to see which tile
goes next
– I.e., to see the probability of a particular tile
being placed at a particular location
• P(x,y) * P(t | (x,y))
– This is the probability that t is placed at
location (x,y) in the next assembly step
– Can be simplified to: p(t) / F
• Time for an example…
A
1
1
B
2
3
2
3
S
4
4
C
4
D
5
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
5
E
Terminal assemblies
A
1
1
B
2
3
2
3
S
4
4
C
Two terminal assemblies…
4
D
5
5
…which one is more likely to selfassemble?
E
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
A
B
A
2
3
2
2
3
2
C
S
S
1
4
1
4
1
4
4
D
5
5
E
2
S
A
1
1
B
2
3
2
3
S
4
4
C
4
D
5
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
5
E
4
2
S
4
A
1
(0,1)
A
1
1
B
2
3
2
3
S
4
4
C
4
D
2
2
S
5
5
4
4
3
2
C
S
2
4
S
4
4
D
5
E
P(0,1)*p(A | (0,1)) = (F(0,1) / F)*(p(A) / F(0,1))
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
= .05 / (.05 + .6 + .2) * (.05 / .05)
= .05 / .85 * 1
= .059 = 5.9%
2
S
4
5.9%
A
1
1
2
B
A
3
2
(1,0)
2
S
2
3
4
4
C
4
D
S
5
5
4
4
1
3
2
C
S
2
4
S
4
4
D
5
E
P(1,0)*p(C | (1,0)) = (F(1,0) / F)*(p(C) / F(1,0))
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
= (.6 + .2) / (.05 + .6 + .2) * (.6 / (.6 + .2))
= .8 / .85 * .6 / .8
= .6 / .85 = .706
2
S
4
5.9%
A
1
1
2
70.6%
3
2
S
B
4
C
4
D
1
2
2
3
4
A
S
5
5
4
4
3
2
C
S
(1,0)
2
4
S
4
4
D
5
E
P(1,0)*p(D | (1,0)) = (F(1,0) / F)*(p(D) / F(1,0))
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
= (.6 + .2) / (.05 + .6 + .2) * (.2 / (.6 + .2))
= .8 / .85 * .2 / .8
= .2 / .85 = 23.5%
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
C
4
D
70.6%
A
23.5%
1
2
2
S
5
5
4
4
3
2
C
S
2
4
S
4
4
D
5
E
P(1,0)*p(D | (1,0)) = (F(1,0) / F)*(p(D) / F(1,0))
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
= (.6 + .2) / (.05 + .6 + .2) * (.2 / (.6 + .2))
= .8 / .85 * .2 / .8
= .2 / .85 = 23.5%
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
70.6%
A
23.5%
1
2
2
S
C
4
4
3
2
C
S
2
4
S
4
A
1
4
D
5
4
D
5
100%
4
D
5
5
E
A
??%
1
2
p(A) = 5%
2
2
S
4
4
3
2
C
S
4
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
P(1,0)*p(C | (1,0)) = (F(1,0) / F)*(p(C) / F(1,0))
= (.6 + .2) / (.6 + .2) * (.6 / (.6 + .2))
= 1 * .6 / .8
= .75 = 75%
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
70.6%
A
23.5%
1
2
2
S
C
4
4
3
2
C
S
2
4
S
4
A
1
4
D
5
4
D
5
100%
4
D
5
5
??%
E
A
75%
1
2
p(A) = 5%
2
2
S
4
4
3
2
C
S
4
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
P(1,0)*p(D | (1,0)) = (F(1,0) / F)*(p(D) / F(1,0))
= (.6 + .2) / (.6 + .2) * (.2 / (.6 + .2))
= 1 * .2 / .8
= .25 = 25%
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
70.6%
A
23.5%
1
2
2
S
C
4
4
3
2
C
S
2
4
S
4
A
1
4
D
5
4
D
5
100%
4
D
5
5
25%
E
A
75%
1
2
p(A) = 5%
p(B) = 5%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
2
2
S
4
4
3
2
C
S
4
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
70.6%
A
23.5%
1
2
2
S
C
4
4
3
2
C
S
2
4
S
4
A
1
4
D
5
4
D
5
100%
4
D
5
5
25%
E
A
75%
1
2
p(A) = 5%
2
2
S
4
4
3
2
C
S
4
p(B) = 5%
100%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
A
100%
B
A
2
3
2
2
3
2
C
S
S
1
4
1
4
1
4
4
D
5
5
E
Probability of producibility
• For a producible assembly A, look at every
path, say πA, from the seed to A
• Compute the product of all the edges
along πA
• Add up all the products
• The final number is the probability that A
will be produced
• Back to our example…
2
S
4
5.9%
A
1
1
B
2
3
2
3
S
4
4
70.6%
A
23.5%
1
2
2
S
C
4
4
3
2
C
S
2
S
4
4
4
D
5
??%
100%
4
D
5
5
??%
25%
E
A
75%
1
2
p(A) = 5%
1
2
2
S
A
4
4
3
2
C
S
4
4
D
5
p(B) = 5%
100%
p(C) = 60%
p(S) = 5%
p(D) = 20%
p(E) = 5%
A
1
1
B
100%
.706 * 1 * 1
+
.059 * .75 * 1
=75%
A
2
3
2
3
2
C
S
S
4
4
??%
1
2
4
4
D
5
5
E
A simple case
• Things aren’t always that complicated
• If your tile set has the property that, at any
step, there is only one location, then all
you have to consider is the tile selection
probabilities
• Example: building a 1xN line
Two goals
• Working in the tile concentration programming
model, we will accomplish the following:
• Design a tile set with size O(1) that can build any
1xN line
– N is specified by the concentrations of the tiles
– Not possible in neither the aTAM nor the temperature
programming model
• Design a tile set with size O(1) that can build any
NxN square
– N is specified by the concentrations of the tiles
– Also possible with temperature programming, but
NOT in the aTAM
Lines
• Recall tile complexities for 1xN lines:
– aTAM: N
– Temperature programming: N
• Tile concentration programming: There is a tile
set of size O(1) that can build any 1xN line
– N specified by the relative concentrations of the tile
types
• Basic idea: start from a seed and always have
two tile types that can bind:
– Keep going
– Stop growing
• How many tile types?
–3
The tile set
This tile set, say T, produces infinitely many terminal
assemblies
S
?
?
A
?
May never stop growing (with extremely low probability)
?
B
What is the “expected” (a.k.a., average) length of a finite
terminal assembly?
p(S) = .01
p(A) = (1 - 1/(N - 1))(1 - .01)
p(B) = (1/(N - 1))(1 - .01)
Note: p(S) + p(A) + p(B)
= .01 + (1 - 1/(N - 1))(1 - .01) + (1/(N - 1))(1 - .01)
= .01 + (1 - .01)(1 - 1/(N - 1) + 1/(N - 1))
= .01 + (1 - .01)(1) = .01 + .99 = 1.0
Expected length
• What is the average length of any finite
assembly produced by T?
• Let X represent the length of assembly produced
by T
– X is a “random variable”, i.e., its value depends on a
random process (either stop, or keep going)
• We want to compute the expectation of X,
denoted as E[X]
– The average value that X takes on
Expectation
• To compute the expected value of a
random variable, we need to compute this
quantity:
E[X] = ΣkPr(X = k)*k
• Note: k ranges over all values that X can
take on
• Each term in the sum: “The probability that
X equals k, times k”
• In our lines example, k ranges from 2 to
infinity!
Expectation computation
• First and foremost, what is Pr(X = k)?
– The probability that the line length is k
• What has to happen in order to get a line
of length exactly k?
• This means we need k-2 “keep going” tiles
to attach followed by a “stop” tile
• What is the probability of getting this
assembly?
S
?
(1 - 1/(N - 1))(1 - .01) / ((1 - 1/(N - 1))(1 - .01) + (1/(N - 1))(1 - .01)) = 1 - 1/(N – 1)
S
?
?
A
?
(1 - 1/(N - 1))(1 - .01) / ((1 - 1/(N - 1))(1 - .01) + (1/(N - 1))(1 - .01)) = 1 - 1/(N – 1)
k-2
S
?
?
A
?
?
A
?
S
?
?
A
?
?
A
?
?
A
?
(1/(N - 1))(1 - .01) / ((1 - 1/(N - 1))(1 - .01) + (1/(N - 1))(1 - .01)) = 1/(N - 1)
S
?
?
A
?
?
A
?
Pr(X = k) = (1 - 1/(N - 1))k-2(1/(N - 1))
?
A
?
?
B
Expectation summation
E[X]
the good ol’ fashion way…
= Σk≥2 k*Pr(X = k)
= Σk≥2 k*(1 - 1/(N - 1))k-2(1/(N - 1))
; definition of Pr(X = k)
k-2
= Σk≥2 k*(1 - 1/(N - 1)) (1/(N - 1))
; algebra
= Σk≥2 k*(1 - 1/(N - 1))k-2(1 - (1 - 1/(N - 1)))
; algebra
= Σk≥2 {k*(1 - 1/(N - 1))k-2 - k(1 - 1/(N - 1))(1 - 1/(N - 1))k-2}
; algebra
= Σk≥2 {k*(1 - 1/(N - 1))k-2 - k(1 - 1/(N - 1))k-1}
; algebra
k
= 1 + Σk≥0 (1 - 1/(N - 1))
; tricky: consecutive terms cancel
= 1 + (1 / (1 - (1 - 1 / (N - 1)))) ; geometric series formula: Σk≥0 xk = 1 / (1 - x)
= 1 + (1 / (1 / (N - 1)))
= 1 + (N - 1)
=N
Hurray!!
Easier way: Google “geometric distribution” and note that formula for E[X], when X is a
geometric random variable… 
Can we do better?
• Can we do better than 3 tile types?
• Could we build 1xN (expected) lines with 2
tile types in the tile concentration model?
• NO!
• How would you “cap” growth at both ends?
– Need two “cap” tiles
• One would have zero strength on the left
• The other would have zero strength on the right
Squares
• How to build NxN squares with O(1) tile
types?
• Modify the line tile set…
Modified (and extended) tile set
S
0
?
A
?
B
?
Modified (and extended) tile set
#
L
S
L
L
#
#
0
0
0
?
#
?
A
?
B
#
?
R
R
R
Example
L
L
L
L
L
L
L
L
L
L
L
L
L
#
#
#
L
L
L
L
L
L
L
L
L
L
L
#
#
#
#
R
R
R
R
#
L
L
L
L
L
L
L
L
L
#
#
#
#
R
R
R
R
R
#
L
L
L
L
L
L
L
#
#
#
#
R
R
R
R
R
R
R
R
R
R
R
R
R
R
?
?
?
?
?
?
#
L
L
L
L
L
#
#
#
#
#
L
S
L
L
#
#
#
#
R
R
0
0
0
?
?
A
?
?
A
A
A
B
Relative concentrations
#
L
S
L
L
#
#
0
0
0
?
#
?
A
#
R
R
R
p(“all tile types that are not A and B”) = .001
?
c = Σt≠A,B p(t)
?
B
p(A) = (1 - c)(1 - 1/(N - 1))
p(B) = (1 - c)(1/(N - 1))
Let X = dimension of produced square
E[X] = N (same as the analysis for lines)
Summary / discussion
• Randomized self-assembly
– A.k.a., tile concentration programming
• Lines: O(1) tile types
• Squares: O(1) tile types
• Correct expected size, but the sizes can vary wildly
– Probability of getting a line/square that’s too big/small cannot be
controlled in our constructions
• Other results:
– Kao and Schweller (2008): squares with O(1) tile types but more
precise control over dimension of the produced square
– Doty (2009): squares with O(1) tile types, but extremely precise
control over dimension of the produced square
• Can make the probability of getting a square even just a tiny bit
bigger/smaller than desired arbitrarily small…
Wild variation!
• The average size of our produced square is NxN
• Probability of getting an NxN square for N = 100 with our
simple construction…
– (1 - 1/100)98(1/100) ≈ .00373 = .3%!!
• There’s no way to make this any better without making serious
modifications
– Getting a 1000x1000 square will basically never happen
• Doty (2009):
– Constant tile set
– Pick a really small (real) number, say ε = .00001
• Encode ε in the concentrations of the tile types
– Doty’s tile set self-assembles into an NxN square (exactly!) with
probability at least 1 - ε = .99999
Previous example
#
L
S
L
L
#
#
0
0
0
?
#
?
A
#
R
R
R
p(“all tile types that are not A and B”) = .001
?
c = Σt≠A,B p(t)
?
B
p(A) = (1 - c)(1 - 1/(N - 1))
p(B) = (1 - c)(1/(N - 1))
Can we do better than 8 tile types?
Let X = dimension of produced square
E[X] = N (mimic the analysis for lines)
One last example
• Let’s do one more example with randomized
self-assembly
• Let’s build a square in a weird kind of way
• Randomly build the bottom row and then
randomly build the height
• Basic idea:
– build the bottom row
– build the right column
• Only after bottom row fully completes
– fill in the middle with fillers
• How do we do this?
ONE way to do it…
Hi
Hi Hi
Hi
B
#
Hi
#
What are the concentrations?
C
p(“all tile types that are neither B nor C”) = .01
#
#
S
#
#
A
Lo
Lo
Lo Lo
c = p(A) + p(Lo) + p(Hi) + p(S)
Lo
p(B) = (1/2)(1 - c)(1 - 1/(N - 1))
p(C) = (1/2)(1 - c)(1/(N - 1))
Hi
Hi Hi
Hi
Hi Hi
Hi
Hi Hi
Hi
C
What is the average size of the produced
assembly?
#
#
Hi
Hi Hi
Hi
Hi Hi
Hi
B
#
#
#
A
Lo
Lo
#
Hi
Hi Hi
Hi
B
#
#
#
A
Lo
Lo
Lo Lo
Lo Lo
Lo
Lo Lo
Lo
#
S
#
#
A
Lo
Lo
Lo
Answer = N
ANOTHER The tile set
What are the concentrations?
p(“all tile types that are not A, A', B and B'”) = .01
B'
?
c = p(F) + p(S)
?
F
A'
?
p(A) = (1/2)(1 - c)(1 - 1/(N - 1))
p(B) = (1/2)(1 - c)(1/(N - 1))
?
S
?
?
A
?
?
B
p(A') = (1/2)(1 - c)(1 - 1/(N - 1))
p(B') = (1/2)(1 - c)(1/(N - 1))
What is the average size of the produced
assembly?
Expected squares
•
•
•
•
Let X = size of the produced assembly
E[X] = Σw,h≥2 (w*h)*Pr(X = w*h)
What is Pr(X = w*h)?
Seed, w-2 A tiles, 1 B tile, h-2 A' tiles, 1 B' tile, fillers take
care of the rest
– Pr(X = w*h) = (1-1/N)w-2(1/N)(1-1/N)h-2(1/N) = (1 - 1/N)w+h-41/N2
• E[X] = Σw,h≥2 (w*h)*Pr(X = w*h)
= Σw,h≥2 (w*h)(1-1/N)w+h-41/N2
= ???
• There’s an easier way…
Independent events
• Let X and Y be two events (things we want to have
happen)
• X and Y are said to be independent if P(A and B) =
P(A)*P(B)
• Example:
– A = probability of rolling a dice one time and getting a 6
– B = probability of rolling two numbers that sum to 8
• A and B are NOT independent events
• P(A) = 1/6
• P(B) = “2,6” + “3,5” + “4,4” + “5,3” + “6,2” / 6*6
= 5/36
• P(A and B) = 1/36
• P(A)*P(B) = 1/6 * 5/36 < 1/6 * 6/36 = 1/36
From last time: the tile set
What are the concentrations?
p(“all tile types that are not A, A', B and B'”) = .01
B'
?
c = p(F) + p(S)
?
F
A'
?
p(A) = (1/2)(1 - c)(1 - 1/(N - 1))
p(B) = (1/2)(1 - c)(1/(N - 1))
?
S
?
?
A
?
?
B
p(A') = (1/2)(1 - c)(1 - 1/(N - 1))
p(B') = (1/2)(1 - c)(1/(N - 1))
What is the average size of the produced
assembly?
Expectation and independence
• For random variables X, Y, we say that X and Y are independent if
all the events related to X and Y are independent events
– Not a formal definition
– Formal definition is more complicated than this
• Example: X is the length of the line
•
•
•
•
– Related event: X = 4
If X and Y are independent random variables, then E[X*Y] =
E[X]*E[Y]
Let X = width of produced assembly
Let Y = height of produced assembly
Are X and Y independent?
– Yes!
– Can get any height regardless of the height, and vice versa
Expected size of assembly
• Let Z = X*Y = size of produced assembly
– If E[Z] = N2, then we’re good!
• E[Z] = E[X*Y] = E[X]*E[Y]
• E[X] = E[Y]
• In our previous example: E[X] = N
– Now what is it?
Expectation computation
E[X]
= Σk≥2 k*Pr(X = k)
= Σk≥2 k*(1 - 1/(N - 1))k-2(1/(N - 1))
= Σk≥2 k*(1 - 1/(N - 1))k-2(1 - (1 - 1/(N - 1)))
= Σk≥2 {k*(1 - 1/(N - 1))k-2 - k(1 - 1/(N - 1))(1 - 1/(N - 1))k-2}
= Σk≥2 {k*(1 - 1/(N - 1))k-2 - k(1 - 1/(N - 1))k-1}
= 1 + Σk≥0 (1 - 1/(N - 1))k
= 1 + (1 / (1 - (1 - 1 / (N - 1))))
= 1 + (1 / (1 / (N - 1)))
= 1 + (N - 1)
=N
Expected size of the produced assembly = E[Z] = E[X*Y]
= E[X]*E[Y]
= E[X]2
= N2
Exactly what we wanted!
Can we do even better?
B'
?
?
F
A'
?
?
S
?
?
A
?
?
B
Can we build (expected) NxN squares with 5 tile
types?
Yes?
B'
?
?
A'
?
?
S
?
?
A
Can we build (expected) NxN squares with 5 tile
types?
?
?
?
B
What are the concentrations?
Randomized comb?
B'
?
?
A'
?
?
S
?
?
A
Can we build (expected) NxN squares with 5 tile
types?
?
?
?
B
What are the concentrations?
p(S) = .001
p(A) = (1/2)(1-p(S))(1-1/(N-1))
p(B) = (1/2)(1-p(S)(1/(N-1))
p(A') = (1/2)(1-p(S))(1-1/(N-2))
p(B') = (1/2)(1-p(S))(1/(N-2))
What is the expected size of the produced
assembly?
… Hmmmmmm