Transcript Oblivious 2

A Brief History Of
History Independent Data Structures
Maverick Woo
2004-09-22
Credit: South China Morning Post
2
Oblivious Data Structures
3
[Mic97]
Daniele Micciancio
Oblivious data structures:
Applications to cryptography
- STOC 1997
An attempt to solve the privacy problem in
incremental cryptography
4
[BGG95]
M. Bellare, O. Goldreich, S. Goldwasser
Incremental Cryptography
and Application to Virus Protection
- STOC 1995
5
Why Incremental?
Most cryptographic primitives act on the document as a whole.
MD5 Signature
T
Th
Thi
------------------
------------------
------------------
This is a survey on the
area of history independent data structures. I will show you
how this area developed
in the last decade [. . .]
------------------
B9ECE18C950AFBFA
6B0FDBFA4FF731D3
EEEB9A8EB45DD351
D9EC0EB4ACCE66CE
A4704FD35F030828
7F2937BA3ECCF5FE
B97799DE817E55BC
C3ADE4370246EB0D
…
Re-compute from scratch is wasteful
Ideally, the running time should be O(f(|change|)).
6
The Free Food Cam
At 15 frame per second, 1 M pixel per frame, 3 byte per pixel
http://zimbs4.srv.cs.cmu.edu/coke/ffc.html
(Google for the phrase “free food cam”)
7
2004-09-21 13:15:55
8
2004-09-21 13:46:24
How do you know
these pictures
are authentic?
9
Tree Signatures [BGG95]
Represent the document as a 2-3 Tree whose leaves
contain constant-size block of characters
Non-incrementally sign
each leaf and
each internal node
CS
A,BS
A S BS C S
DS
DS
ES
Each change only takes O(log n) signature updates
10
Privacy of Tree Signatures
An Apparent Paradox
The very nature of digital signature )
there can be no secret about the document
and its signature…
How can privacy be an issue of when your original intent is
to publish the document and its signature?
13
Metadata
Oct 2000: The Wall Street Journal reports that a candidate running for the U.S. Senate
began receiving anonymous emails containing messages written in MS Word criticizing and
attacking the candidate. A savvy aide looked at the document properties and discovered they
were authored by the chief-of-staff of the opposing party.
“a fine document”
Feb 2003: A dossier on Iraq’s security and intelligence organizations, cited by Colin Powell and
published by 10 Downing Street, is discovered to have been plagiarized from a U.S. researcher
on Iraq. Since the dossier was published on their website in MS Word format, researchers also
Is Microsoft
Wordthe document. They were
discovered the four people in the British government
who edited
subsequently called to Parliament for a hearing.
really so evil?
Mar 2004: SCO Group, seller of UNIX and Linux, sent out a warning letter to 1,500 of the
world’s largest companies threatening legal liability for using Linux if they failed to obtain a
license from the Utah-based company. After filing suit against Daimler-Chrysler, metadata in a
MS Word document revealed that the SCO’s attorneys had originally identified Bank of
America as the defendant.
From http://www.abanet.org/tech/ltrc/publications/metadata.html
14
Metadata in Tree Signatures
Represent the document as a 2-3 Tree whose leaves
contain constant-size block of characters
Non-incrementally sign
each leaf and
each internal node
Where is the metadata?
CS
A,BS
DS
A S BS C S
DS
ES
How can a
2-3 tree leak
information?
16
The Wedding Guest List
As you may know, some of us are getting married soon!
As hard-core computer scientists, of course you will
maintain the guest list in sorted order
(using an object-oriented multimedia relational XML database)
Congrads!
C
A,B
A B C
D
D
E
17
Manual Sorting Is Hard
Blelloch
Blandford
18
Back To The Guest List
And as the guest list gets compiled, there will always be
someone who gets added to the list LAST…
“You added me after you have added {Foo}?”
C
A,B
A B C
D
D
E
19
Between B and D is C…ontention
Initial
Initial
B
A
A
C
C
B
C
A
E
A
Insert(“D”)
D
C
C
A
C,D
B
E
Insert(“B”)
B
A
D
C
D
A,B
E
A B C
D
D
E
20
[NT01]
Moni Naor, Vanessa Teague
Anti-persistence:
History Independent Data Structures
- STOC 2001
21
[Mic97]
Daniele Micciancio
Oblivious data structures:
Applications to cryptography
- STOC 1997
An attempt to solve the privacy problem in
incremental cryptography
22
Oblivious Data Structures
Informal definition
A data structure is said to be oblivious if it does not
give out any knowledge about the sequence of
update operations that have been applied to it
other than the final result of the operations.
23
Formal Definition [Mic97]
Let M be a set of operations,
and S be a set of algorithms implementing them.
We say S is oblivious if:
for any two sequences of operations p1, p2, …, pn
and q1, q2, …, qm that lead to the same set of values,
the execution of these sequences have identical
output probability distributions.
24
Oblivious 2-3 Tree
Issue: How do 2-3 trees “leak”?
Degrees of nodes give out too much information
Solution: Randomize the degrees!
Degrees should split uniformly between 2 and 3
25
Results in [Mic97]
Oblivious 2-3 Trees
Insertion and Deletion in expected O(log n) time whp
Search in worst-case O(log n) time
Incremental Signature Scheme
Security: Tamper Proof, as defined in [BGG95]
Privacy: Private (hmm… we will see :P)
Expected O(log n) signature updates per change whp
26
For More Information
27
History Independence
28
Are We Done Yet?
In an oblivious 2-3 tree, the degrees
indeed don’t tell you anything about
the update sequence.
What else can leak information?
C
A,B
A B C
D
D
E
29
Interface Matters
Dictionary ADT
Insert(key)
Delete(key)
Search(key)
Principle
When privacy is concerned, if some
C piece of
information cannot be retrieved via the
A,B
D
legitimate interface of a system, then it
should not be retrievable
A BevenCwith full
D access
E
to the system.
Does this allow you to ask
“is k1 is inserted some time before k2?”
30
Full Access
00000200:
00000210:
00000220:
00000230:
00000240:
00000250:
00000260:
00000270:
00000280:
00000290:
000002a0:
000002b0:
000002c0:
000002d0:
000002e0:
000002f0:
00000300:
00000310:
00000320:
00000330:
006e
c81b
0a00
3908
5973
0000
0000
00f4
003c
4441
921e
4b32
cf9f
1190
054c
24c3
f10a
f90d
6292
83b3
1ef0
44b0
0000
0600
0000
0467
2063
2400
8b00
5478
0c07
3c97
9f07
6664
18ae
379e
a8e1
205f
9434
ab3e
6335
c62c
0d49
0001
0b13
414d
4852
0084
001b
da62
9e3f
00d2
d23c
90e4
020d
df0c
1850
15c8
04db
c32f
0000
7e3e
4844
a34c
0000
4100
4d00
cb00
58ca
fc5f
6778
7640
40fa
5365
f80b
5fc4
1d0b
7f0a
c006
0156
563d
ff89
5200
e24f
0b13
00b1
007a
006d
10f6
3ff9
2629
faf5
1190
0008
3448
2419
90be
a4ed
5470
8648
0673
504e
0000
0000
0100
9e61
3000
5f00
b800
fff3
c970
3506
5600
2096
dfff
76ff
cb20
80f4
1e68
231d
77a2
470d
9600
0009
9a9c
4c41
0080
00e8
0034
e72a
1848
49c9
d2df
e7cf
3783
bacf
0d54
0d06
201b
060e
f4fc
0a1a
0000
7048
1800
f700
9800
6c00
c849
0ce2
4bda
7f0c
81b4
ff01
1e8f
2021
f7f4
8000
372b
8ec7
.n..c5..V=.sw...
..D..,~>..PNG...
.....IHDR.......
9......L.O....pH
Ys..............
...gAMA....aLA..
.. cHRM..z0.....
..#......m_...l.
.<....X......4.I
DATx.b._?....*..
.....?gx&).p.HK.
K2<[email protected]...
.....<@...V.....
..fd..Se.. .....
.L......4H..7...
#.7..._.#.v... !
.....P..... .T..
.. _............
b..4....Tp.h .7+
...>./.V.H#.....
31
[NT01]
Moni Naor, Vanessa Teague
Anti-persistence:
History Independent Data Structures
- STOC 2001
32
Memory Representation
Most memory allocators have a tendency to
allocate newer objects at higher addresses
Shape
Memory
C
A,B
A B C
D
D
E
A
B
A,B
D
C
D
E
C
33
History Independence [NT01]
Definition
A data structure implementation is history independent if
any two sequences S1 and S2 that yield the same content
induce the same distribution on the memory representation.
Similar to Micciancio’s definition,
except this is specified on memory representation
34
Single vs. Multiple Observations
Thief
Hacker
35
Strong History Independence [NT01]
Definition
Let P1 = {i11, i12, … , i1l} and P2 = {i21, i22, … , i2l}
be two lists of points such that for all b 2 {1, 2} and 1 · j · l,
we have that 1 · ibj · |Sb| and the content of data structure
following S1 up to i1j and S2 up to i2j are identical.
A data structure implementation is strongly history independent if
for any S1,P1,S2,P2 the distributions of the memory representations
at the points i1j and i2j are identical.
36
Strong History Independence [NT01]
i14
P1
i11
i12
i13
i15
S1
D1

 
P2
i21
i22
…

i23
i24
i25
S2
D2

 


…
37
Weak History Independence [NT01]
Definition
Weakly
A data structure implementation is history independent if
any two sequences S1 and S2 that yield the same content
induce the same distribution on the memory representation.
Similar to definition [Mic97],
except this is specified on memory representation
38
Checkpoint
How HIDS got started
Why obliviousness is inadequate
What WHI and SHI are
The rest of this talk:
- WHI is bearable
- SHI is hard
39
Results in [NT01]
Besides the definitions,
WHI object allocators
Applied to Dynamic perfect hashing
SHI hash table
Currently does not support deletion
WHI Union-Find (sketch)
40
Incremental Card Shuffling
Insert(new card x)
Put the x at the end
Delete(card at position j)
Swap card with the last card
Swap x with another card,
chosen u.a.r.
Discard the (new) last card
Lemma
The permutation of cards remains random with these procedures.
41
Fixed Size Object Allocator
Turn any bounded-indegree, fixed-size record oblivious data
structure into WHI
Record allocation takes O(1) time
Need to be careful when moving blocks
Examples
Treaps
Oblivious 2-3 Trees
42
Variable Size Object Allocator
Create buckets for objects of geometrically increasing size
B1, B2, B4, B8... (exercise :P)
Record allocation takes O(s log s) time
From here, adapt the dynamic perfect hashing into WHI
44
Going Back To Card Shuffling
Insert(new card x)
Put the x at the end
Don’t store the
randomness and
we get WHI
Swap x with another card,
chosen u.a.r.
But that also
rules out SHI
45
Example
Ins(4)
Ins(5)
Del(4)
D1
D2
3, 1, 2
3, 1, 2
3, 4, 2, 1
3, 5, 2, 1
Ins(5)
3, 4, 5, 1, 2
3, 2, 5, 1
46
Ordered Hashing
In open addressing, how do you break tie in a collision?
Resolve by alphabetical order
O. Amble, D. Knuth
Ordered hash tables
- The Computer Journal 17(2), 1974
1
return
2
if
3
for
4
struct
5
while
6
int
7
float
47
SHI Hash Table
Looks like Ordered Hashing on Steroid
Define a priority function p(i, x, y) :
{0, …, N-1} £ U £ U  {True, False}
Round #
Contestants
Theorem
If p(i, x, y) defines a total order, then the hash table is SHI.
48
SHI Hash Table
Youth-rules
p(i, x, y) =
true if age(i,x) <age(i,y)
true if age(i,x) = age(i,y) and x <y
false otherwise
Theorems
For any sequence of insertions the expected amortized insertion
probe-time for an element is 1/(1-).
For any element, the expected probe-time for successful or
unsuccessful search is 1/(1-)
49
Observation in [NT01]
“It is interesting to note that all the SHI data structures
we have found have the property that each data
structure content has a unique representation.”
SHI ) Unique Representation?
50
SHI ) Unique Representation
51
[H.02]
J. Hartline, E. S. Hong, A. E. Mohr,
W. R. Pentney, E. C. Rocke
Characterizing History Independent
Data Structures
- ISSAC 2002
52
Results in [H.02]
A (somewhat) simpler definition of WHI and SHI
Show SHI ) Unique Representation
Relax SHI to a new definition: SHI*
The adversary can distinguish between empty and
non-empty sequences of operations
Show how to resize WHI data structures
53
Hysteresis
Your wedding
is running out
of budget?
Capacity
2n
n
0
0.75n
n
#Items
54
[BP03]
Niv Buchbinder, Erez Petrank
Lower and Upper Bounds on
Obtaining History Independence
- Crypto 2003
56
Results In [BP03]
SHI ) Unique Representation
Predated by [H.02]
SHI Heap Lowerbound (in the comparison-based model)
Insert, Extract, Increase-Key: (n)
WHI Heap Upperbound
Build-Heap: O(n); Increase-Key: O(log n)
Insert and Extract-Max: expected O(log n)
57
Heap Review
Binary Heap
10
9
7
8
3
5
6
4
1
2
10 9 7 8 5 4 1 3 6 2
58
www.HeapOrNot.com
Heapify
2
10
10
8
3 6 5
7
9
4
9
1
8
7
5
4
1
3 6 2
Heapify-1
59
Heap  Permutation
Build-Heap-1 (H)
If size of H is 1, Return H
Choose a node i u.a.r. among all nodes in H
H Ã heapify-1(H, i)
Return MakeNode(root(H),
Build-Heap-1(HL),
Build-Heap-1(HR))
60
WHI Heap
1: Pick u.a.r. a heap among all possible heaps with
values v1, …, vn.
2: Pick u.a.r. a permutation on the values v1, …, vn.
Place them in an (almost) full tree and call Build-Heap.
Theorem
1 and 2 are actually the same distribution.
61
Simulation
10
7
8
3
5
6
4
10,9,7,8,5,4,1,3,6,2
Build-Heap-1
9
Build-Heap-1
1
2
8,2,4,5,3,9,10,1,7,6
Insert
11
9
10
7
1
6
5
2
4
Build-Heap
8,2,4,5,3,11,10,1,7,6,9
8
3
62
[A.04]
U. A. Acar, G. E. Blelloch, R. Harper,
J. L. Vittes, S. L. M. Woo
Dynamizing Static Algorithms with
Applications to Dynamic Trees and
History Independence
- SODA 2004
Also [ABH02], [ABH03]
63
Dynamiziation
10
9
7
5
6
4
10,9,7,8,5,4,1,3,6,2
Permute
8
3
Build-Heap
1
2
8,2,4,5,3,9,10,1,7,6
Insert
11
9
10
7
1
6
5
2
4
Build-Heap
8,2,4,5,3,11,10,1,7,6,9
8
3
64
Uniquely Represented Dictionaries
65
[Sny77]
Lawrence Snyder
On Uniquely Represented
Data Structures
- FOCS 1977
66
Motivating Questions
Data Structure
Uniquely Represented
Cost
Linked list
Heap
AVL tree
2-3 tree
yes
yes
no
no
O(n)
O(n)
O(log n)
O(log n)
unique representation ) linear cost?
logarithmic cost ) redundant representation?


67
Jellyfish
Update and Search take O(n1/2) time
8
body
4
12
1
5
9
13
2
6
10
14
3
7
11
15
tentacles
68
Snyder’s Lowerbound
Theorem
If a search tree uniquely represents its data, then
“The argumentation of this lemma is quite involved and will not be presented here.”
69
[AO91]
Arne Andersson, Thomas Ottmann
New Tight Bounds on Uniquely
Represented Dictionaries
- FOCS 1991
70
71
72
73
[ST90]
Rajamani Sundar, Robert E. Tarjan
Unique Binary Search Tree Representations
and Equality-testing of Sets and Sequences
- STOC 1990
74
Future
76
Open Problems
SHI Hash table with deletion (Working on it)
SHI* seems to be a very reasonable definition
See what can be done in this model
77
Q&A
78
Goooooooooooooooogle is Hiring
Amitabh Sinha
Operations and
Mgmt. Science,
U.Mich. B. Sch.
Ke Yang
Nikhil Bansal
Google
IBM Research
2004-5-15
79
• They are hiring!
• Talk to Ke Yang if you are interested…
[email protected]
(917) 881-2797
He’ll be around until 2004-Oct-1.
Ask him for a GMail account.
80
End
81
Color Test
85