CryptDB: A Practical Encrypted
Relational DBMS
Raluca Ada Popa, Nickolai Zeldovich, and Hari Balakrishnan
MIT CSAIL
New England Database Summit 2011
Hackers
Curious DB administrators
Physical attacks
Both on public clouds and private data centers
Regulatory laws
Approach
Perform SQL query processing on encrypted data
Client frontend (trusted)
– Stores schema, master key
– No query execution
Database server (not trusted to keep data private)
– Stores the database and processes SQL queries
1. Support standard SQL queries on encrypted data
2. Process queries completely at the server
3. No change to existing DBMS
[Figure: user queries flow from the client frontend to the DB server.]
Example
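Application query:
SELECT * FROM emp
WHERE salary = 100
Query the frontend sends to the server:
SELECT * FROM table1
WHERE col1 = x5a8c34
[Figure: table emp (columns rank, name, salary) is stored at the server with every value encrypted; the plaintext values 60, 100, 800, 100 appear there only as ciphertexts such as x5a8c34, x638e5, x922eb4, and the server returns the rows matching the encrypted constant. The slide also shows a ≥ variant of the predicate.]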
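The rewriting step in this example can be sketched in a few lines. This is only an illustration under stated assumptions: the name map, the key, and the helper names are hypothetical, and a truncated HMAC tag stands in for a real deterministic ciphertext of the constant.

```python
import hashlib
import hmac

# Hypothetical frontend state: anonymized names and a secret key (demo values).
NAME_MAP = {"emp": "table1", "salary": "col1"}
MASTER_KEY = b"frontend master key (demo only)"

def det_token(column, value):
    """Keyed deterministic token for a constant (stand-in for a DET ciphertext)."""
    msg = f"{column}:{value}".encode()
    return "x" + hmac.new(MASTER_KEY, msg, hashlib.sha256).hexdigest()[:6]

def rewrite_equality_select(table, column, value):
    """Rewrite SELECT * FROM <table> WHERE <column> = <value> for the server."""
    return (f"SELECT * FROM {NAME_MAP[table]} "
            f"WHERE {NAME_MAP[column]} = '{det_token(column, value)}'")

# The server sees only anonymized names and an opaque constant.
print(rewrite_equality_select("emp", "salary", 100))
```

Because the token is deterministic, the same constant always maps to the same opaque value, which is what lets the server evaluate the equality without seeing 100.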
Two techniques
1. SQL-aware encryption strategy
– Different encryption schemes provide different
functionality
2. Adjustable query-based encryption
– Adapt encryption of data based on user queries
1. SQL-aware encryption
Highest privacy at the top.
Scheme   Operation                                                Details
RND      None                                                     AES in UFE
HOM      +, *                                                     e.g., Paillier
DET      equality (e.g., =, !=, GROUP BY, IN, COUNT, DISTINCT)    AES in CTR
JOIN     join                                                     new
SEARCH   ILIKE                                                    Song et al. ’00
OPE      order (e.g., >, <, ORDER BY, SORT, MAX, MIN)             Boldyreva et al. ’09; first practical implementation
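To make the HOM row concrete, here is a toy textbook Paillier sketch showing how a server could add values it cannot read. The primes are deliberately tiny and the function names are hypothetical; this is a sketch of the general scheme, not CryptDB's code, and it needs Python 3.8+ for the modular inverse via pow.

```python
import math
import secrets

# Toy parameters: real deployments use primes of 1024 bits or more.
P, Q = 10007, 10009
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                                 # valid because g = N + 1

def encrypt(m):
    """Randomized encryption of an integer 0 <= m < N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (1 + N)^m mod N^2 equals 1 + m*N, so no big exponentiation is needed for g^m.
    return ((1 + m * N) * pow(r, N, N2)) % N2

def decrypt(c):
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2

total = encrypt(0)
for salary in (60, 100, 800, 100):
    total = add(total, encrypt(salary))   # done without any key
print(decrypt(total))
```

Only the holder of LAM and MU (the frontend, in the talk's architecture) can read the sum; the party doing the additions works purely on ciphertexts.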
Onions of encryptions
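Onion 1 (any value): RND, DET, SEARCH, JOIN
Onion 2 (any value): RND, OPE, OPE-JOIN
Onion 3 (int value): HOM
[Figure: each onion wraps the value in the listed layers, with RND as the outer layer.]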
Each column has the same key in a given layer of an
onion
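A minimal sketch of one onion, assuming a toy cipher built from HMAC-SHA256 (not the AES modes named in the table) and a hypothetical layer_key derivation that gives every column one key per layer. It shows a DET layer wrapped inside an RND layer and what removing the outer layer exposes.

```python
import hashlib
import hmac
import os

def _stream_xor(key, iv, data):
    """XOR data with an HMAC-SHA256 counter-mode keystream (toy cipher)."""
    stream = b""
    block = 0
    while len(stream) < len(data):
        stream += hmac.new(key, iv + block.to_bytes(4, "big"), hashlib.sha256).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def rnd_encrypt(key, data):
    iv = os.urandom(16)                  # fresh IV: equal values look unrelated
    return iv + _stream_xor(key, iv, data)

def det_encrypt(key, data):
    iv = hmac.new(key, b"iv" + data, hashlib.sha256).digest()[:16]  # IV derived from the value
    return iv + _stream_xor(key, iv, data)

def strip_layer(key, blob):
    """Remove one layer; both toy schemes store the IV in front."""
    return _stream_xor(key, blob[:16], blob[16:])

def layer_key(master, column, layer):
    """Hypothetical derivation: one key per (column, layer)."""
    return hmac.new(master, f"{column}/{layer}".encode(), hashlib.sha256).digest()

def build_onion(master, column, value):
    inner = det_encrypt(layer_key(master, column, "DET"), value)
    return rnd_encrypt(layer_key(master, column, "RND"), inner)

master = b"demo master key"
a = build_onion(master, "salary", b"100")
b = build_onion(master, "salary", b"100")
k_rnd = layer_key(master, "salary", "RND")
k_det = layer_key(master, "salary", "DET")
print(a != b)                                          # RND hides equality
print(strip_layer(k_rnd, a) == strip_layer(k_rnd, b))  # DET exposes it
print(strip_layer(k_det, strip_layer(k_rnd, a)))       # frontend recovers the value
```

Sharing one key per layer across a column is what makes the stripped DET values in that column comparable to one another.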
2. Adjustable query-based encryption
Start out the database with the most secure
encryption scheme
Adjust encryption dynamically
Strip off levels of the onions: frontend gives key to
server using a UDF
Example
[Figure: onion 1 of column salary in table emp (columns rank, name, salary), with layers RND, DET, SEARCH, JOIN around the value.]
SELECT * FROM emp WHERE salary = 100000
UPDATE table1 SET col3onion1 =
DecryptRND(key, col3onion1)
SELECT * FROM table1 WHERE col3onion1 = x5a8c34
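The UPDATE-then-SELECT sequence above can be tried end to end with SQLite standing in for the unmodified DBMS. Everything here is an assumption made for illustration: the toy HMAC-based cipher, the keys, the sample salaries, and the Python function registered under the name DecryptRND.

```python
import hashlib
import hmac
import os
import sqlite3

def _stream_xor(key, iv, data):
    """XOR data with an HMAC-SHA256 counter-mode keystream (toy cipher)."""
    stream = b""
    block = 0
    while len(stream) < len(data):
        stream += hmac.new(key, iv + block.to_bytes(4, "big"), hashlib.sha256).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def rnd_encrypt(key, data):
    iv = os.urandom(16)
    return iv + _stream_xor(key, iv, data)

def det_encrypt(key, data):
    iv = hmac.new(key, b"iv" + data, hashlib.sha256).digest()[:16]
    return iv + _stream_xor(key, iv, data)

def decrypt_layer(key, blob):
    return _stream_xor(key, blob[:16], blob[16:])

K_RND, K_DET = b"demo RND key for col3", b"demo DET key for col3"

db = sqlite3.connect(":memory:")
db.create_function("DecryptRND", 2, decrypt_layer)   # the server-side UDF
db.execute("CREATE TABLE table1 (col3onion1 BLOB)")
for salary in (b"60000", b"100000", b"800000", b"100000"):   # made-up rows
    onion = rnd_encrypt(K_RND, det_encrypt(K_DET, salary))
    db.execute("INSERT INTO table1 VALUES (?)", (onion,))

def count_matches(salary):
    token = det_encrypt(K_DET, salary)
    query = "SELECT COUNT(*) FROM table1 WHERE col3onion1 = ?"
    return db.execute(query, (token,)).fetchone()[0]

before = count_matches(b"100000")   # column still at RND: nothing can match
# The frontend hands over the RND key once; the server strips the layer itself.
db.execute("UPDATE table1 SET col3onion1 = DecryptRND(?, col3onion1)", (K_RND,))
after = count_matches(b"100000")    # column now at DET: equality works
print(before, after)
```

The server never receives K_DET, so after the UPDATE it can compare salaries for equality but still cannot read them.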
JOIN needs new crypto
Challenge: do not know which columns will be joined
[Figure: the client's frontend gives the server a join key for Col1-Col2, which lets the server test Col1 = Col2 on the encrypted columns.]
Data items not revealed, cannot join without join key
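The slide does not spell out the construction, so the following is only a toy illustration of how a per-pair join key could work, assuming tokens of the form G^(key * value) in a small prime-order group. The group parameters, keys, and function names are all hypothetical, small integers are used directly instead of hashed values, and nothing here is claimed to be CryptDB's actual scheme.

```python
# Toy group: P = 2*Q + 1 with P and Q prime; G generates the subgroup of order Q.
P, Q, G = 1019, 509, 4

def join_token(col_key, value):
    """Token stored at the server for a value (toy: 0 < value < Q)."""
    return pow(G, (col_key * value) % Q, P)

def join_key(k_from, k_to):
    """Exponent that re-keys tokens from k_from to k_to (Python 3.8+ for pow(..., -1, Q))."""
    return (k_to * pow(k_from, -1, Q)) % Q

def adjust(token, delta):
    """Server-side re-keying; needs only the join key, never the column keys."""
    return pow(token, delta, P)

k1, k2 = 123, 377                      # per-column secrets held by the frontend
col1 = [join_token(k1, v) for v in (7, 100, 300)]
col2 = [join_token(k2, v) for v in (100, 42)]

print(set(col1) & set(col2))           # different keys: no matches visible

delta = join_key(k2, k1)               # frontend releases a key for this pair only
col2_adjusted = [adjust(t, delta) for t in col2]
print(set(col1) & set(col2_adjusted))  # the shared value 100 now matches
```

Without delta the two columns look unrelated, which mirrors the slide's point that the server cannot join without the join key.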
Further components
Inserts, updates, deletes, nested queries
Indexes
Transactions, auto-increments
Optimizations to speed up performance
Not supported: A.a + A.b > B.c
Security converges…
… to maximum privacy for query mix
Onion levels stripped only when new operations
needed
Steady State: no decryptions at server
Practical: typical SQL processing on
enlarged tuples
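The convergence argument can be illustrated with a small bookkeeping sketch of the frontend's per-column state. The mapping from query operations to layers is an assumption made for this example, the SEARCH layer is left out to keep it short, and the class and method names are hypothetical.

```python
# Layers from outermost to innermost (SEARCH omitted for brevity).
ONIONS = {
    "onion1": ["RND", "DET", "JOIN"],
    "onion2": ["RND", "OPE", "OPE-JOIN"],
}
# Assumed mapping from the kind of predicate to the layer the server must see.
NEEDS = {
    "equality": ("onion1", "DET"),
    "join": ("onion1", "JOIN"),
    "order": ("onion2", "OPE"),
}

class ColumnState:
    """Tracks how far each onion of one column has been peeled."""

    def __init__(self):
        self.level = {name: 0 for name in ONIONS}

    def layers_to_strip(self, operation):
        """Return the layers that must be removed before this operation can run."""
        if operation not in NEEDS:
            return []                       # e.g., aggregation via HOM, or no filter
        onion, layer = NEEDS[operation]
        current = self.level[onion]
        target = ONIONS[onion].index(layer)
        stripped = ONIONS[onion][current:target]
        self.level[onion] = max(current, target)
        return stripped

    def exposed(self, onion):
        return ONIONS[onion][self.level[onion]]

salary = ColumnState()
for op in ("equality", "equality", "aggregation", "equality"):
    print(op, salary.layers_to_strip(op))
```

Only the first equality query triggers a strip; every later query in the same mix finds the column already at the needed layer, which is the steady state with no decryptions at the server.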
Privacy Guarantees
Formal privacy definition and proof
Implications:
Table emp (columns rank, name, salary). If a query has:
• equality predicate on name: reveals repeats
• order predicate on name: reveals order
• aggregation on salary: reveals nothing
• no filter on a column: reveals nothing
Never reveal plaintext
Server cannot compute unrequested queries
requiring new relationships
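What "repeats" and "order" mean for an observer at the server can be shown with two toy encodings. Both are stand-ins chosen for brevity and labeled as assumptions: a truncated HMAC tag for a deterministic scheme, and a sum of keyed positive gaps for an order-preserving one (this is not the Boldyreva et al. construction).

```python
import hashlib
import hmac

KEY = b"demo column key"

def det_tag(value):
    """Deterministic tag: equal inputs give equal outputs, nothing else is visible."""
    return hmac.new(KEY, value, hashlib.sha256).hexdigest()[:12]

def _gap(i):
    """Keyed pseudo-random gap of at least 1 for domain point i."""
    digest = hmac.new(KEY, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest[:2], "big")

def ope_encode(x):
    """Toy order-preserving encoding for small non-negative integers."""
    return sum(_gap(i) for i in range(x + 1))

names = [b"alice", b"bob", b"alice", b"carol"]
tags = [det_tag(n) for n in names]
print(tags[0] == tags[2], tags[0] == tags[1])   # repeats are visible

salaries = [60, 100, 800, 100]
codes = [ope_encode(s) for s in salaries]
# The server can rank rows by code and obtains the plaintext ranking.
print(sorted(range(4), key=lambda i: codes[i]) ==
      sorted(range(4), key=lambda i: salaries[i]))
```

In both cases the observer learns a relationship between rows (which are equal, or which is larger), not the underlying values.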
Privacy (cont’d)
DB owner can specify minimum security level
for some fields
CREATE TABLE emp (SSN text ≥ DET, name text, …)
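One way a frontend might honor such an annotation is a simple check before any layer is stripped. The ordering list and function name below are assumptions for illustration, following the layer order of onion 1 without SEARCH.

```python
# Assumed privacy ordering for onion 1, most private first.
PRIVACY_ORDER = ["RND", "DET", "JOIN"]

def may_strip_to(min_level, target):
    """True if exposing layer `target` keeps the column at or above `min_level`."""
    return PRIVACY_ORDER.index(target) <= PRIVACY_ORDER.index(min_level)

# SSN was declared with a minimum level of DET.
print(may_strip_to("DET", "DET"))    # equality queries on SSN are allowed
print(may_strip_to("DET", "JOIN"))   # a query needing JOIN would be refused
```

A query that would push the column below its declared minimum is rejected by the frontend instead of triggering a strip.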
Implementation
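[Figure: the application sends a Query through the SQL interface to the CryptDB frontend, which sends an Encrypted Query to the server. The server runs an unmodified DBMS extended with CryptDB PK tables and CryptDB UDFs (user-defined functions) and returns Encrypted Results, which the frontend turns into Results.]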
No change to the DBMS
Should work on most SQL DBMS
Portability
Ported CryptDB from Postgres to MySQL with
86 lines of code
No change to MySQL
The changed code covers connecting to the server and the UDF declarations
Low overhead on TPC-C
• Supports all queries in TPC-C without change
• Throughput loss: 27%
[Figure: microbenchmarks from TPC-C]
Adjustable encryption
Steady state of columns for TPC-C:
71% of columns remain encrypted with RND
Importance of adjustable query-based
encryption to privacy
In practice, we expect most sensitive fields to
remain at RND or DET (e.g., credit cards)
Related work
Theoretical approaches [Gennaro et al., ’10]
– Inefficient
Search on encrypted data (e.g., [Chang, Mitzenmacher ’05],
[Evdokimov, Guenther ’07])
– Restricted set of queries, inefficient
Systems proposals (e.g., [Hacigumus et al., ’02])
– Lower degree of security, rewrite the DBMS, client-side
processing
Conclusions
CryptDB is the first practical DBMS for running
most standard queries on encrypted data
– Runs queries completely at server
– Provides provable privacy guarantees
– Modest overhead
– Does not change the DBMS or client applications
Thanks!