ppt - UCSB Computer Science - University of California, Santa Barbara

Download Report

Transcript ppt - UCSB Computer Science - University of California, Santa Barbara

Secure and Privacy-Preserving
Database Services in the Cloud
Divy Agrawal, Amr El Abbadi, Shiyuan Wang
University of California, Santa Barbara
{agrawal, amr, sywang}@cs.ucsb.edu
ICDE’2013 Tutorial
Cloud Computing
• Successful paradigm for computing and storage
• Features
–
–
–
–
Pay per use
No up-front cost for deployment
Scalability
Elasticity
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
4/11/2013
ICDE 2013 Tutorial
2
Adopting the Cloud
•
•
•
•
•
Emails
Collaboration
Early adopters are mainly low risk apps
Administrative apps
with less sensitive data
Conferencing software
Education
Sensitive Data
4/11/2013
ICDE 2013 Tutorial
3
Cloud – A Tempting Attack Target
• Why the cloud?
– Ubiquitous access to consolidated data.
– Shared infrastructure economies of scale
– A lot of small and medium businesses
• Why attack?
– Target one service provider, attack multiple
companies
– Financial gain from trading sensitive information
4/11/2013
ICDE 2013 Tutorial
4
Cloud Provides Novel Attack Opportunities
• Co-residence attack [Ristenpart et al. CCS’09]
–
–
–
–
Adversary: non-provider-affiliated malicious parties
Map and identify location of target VM
Place attacker VM co-resident with target VM
Cross-VM side-channel attacks (due to sharing of physical
resources): eg, number of visitors to a page, or keystroke
attacks for password retrieval.
• Signature wrapping attack [Somorovsky et al. CCSW’11]
–
–
–
–
4/11/2013
Control Interface compromise by capturing a SOAP msg.
Manipulate SOAP message with arbitrary XML fragments
Use XML signature vulnerability to pass authentication
Take control of a victim’s account
ICDE 2013 Tutorial
5
Amazon’s Best Practices for Cloud
Security and Privacy
• Concerns
• Defenses [AWS security]
– Co-residence attacks
– Side channel attacks
– dedicated instances, virtual
private cloud, isolated network
and traffic
– Network based attacks
– Firewall and access control
– Unauthorized accesses
– Identity and access management,
multi-factor authentication
– Insider attacks
– accesses checked and audited
– Privacy violation
– Rely on clients for access control
– Future vulnerabilities?
– Recommend using data
encryption and encrypted file
system
Best effort defense is not sufficient
4/11/2013
ICDE 2013 Tutorial
6
A Barrier to Conquer
• Security and privacy –
a barrier to cloud
adoption
• Data (sensitive data) –
a key concern
• We need to solve data
security and privacy
problems in the cloud
4/11/2013
ICDE 2013 Tutorial
7
Outline
• Database Security and Privacy: General Practice
in the DB Community
• Data Security and Privacy in the Cloud
• Data Confidentiality
• Access Privacy
• Open Research Challenges
4/11/2013
ICDE 2013 Tutorial
8
Access Control [Bertino et al. TDSC’05]
• Problem Statement: authorizing data access scopes
(relations, attributes, tuples) to users of DBMS
• Discretionary access control
– Authorization administration policies, ie, granting and
revoking authorization (centralized, ownership, etc)
– Content-based using views and rewriting for fine-grained
access control
– Role-based access control: a function with a set of actions,
consisting of users members
• Mandatory access control:
– Object and subject classification (eg, top secret, secret,
unclassified, etc).
4/11/2013
ICDE 2013 Tutorial
9
Data Anonymization
• Problem: protecting Personally Identifiable
Information (PII) and their sensitive attributes
Quasi-identifier
Sensitive
DOB
Gender
Zipcode
Disease
1/21/76
Male
53715
Heart Disease
4/13/86
Female
53715
Hepatitis
2/28/76
Male
53703
Brochitis
1/21/76
Male
53703
Broken Arm
4/13/86
Female
53706
Flu
2/28/76
Female
53706
Hang Nail
Quasi-identifiers
need to be
generalized
or suppressed
Quasi-identifiers are sets of attributes that can be linked
with external data to uniquely identify an individual
4/11/2013
ICDE 2013 Tutorial
10
Solution: k-Anonymity
[Samarati et al. TR’98]
• Quasi-identifiers indistinguishable among k individuals
• Implemented by building generalization hierarchy or
partitioning multi-dimensional data space
Equivalence class
share same QI
4/11/2013
ICDE 2013 Tutorial
11
Enhanced Solution: l-Diversity
[Machanavajjhala et al. ICDE’06]
• At least l values for sensitive attributes in each
equivalence class
A 3-diverse patient table
4/11/2013
Zipcode
Age
Salary
Disease
476**
2*
20K
Gastric Ulcer
476**
2*
25K
Gastritis
476**
2*
30K
Stomach Cancer
4790*
≥40
50K
Gastritis
4790*
≥40
100K
Flu
4790*
≥40
70K
Bronchitis
476**
3*
60K
Bronchitis
476**
3*
80K
Pneumonia
476**
3*
90K
Stomach Cancer
ICDE 2013 Tutorial
12
Enhanced Solution: t-Closeness
[Li et al. ICDE’07]
• Distance between overall distribution of sensitive
attribute values and distribution of sensitive
attribute values in an equivalence class bounded by t
4/11/2013
ICDE 2013 Tutorial
13
Differential Privacy for Statistical Data
[Dwork ICALP’06]
• Strong privacy guarantees while querying a
database
Query
P(A)
A
Indistiguishable!
PERTURBATION
Query
P(A’)
A’
PERTURBATION
 A randomized function K gives ε-Differential Privacy IFF for all datasets
D1 and D2 differing on at most one element, and all S Î Range (K)
ln
Pr[K(D1 ) Î S]
£e
Pr[K(D2 ) Î S]
Thanks to Ben Zhao for this slide
16
Access Control & Privacy
[Chaudhuri et al. CIDR’11]
Hybrid System combining
authorization predicates and
“noisy” views
4/11/2013
ICDE 2013 Tutorial
17
Secure Devices for Privacy
[Anciaux et al. SIGMOD’07]
• Problem: protecting private data during queries involving both
private (hidden) and public (visible) data
• Solution: carry private data in a secure USB key, ensure private data
never leaves the USB key, and only public data flows to the key
• Query optimization for small RAM USB key
4/11/2013
ICDE 2013 Tutorial
18
Outline
• Database Security and Privacy
• Data Security and Privacy in the Cloud
• Data Confidentiality
• Access Privacy
• Open Research Challenges
4/11/2013
ICDE 2013 Tutorial
19
A LOT OF PROBLEMS NEED TO BE
TAKEN CARE OF
SOME PROBLEMS ARE OLD
SOME PROBLEMS ARE AMPLIFIED BY
THE CLOUD
4/11/2013
ICDE 2013 Tutorial
20
Problems Amplified by the Cloud
• Access privacy
• Data confidentiality
– Attacks
– Attacks
• Inferences on access
patterns or query results
• Unauthorized accesses,
side channel attacks
– Solutions
– Solutions
• Private information
retrieval
• Query obfuscation
• Encryption, querying
encrypted data
• Trusted computing
Query
Data
Answer
User
Cloud Servers
4/11/2013
ICDE 2013 Tutorial
21
Data Services in the Cloud
DB Queries
Functionality
Performance
Adversaries: curious but not malicious
cloud / insiders
3rd party attackers
Actions: obtain / infer data and queries
4/11/2013
ICDE 2013 Tutorial
22
Challenges: Conflicting Goals
High
Functionality
Performance
Low
4/11/2013
Existing
Services
Ideal State
Many Crypto
Systems/Protocols
Confidentiality / Privacy
ICDE 2013 Tutorial
High
23
Outline
• Database Security and Privacy
• Data Security and Privacy in the Cloud
• Data Confidentiality
• Access Privacy
• Open Research Challenges
4/11/2013
ICDE 2013 Tutorial
24
Data Confidentiality
• 1. Encryption
– Homomorphic encryption
– Partition Index
– Order-preserving encryption
– Encrypted Index
• 2. Leveraging Trust
– Distribution
– Trusted computing
4/11/2013
ICDE 2013 Tutorial
25
Database as a Service
[Hacigümüs et al. ICDE’02]
• Protects data from steeling but plaintext data can
still be seen on the server
• Write – encrypt before storing
– insert into lineitem (discount) values (encrypt(10,key))
• Read – decrypt before access
– select decrypt(discount,key) from lineitem where
custid = 300
• Encryption alternatives
– Software level v.s. Hardware level (cryptographic
coprocessor) encryption
– Granularity: field, row, page
4/11/2013
ICDE 2013 Tutorial
26
Keyword Search on Encrypted Texts
[Song et al. S&P’00]
• Directly search on encrypted data without decryption
on server side
• Encrypt word by word. For word Wi
– Block_ciphertext Xi = Ek(Wi), Word key ki = fk(Xi),
Pseudorandom sequence Ti = <Si, Fki(Si)>
– Searchable_ciphertext Ci = Xi Ti
• Search for a word W
– Block_ciphertext X = Ek(W), Word key ki = fk(X)
– Check ciphertexts one by one to see if C X = (X
i Ti) X is
of the form <s, Fki(s)> for some random value s
4/11/2013
ICDE 2013 Tutorial
27
Homomorphic Encryption
• Paillier’s cryptosystem
– 𝐷 𝐸 𝑚1 𝐸 𝑚2 𝑚𝑜𝑑 𝑛2 = 𝑚1 + 𝑚2 𝑚𝑜𝑑 𝑛
– 𝐷 𝐸 𝑚1 𝑚2 𝑚𝑜𝑑 𝑛2 = 𝑚1 𝑚2 𝑚𝑜𝑑 𝑛
• Fully Homomorphic Encryption [Gentry
CACM’10]
– Enable arbitrary functions over encrypted data
– Addition, multiplication, binary operations
4/11/2013
ICDE 2013 Tutorial
28
Homomorphic Encryption
Operation
X86-64
Intel Core 2 @ 2.1 GHz
SH_Keygen
250 ms
SH_Enc
24 ms
SH_Add
1 ms
SH_Mul
41 ms
SH_Dec (2-element ciphertext) 15 ms
SH_Dec (3-element ciphertext) 26 ms
1 million data
Aggregation: 16 minutes
Range query: 11 hours
4/11/2013
Too expensive to be practical
From Kristen Lauter’s Slides @ MSR Faculty Summit 2011
ICDE 2013 Tutorial
29
WE NEED PRACTICAL SOLUTIONS TO
QUERYING ON ENCRYPTED DATABASE
4/11/2013
ICDE 2013 Tutorial
30
Partition and Identification Index
[Hacigümüs et al. SIGMOD’02]
• E(tuple): encrypted-tuple, {attribute-index}
• Attribute-index: attribute value partition ids
2
0
4/11/2013
7
200
5
400
1
600
ICDE 2013 Tutorial
4
800
1000
31
Partition and Identification Index
• Client knows a map function, Map(val) = id of
the partition containing val
Random mapping
2
0
7
5
400
200
1
4
800
600
1000
Order-preserving mapping
1
0
4/11/2013
2
200
4
400
5
600
ICDE 2013 Tutorial
7
800
1000
32
Mapping Predicate Conditions
• Map(< val) : ids of the partitions that could contain
values < val
• E.g. Map(eid < 280) = {2, 7} for random mapping
• Map(> val) : ids of the partitions that could contain
values > val
• Map(Ai = Aj): pairs of ids of the partitions that could have
equal Ai and Aj values
• Decryption and processing on the client
4/11/2013
ICDE 2013 Tutorial
33
Mapping Predicate Conditions
emp.did = mrg.did
4/11/2013
ICDE 2013 Tutorial
34
Optimal Partition for Range Queries
[Hore et al. VLDB’04]
• Optimal for privacy-performance tradeoff
• Performance: minimize number of false positives over
all range queries in a given query distribution
– False positives caused by server returning a superset of
answers
• Privacy: maximize variance, entropy of value
distribution in a partition
– High variance – increase adversaries’ error in inferring
sensitive attribute values
– High entropy – reduce adversaries’ ability to identify
encrypted tuples satisfying a plaintext query
4/11/2013
ICDE 2013 Tutorial
35
Partition / Bucketization Review
• Pros
– Efficient computation on the server
• Cons
– Data update is hard (may need re-distribution)
– Filtering super answer set could be time
consuming depending on the partitions sizes
– Might reveal value distribution from relative
partitions changes during dynamic data updates
4/11/2013
ICDE 2013 Tutorial
36
Can Ciphertext Be Queried Directly
• Encryption with special properties that allow
predicate evaluation on ciphertexts
• Order-preserving partition mapping  orderpreserving encryption
4/11/2013
ICDE 2013 Tutorial
37
Order Preserving Encryption
[Agrawal et al. SIGMOD’04]
4/11/2013
ICDE 2013 Tutorial
38
Achieving Order Preserving Encryption
4/11/2013
ICDE 2013 Tutorial
39
Order-Preserving Review
• Pros
– Return exact answers instead of super sets
– Can leverage existing DB index
• Cons
– Hard to perform analysis and aggregation
– Some tuples could be easily identified if approach
is applied to multiple attributes
4/11/2013
ICDE 2013 Tutorial
40
CryptDB [Popa et al. SOSP’11]
• Supports a wide range of SQL queries over encrypted data
• Server fully evaluates queries on encrypted data, and client does
not perform query processing
• SQL-aware encryption
– leverage provable practical techniques for different SQL operators over
encrypted data
• Adjustable query-based encryption
– Dynamically adjust the encryption level of data items according to
user’s queries
• Onion of encryptions
– From weaker forms of encryption that allow certain computation to
stronger forms of encryption that reveal no information
4/11/2013
ICDE 2013 Tutorial
41
SQL-Aware Onion Encryption
RND: no functionality
RND: no functionality
DET: equality selection
OPE: comparison
SEARCH: word selection
(only for text fields)
OPE-JOIN: inequality join
JOIN: equality join
Any value
Any value
HOM: sum
int value
4/11/2013
ICDE 2013 Tutorial
42
CryptDB System
For sending certain onion layer key
For performing cryptographic operations
4/11/2013
ICDE 2013 Tutorial
43
CryptDB Review
• Pros
– Support a wide range of SQL queries
• Cons
– Confidentiality level degrades to the weakest
encryption in the long term
4/11/2013
ICDE 2013 Tutorial
44
WHY CAN WE NOT LEVERAGE WELL
PROVED ENCRYPTION MECHANISMS
AND DB INDEXING TECHNIQUES
4/11/2013
ICDE 2013 Tutorial
45
Encrypted Index for Outsourced Data
• Build a normal B+-tree index on key values
• Encrypt B+-tree nodes
• Store (and disperse) encrypted index in the cloud
[Damiani et al. CCS’03, Wang et al. SDM’11]
• A query with predicates on keys is processed by
locating desired key values on encrypted index.
• Traversal on index relies on the client to retrieve
and decrypt index nodes.
4/11/2013
ICDE 2013 Tutorial
46
n2
I: B+-tree
Index
D: Data Tuples
…
…
t1
t2
.
.
n1
…
…
…
…
…
…
.
.
,
tN
A1
A2
…
Ad
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
Salted IDA
ID
…
…
…
…
…
…
…
…
…
…
…
n1 n2
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
IE
TE
E(n1)E(n2)
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
S1
Si
Sn
E(tc1)E(tc2)
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
TD
…
…
…
…
…
…
…
…
…
…
…
tc1 tc2
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
Cloud
Servers
4/11/2013
ICDE 2013 Tutorial
47
Practical Secure Query Processing
IE
root
Index I
…
…
Cache partial index nodes
on…client to improve efficiency
n1
…
…
1
2
E(n1)E(n2)
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
TE
E(tc1)E(tc2)
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
S1
IEcol1
IE:1
Client
4/11/2013
E(n1)
…
…
…
TEcol2
IEcol1 TEcol2
IE:1
TE:2 E(tc2)
…
…
…
E(n1)
Proxy
…
…
…
Si
TE:2 E(tc2)
…
Sn
…
…
Cloud Servers
ICDE 2013 Tutorial
48
Encrypted Index Review
• Pros
– Can be directly deployed on existing cloud settings
– Provide stronger confidentiality than partition, orderpreserving encryption without losing query efficiency
• Cons
– The Cloud’s computational ability is under utilized
– Queries directly supported are limited to queries on
indexed key attributes
4/11/2013
ICDE 2013 Tutorial
49
Data Confidentiality
• 1. Encryption
– Homomorphic encryption
– Partition Index
– Order-preserving encryption
– Encrypted Index
• 2. Leveraging Trust
4/11/2013
ICDE 2013 Tutorial
50
Distribution instead of Encryption
• Under non-communicating servers
assumption [Aggarwal et al. CIDR’05]
Sensitive attributes Sensitive association
E(telephone), E(email)
name, salary
name
salary
Q1
Server 1
name, E(salary)
Q2
Query
Server 2
Result(Q1) join Result(Q2)
4/11/2013
ICDE 2013 Tutorial
51
Distribution Review
• Pros
– Reduce encryption and decryption overhead
• Cons
– Non-communicating servers assumption is strong*
– Data distribution policy is usually not up to a client,
but decided by cloud server providers
– * [Emekci et al. ICDE’06, Agrawal et al. SRDS’88,
Ciriani et al. ESORICS’09]
4/11/2013
ICDE 2013 Tutorial
52
Tamper Resistant Trusted Hardware
4/11/2013
ICDE 2013 Tutorial
53
Computation Cost Consideration
4/11/2013
ICDE 2013 Tutorial
54
Trusteddb [Bajaj et al. SIGMOD’11]
4/11/2013
ICDE 2013 Tutorial
55
Trusted Computing Review
• Pros
– Support almost all existing DBMS functionalities
• Cons
– Computing and memory resources are limited
• Cipherbase [Arasu et al. CIDR’13]: better optimization
based on trusted hardware
– Requires secret key handover from user to trusted
hardware
4/11/2013
ICDE 2013 Tutorial
56
Outline
• Database Security and Privacy
• Data Security and Privacy in the Cloud
• Data Confidentiality
• Access Privacy
• Open Research Challenges
4/11/2013
ICDE 2013 Tutorial
57
Access Privacy
• 1. Private Information Retrieval (PIR)
• 2. Oblivious RAM
• 3. Relaxing Privacy
4/11/2013
ICDE 2013 Tutorial
58
Private Information Retrieval
[Chor et al. JACM’98]
Client
q=“give me ith
record”
Server
encrypted(q)
database
Xi
encrypted-result=f(X, encrypted(q))
X1
X2
X= …
…
…
Xn
• Multi-servers information theoretic PIR
– Implemented based on XOR, polynomial interpolation
– Achieves 2-server communication complexity O(n1/3)
– Tolerate collusions of up to t < k servers
• Single-server PIR
– Require only computational indistinguishability
4/11/2013
ICDE 2013 Tutorial
59
cPIR Theoretical Background
• Quadratic Residue (QR)
• x is a quadratic residue (QR) mod N if ∃ 𝑦 ∈ 𝑍𝑁 𝑠. 𝑡. 𝑦 2 = 𝑥 𝑚𝑜𝑑 𝑁
– E.g. N=35, 11 is QR (92=11 mod 35)
– 3 is QNR (no y exists such that y2=3 mod 35)
– Essential properties:
• QR ×QR = QR
• QR ×QNR = QNR
• Let N =p1×p2, p1 and p2 are large primes of m/2 bits.
• Quadratic Residuosity Assumption (QRA)
– Determining if a number is a QR or a QNR is computationally
hard if p1 and p2 are not given.
4/11/2013
ICDE 2013 Tutorial
60
Single Database cPIR
[Kushilevitz et al. FOCS’97]
e
Computation cost: O(n)
4
16
Get M2,3
17
11
QNR
N=35
M2,3
z:
0
1
1
0
z4
27
1
1
1
0
z33
0
1
1
0
z2
27
0
1
1
1
z1
17
QNR={3,12,13,17,27,33}
g
QR={1,4,9,11,16,29}
z2=QNR => X10=1
Client
z2=QR => X10=0
Server
Adapted from Tan’s presentation
4/11/2013
ICDE 2013 Tutorial
61
Practicality of PIR
[Sion et al. NDSS’07, Olumofin et al. FC’11]
cPIR is more than one order of magnitude slower than trivial data transfer.
Multi-server PIR is more practical, but it requires servers cannot collude.
4/11/2013
ICDE 2013 Tutorial
62
Access Privacy
• 1. Private Information Retrieval (PIR)
• 2. Oblivious RAM
• 3. Relaxing Privacy
4/11/2013
ICDE 2013 Tutorial
64
Oblivious RAM Based PIR
[Goldreich & Ostrovsky JACM ‘96
Williams et al. NDSS’08]
• A step towards making PIR practical
• Oblivious RAM : achieve oblivious access in
server memory
• Organize data in pyramid like levels of buckets
• Ensure each access touches a bucket at every level
4/11/2013
ICDE 2013 Tutorial
65
4/11/2013
ICDE 2013 Tutorial
66
Oblivious RAM Review
• Pros
– Computation and communication complexities <=
O(log2n) per query
• Cons
– Client storage requires O( 𝑛)
4/11/2013
ICDE 2013 Tutorial
68
Access Privacy
• 1. Private Information Retrieval (PIR)
• 2. Oblivious RAM
• 3. Relaxing Privacy
4/11/2013
ICDE 2013 Tutorial
69
Bounding-Box PIR
[Wang et al. DBSEC’10]
e
y:
16
17
Get M2,3
QNR
N=35
M2,3
0
1
1
0
1
1
1
0
0
1
1
0
0
1
1
1
QNR={3,12,13,17,27,33}
QR={1,4,9,11,16,29}
Client
4/11/2013
z2=QNR => M2,3
ICDE 2013 Tutorial
z:
17
27
g
Bounding Box
Server
=1
70
Hybrid Approach with Homomorphic
Encryption [Wang et al. DAPD’13]
Rpub.B: [0,100)
S1:
BK1
BK2
S1
0
20
Q: [45, 65)
Q’: (E(0), E(1), E(1), E(1), E(0), E(0), E(0))
BK3
50
60
BK4
BK5
70
BK6 BK7
85
95 100
Kpub
V: (E(0)VBK1, E(1)VBK2, E(1)VBK3, E(1)VBK4, E(0)VBK5, E(0)VBK6, E(0)VBK7)
D(V[2]) = D(E(1)VBK2) = D(E(1))*VBK2 = VBK2
D(V[3]) = D(E(1)VBK3) = D(E(1))*VBK3 = VBK3
D(V[4]) = D(E(1)VBK4) = D(E(1))*VBK4 = VBK4
0.1. bucket summary S
0.2. public key Kpub
1. query vector Q’
2. answer vector V
Client
4/11/2013
3. decrypt V & filter
ICDE 2013 Tutorial
Server
71
Hybrid Approach with Homomorphic
Encryption [Wang et al. DAPD’13]
• Client selects subset of
buckets for server to
work on
Q’: (0, E(1), E(1), E(1), 0, E(0), E(0))
BHE
– Private query buckets
– Relevant frequently coaccessed sets of buckets
of other users
Server
Client
query
history
• Reasons for using
frequent bucket sets
Client
private distributed
frequent pattern mining
[TKDE04]
query
history
FBS
– Hide in crowd
– Less identifiable
HHE
Client
4/11/2013
BHE
ICDE 2013 Tutorial
Server
72
Access Pattern Privacy on Encrypted
Index [Vimercati et al. ICDCS’11]
• Not using any cryptographic protocols
• Cover searches
– Fake searches
• Cached searches
– Cache index nodes
• Index shuffling
– Exchange contents between index nodes
– Counteract node-data association attacks
4/11/2013
ICDE 2013 Tutorial
73
Index Shuffling
4/11/2013
ICDE 2013 Tutorial
74
Relaxing Privacy Review
• Pros
– More computationally efficient than PIR
• Cons
– (Incomplete) privacy tricky to define and quantify
4/11/2013
ICDE 2013 Tutorial
75
Outline
• Database Security and Privacy
• Data Security and Privacy in the Cloud
• Data Confidentiality
• Access Privacy
• Open Research Challenges
4/11/2013
ICDE 2013 Tutorial
76
Open Research Problems
• Homomorphic encryption for processing range/join
database queries on encrypted data
• Improve performance of querying encrypted data for
use in practical OLTP applications
– Pre-computation
– Parallel calculation
• End to end security in the cloud
– Need information flow control and auditing in addition to
cryptography or trusted computing based approaches
4/11/2013
ICDE 2013 Tutorial
77
Concluding Remarks
• Cloud security and privacy is not a completely
new problem. Some issues are amplified by the
cloud.
• Protecting data confidentiality and access privacy
• Maintaining practical functionality and
performance while achieving security and privacy
4/11/2013
ICDE 2013 Tutorial
78
References
•
•
•
•
•
•
•
•
•
[Bertino et al. TDSC’05] E. Bertino et al. Database security-concepts, approaches,
and challenges. In IEEE TDSC, 2(1), 2005.
[Samarati et al. TR’98] P. Samarati et al. Protecting privacy when disclosing
information: k-anonymity and its enforcement through generalization and
suppression. TR 1998.
[Machanavajjhala et al. ICDE’06] A. Machanavajjhala et al. l-diversity: privacy
beyond k-anonymity. In ICDE 2006.
[Li et al. ICDE’07] N. Li et al. t-closeness: privacy beyond k-anonymity and ldiversity. In ICDE 2007.
[Dwork ICALP’06] C. Dwork. Differential privacy. In ICALP(2) 2006.
[Verykios et al. SIGMOD’04] V. S. Verykios et al. State-of-the-art in privacy
preserving data mining. In SIGMOD 2004.
[Agrawal et al. SIGMOD’00] R. Agrawal et al. Privacy-preserving data mining. In
SIGMOD 2000.
[Clifton et al. KDD’02] C. Clifton et al. Tools for privacy preserving distributed data
mining. In KDD 2002.
[Anciaux et al. SIGMOD’07] N. Anciaux et al. GhostDB: querying visible and hidden
data without leaks. In SIGMOD 2007.
4/11/2013
ICDE 2013 Tutorial
79
References
•
•
•
•
•
•
•
•
[Chaudhuri et al. CIDR’11] S. Chaudhuri et al. Database access control & privacy: is
there a common ground? In CIDR 2011.
[Ristenpart et al. CCS’09] T. Ristenpart et al. Hey, you, get off of my cloud: exploring
information leakage in third-party compute clouds. In CCS 2009.
[Somorovsky et al. CCSW’11] J. Somorovsky et al. All your clouds are belong to us:
security analysis of cloud management interfaces. In CCSW 2011.
[Hacigümüs et al. ICDE’02] H. Hacigümüs et al. Providing database as a service. In
ICDE 2002.
[Song et al. S&P’00] D. Song et al. Practical techniques for searches on encrypted
data. In S&P 2000.
[Hacigümüs et al. SIGMOD’02] H. Hacigümüs et al. Executing SQL over encrypted
data in the database service provider mode. In SIGMOD 2002.
[Hore et al. VLDB’04] B. Hore et al. A privacy-preserving index for range queries. In
VLDB 2004.
[Agrawal et al. SIGMOD’04] R. Agrawal et al. Order preserving encryption for
numeric data. In SIGMOD 2004.
4/11/2013
ICDE 2013 Tutorial
80
References
•
•
•
•
•
•
•
•
•
[Popa et al. SOSP’11] R. A. Popa et al. Cryptdb: protecting confidentiality with
encrypted query processing. In SOSP 2011.
[Damiani et al. CCS’03] E. Damiani et al. Balancing confidentiality and efficiency in
untrusted relational DBMSs. In CCS 2003.
[Wang et al. SDM’11] S. Wang et al. A comprehensive framework for secure query
processing on relational data in the cloud. In SDM 2011.
[Aggarwal et al. CIDR’05] G. Aggarwal et al. Two can keep a secret: a distributed
architecture for secure database services. In CIDR 2005.
[Emekci et al. ICDE’06] F. Emekci et al. Privacy preserving query processing using
third parties. In ICDE 2006.
[Agrawal et al. SRDS’88] D. Agrawal et al. Quorum consensus algorithms for secure
and reliable data. In SRDS 1988.
[Bajaj et al. SIGMOD’11] S. Bajaj et al. Trusteddb: a trusted hardware based
database with privacy and data confidentiality. In SIGMOD 2011.
[Song et al. IEEE’12] D. Song et al. Cloud data protection for the masses. In IEEE
Computer, 45(1), 2012.
[Chor et al. JACM’98] B. Chor et al. Private information retrieval. In J. ACM, 45(6),
1998.
4/11/2013
ICDE 2013 Tutorial
81
References
• [Kushilevitz et al. FOCS’97] E. Kushilevitz et al. Replication is not needed:
single database, computationally private information retrieval. In FOCS
1997.
• [Sion et al. NDSS’07] R. Sion et al. On the computational practicality of
private information retrieval. In NDSS 2007.
• [Olumofin et al. FC’11] F. G. Olumofin et al. Revisiting the computational
practicality of private information retrieval. In FC 2011.
• [Williams et al. NDSS’08] P. Williams et al. Usable private information
retrieval. In NDSS 2008.
• [Wang et al. DBSEC’10] S. Wang et al. Generalizing PIR for practical private
retrieval of public data. In DBSec 2010.
• [Wang et al. DAPD’13] S. Wang et al. Towards practical private processing
of database queries over public data. In DAPD 2013.
• [Vimercati et al. ICDCS’11] S. D. C. Vimercati et al. Efficient and private
access to outsourced data. In ICDCS 2011.
4/11/2013
ICDE 2013 Tutorial
82