Implementing P3P
Using Database Technology
Rakesh Agrawal
Jerry Kiernan
Ramakrishnan Srikant
Yirong Xu
IBM Almaden Research Center
The Context for This Work
Central theme of our current research
– How to design information systems that respect the privacy of
individual information while not impeding information flow
An important aspect
– Users should be able to express how they would like their
information to be treated
– Businesses should be able to state what they are going to do with
the information they collect
– Data exchange should only happen if the two are compatible
– P3P provides mechanisms for accomplishing this goal
Other aspects
– Mechanisms for enforcing that businesses act according to their
stated policies (“Hippocratic Databases”)
– Mechanisms for doing analytics at aggregate level while respecting
privacy of individual data (“Privacy Preserving Data Mining”)
Outline
Overview of P3P (Platform for Privacy
Preferences)
Architectures for implementing P3P
Client-Centric (prevailing)
Server-centric (our proposal)
Use of database technology for implementing
server-centric architecture
Performance
Conclusion and future work
What is P3P
Traditional privacy policies do not work
– by the lawyers, for the lawyers
New W3C recommendation (standard) since April 2002
A standard way to communicate privacy practices
– Privacy Policies
encode a web site’s data-collection and data-use practices in the
P3P policy language
– Privacy Preferences
specify user’s preferences in the APPEL language
– Matching
programmatically compare a preference against a policy
P3P Policy for Volga
<POLICY>
... ...
<STATEMENT>
<PURPOSE><current/><telemarketing/></PURPOSE>
<RECIPIENT><ours/><delivery/></RECIPIENT>
<RETENTION><indefinitely/></RETENTION>
<DATA-GROUP>
<DATA ref="#user.name"/>
<DATA ref="#user.home-info.telecom.telephone"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>
APPEL Preference for Jane
<appel:RULESET>
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/><contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
<appel:RULE behavior="request">
<appel:OTHERWISE/>
</appel:RULE>
</appel:RULESET>
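The matching step can be sketched in a few lines of Python. This is a simplified illustration, not the APPEL specification: it handles only the "and" and "or" connectives, ignores attribute matching, and uses a placeholder namespace URI (`urn:example:appel`) rather than the official one. The function names `matches` and `evaluate` are hypothetical.

```python
import xml.etree.ElementTree as ET

APPEL_NS = "urn:example:appel"  # placeholder, NOT the official APPEL namespace

POLICY_XML = """
<POLICY>
  <STATEMENT>
    <PURPOSE><current/><telemarketing/></PURPOSE>
    <RECIPIENT><ours/><delivery/></RECIPIENT>
    <RETENTION><indefinitely/></RETENTION>
  </STATEMENT>
</POLICY>
"""

RULESET_XML = f"""
<appel:RULESET xmlns:appel="{APPEL_NS}">
  <appel:RULE behavior="block">
    <POLICY>
      <STATEMENT>
        <PURPOSE appel:connective="or">
          <telemarketing/><contact/>
        </PURPOSE>
      </STATEMENT>
    </POLICY>
  </appel:RULE>
  <appel:RULE behavior="request">
    <appel:OTHERWISE/>
  </appel:RULE>
</appel:RULESET>
"""

def matches(expr, elem):
    """True if rule pattern `expr` matches policy element `elem` (simplified)."""
    if expr.tag != elem.tag:
        return False
    subs = list(expr)
    if not subs:
        return True
    connective = expr.get(f"{{{APPEL_NS}}}connective", "and")
    # Each sub-pattern must match some child of the policy element.
    found = [any(matches(s, child) for child in elem) for s in subs]
    return any(found) if connective == "or" else all(found)

def evaluate(ruleset, policy):
    """Return the behavior of the first rule that fires, or None."""
    for rule in ruleset:
        body = list(rule)
        if any(b.tag == f"{{{APPEL_NS}}}OTHERWISE" for b in body):
            return rule.get("behavior")
        if any(matches(b, policy) for b in body):
            return rule.get("behavior")
    return None

behavior = evaluate(ET.fromstring(RULESET_XML), ET.fromstring(POLICY_XML))
print(behavior)
```

Jane's first rule fires on Volga's policy because the PURPOSE element contains telemarketing, so the result is "block"; a policy with neither telemarketing nor contact falls through to the OTHERWISE rule.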
Current Implementations
Tools for creating policies
– IBM Tivoli Privacy Wizard
– P3PEdit
Tools for creating preferences
– JRC APPEL Preference Editor
Tools for matching preferences
– AT&T Privacy Bird
– Microsoft Internet Explorer 6.0
– JRC P3P Proxy
Policy-Preference Matching (Client-Centric)
Client-side matching with a specialized engine
1. Browser → Web Server: request policy
2. Web Server → Browser: send policy
3. Browser → APPEL Engine: policy and user preference
4. APPEL Engine → Browser: result of matching
5. Browser → Web Server: request web page if policy conforms to preference
Server-Centric Architecture
We propose a server-centric architecture for deploying
P3P:
– Server-side matching
– Reuse proven database technology
Store privacy policies in a database system
Query the database for matching preferences against
privacy policies
Policy-Preference Matching (Server-Centric)
1. Browser → Web Server: send preference and URI of a web page
2. Web Server → APPEL to Query Converter: preference and web page URI
3. APPEL to Query Converter → Database (policy metadata): query
4. Database returns query results
5. Web Server → Browser: send result of matching preference against policy
6. Browser → Web Server: request web page if policy conforms to preference
Alternative Architectures
Two orthogonal dimensions for implementing P3P
– What matching engine should be used?
– Where should the matching take place?
                     Client     Server
Specialized Engine   Current    ?
Database Engine      ?          Proposed
Discussion of Server-Centric
Solution
Advantages of server-side matching
– Support for thin, mobile clients
– Better support for new privacy-sensitive applications
– Extra information for policy refinement
– Easier upgrade of P3P specification
Advantages of using database
– No reinvention, reuse of proven technology
– Better management of policies
– Infrastructure for policy enforcement
Disadvantages
– Greater amount of trust in the server
Variations of the Server-Centric
Architecture
– Relational tables + SQL queries
– Relational tables + XML view + XQueries
– Native XML store + XQueries
Storing Policies in Database
Policy Creation Wizard → P3P policies → Shredder → SQL inserts → Database (policy metadata)
Storing Policies (cont.)
Policy (policy_id, name, …)
Statement (statement_id, policy_id, retention, consequence)
Purpose (statement_id, policy_id, purpose, required)
Recipient (statement_id, policy_id, recipient, required)
Datagroup (datagroup_id, statement_id, policy_id, base)
Data (data_id, datagroup_id, statement_id, policy_id, ref, …)
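A minimal sketch of the shredder, assuming the table layout above. SQLite stands in for DB2 here, the column types are guesses, and the helper names `children` and `shred` are hypothetical; the experiments in this talk used a different implementation.

```python
import sqlite3
import xml.etree.ElementTree as ET

SCHEMA = """
CREATE TABLE Policy    (policy_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Statement (statement_id INTEGER, policy_id INTEGER,
                        retention TEXT, consequence TEXT);
CREATE TABLE Purpose   (statement_id INTEGER, policy_id INTEGER,
                        purpose TEXT, required TEXT);
CREATE TABLE Recipient (statement_id INTEGER, policy_id INTEGER,
                        recipient TEXT, required TEXT);
CREATE TABLE Datagroup (datagroup_id INTEGER, statement_id INTEGER,
                        policy_id INTEGER, base TEXT);
CREATE TABLE Data      (data_id INTEGER, datagroup_id INTEGER,
                        statement_id INTEGER, policy_id INTEGER, ref TEXT);
"""

POLICY_XML = """
<POLICY>
  <STATEMENT>
    <PURPOSE><current/><telemarketing/></PURPOSE>
    <RECIPIENT><ours/><delivery/></RECIPIENT>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA ref="#user.name"/>
      <DATA ref="#user.home-info.telecom.telephone"/>
    </DATA-GROUP>
  </STATEMENT>
</POLICY>
"""

def children(parent, tag):
    """Direct children of `parent` with the given tag."""
    return [c for c in parent if c.tag == tag]

def shred(conn, policy_id, name, policy_xml):
    """Insert one P3P policy into the relational tables."""
    policy = ET.fromstring(policy_xml)
    conn.execute("INSERT INTO Policy VALUES (?, ?)", (policy_id, name))
    for sid, stmt in enumerate(children(policy, "STATEMENT"), start=1):
        retention = [v.tag for r in children(stmt, "RETENTION") for v in r]
        conn.execute("INSERT INTO Statement VALUES (?, ?, ?, NULL)",
                     (sid, policy_id, retention[0] if retention else None))
        # PURPOSE and RECIPIENT values become one row each.
        for table, tag in (("Purpose", "PURPOSE"), ("Recipient", "RECIPIENT")):
            for group in children(stmt, tag):
                for value in group:
                    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
                                 (sid, policy_id, value.tag, value.get("required")))
        for gid, group in enumerate(children(stmt, "DATA-GROUP"), start=1):
            conn.execute("INSERT INTO Datagroup VALUES (?, ?, ?, ?)",
                         (gid, sid, policy_id, group.get("base")))
            for did, data in enumerate(children(group, "DATA"), start=1):
                conn.execute("INSERT INTO Data VALUES (?, ?, ?, ?, ?)",
                             (did, gid, sid, policy_id, data.get("ref")))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
shred(conn, 1, "Volga", POLICY_XML)
purposes = [row[0] for row in
            conn.execute("SELECT purpose FROM Purpose ORDER BY purpose")]
print(purposes)
```

Shredding happens once per policy, so any per-policy preprocessing cost is paid at insert time rather than at every match.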
Converting APPEL into Queries
String main(Rule r) {
    // The rule's behavior becomes the (quoted) select list.
    String sql = "SELECT '" + r.behavior() + "'" +
                 " FROM " + applicablePolicy() +
                 " WHERE " + connect(r);
    return sql;
}
String connect(Expression e) {
    // Match the attributes of e.
    String sqlAttr = genAttr(e);
    // Match the subexpressions of e, joined by e's connective
    // (theta is either "or" or "and").
    String theta = " " + e.connective() + " ";
    List<String> sqlSub = new ArrayList<>();
    for (Expression se : e.subexpressions())
        sqlSub.add("EXISTS (" + path(se) + " AND " + connect(se) + ")");
    return sqlAttr + " AND (" + String.join(theta, sqlSub) + ")";
}
String path(Expression e) {
    // Link e's table to its parent's table through the foreign key.
    return "SELECT *" +
           " FROM " + e.name() +
           " WHERE " + e.foreignKey() + " = " +
           e.parent().primaryKey();
}
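The recursion can be tried out with a small runnable sketch. It is an assumption-laden simplification rather than the converter used in this work: it covers only POLICY, STATEMENT and PURPOSE, treats the children of PURPOSE as values of the `purpose` column, uses a placeholder namespace URI, and builds SQL by string concatenation (a real converter would need to validate or parameterize values). The names `TABLES`, `VALUE_COLUMNS`, `connect`, `exists` and `rule_to_sql` are hypothetical.

```python
import xml.etree.ElementTree as ET

APPEL_NS = "urn:example:appel"  # placeholder, NOT the official APPEL namespace

# Element name -> (table, join condition linking the table to its parent).
TABLES = {
    "STATEMENT": ("Statement", "Statement.policy_id = Policy.policy_id"),
    "PURPOSE": ("Purpose",
                "Purpose.statement_id = Statement.statement_id"
                " AND Purpose.policy_id = Statement.policy_id"),
}
# Elements whose children are stored as values of a column.
VALUE_COLUMNS = {"PURPOSE": "Purpose.purpose"}

def connect(expr):
    """SQL condition for the sub-expressions of `expr`, joined by its connective."""
    theta = " " + expr.get(f"{{{APPEL_NS}}}connective", "and").upper() + " "
    parts = []
    for sub in expr:
        if expr.tag in VALUE_COLUMNS:
            parts.append(f"{VALUE_COLUMNS[expr.tag]} = '{sub.tag}'")
        else:
            parts.append(exists(sub))
    return "(" + theta.join(parts) + ")" if parts else ""

def exists(expr):
    """EXISTS sub-query for one element, linked to its parent by foreign key."""
    table, join = TABLES[expr.tag]
    cond = connect(expr)
    where = join + (" AND " + cond if cond else "")
    return f"EXISTS (SELECT * FROM {table} WHERE {where})"

def rule_to_sql(rule):
    """The rule's behavior becomes the select list; its pattern, the predicate."""
    pattern = rule.find("POLICY")
    return f"SELECT '{rule.get('behavior')}' FROM Policy WHERE {connect(pattern)}"

RULE_XML = f"""
<appel:RULE xmlns:appel="{APPEL_NS}" behavior="block">
  <POLICY>
    <STATEMENT>
      <PURPOSE appel:connective="or">
        <telemarketing/><contact/>
      </PURPOSE>
    </STATEMENT>
  </POLICY>
</appel:RULE>
"""

sql = rule_to_sql(ET.fromstring(RULE_XML))
print(sql)
```

Joining the sub-conditions with the connective, instead of appending it after each one, avoids a trailing AND/OR in the generated query.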
Converting APPEL into SQL
APPEL
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/>
<contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
Recursive algorithm
– APPEL behavior → Select list
– APPEL elements → SQL predicates
– Link predicates by foreign keys
SQL
SELECT 'block'
FROM Policy
WHERE
  EXISTS (
    SELECT *
    FROM Statement
    WHERE Statement.policy_id = Policy.policy_id AND
      EXISTS (
        SELECT *
        FROM Purpose
        WHERE Purpose.statement_id = Statement.statement_id
          AND Purpose.policy_id = Statement.policy_id
          AND (Purpose.purpose = 'telemarketing'
               OR Purpose.purpose = 'contact')))
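To see the query at work, here is a small sketch that runs it against hand-inserted rows. SQLite stands in for DB2, only the three tables the query touches are created, the second policy ("Other") is invented for contrast, and the extra `policy_id = ?` filter is an assumption about how the applicable policy would be selected. Note the parentheses around the OR: without them, AND binds tighter and the foreign-key conditions would not apply to the 'contact' branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policy    (policy_id INTEGER, name TEXT);
CREATE TABLE Statement (statement_id INTEGER, policy_id INTEGER, retention TEXT);
CREATE TABLE Purpose   (statement_id INTEGER, policy_id INTEGER, purpose TEXT);
-- Policy 1 mirrors Volga's statement; policy 2 is a made-up contrast case.
INSERT INTO Policy    VALUES (1, 'Volga'), (2, 'Other');
INSERT INTO Statement VALUES (1, 1, 'indefinitely'), (1, 2, 'indefinitely');
INSERT INTO Purpose   VALUES (1, 1, 'current'), (1, 1, 'telemarketing'),
                             (1, 2, 'current');
""")

BLOCK_RULE_SQL = """
SELECT 'block'
FROM Policy
WHERE Policy.policy_id = ?
  AND EXISTS (
    SELECT *
    FROM Statement
    WHERE Statement.policy_id = Policy.policy_id
      AND EXISTS (
        SELECT *
        FROM Purpose
        WHERE Purpose.statement_id = Statement.statement_id
          AND Purpose.policy_id = Statement.policy_id
          AND (Purpose.purpose = 'telemarketing'
               OR Purpose.purpose = 'contact')))
"""

def match(policy_id):
    """Return 'block' if the rule fires for the given policy, else None."""
    row = conn.execute(BLOCK_RULE_SQL, (policy_id,)).fetchone()
    return row[0] if row else None

print(match(1), match(2))
```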
Converting APPEL into XQuery
APPEL
<appel:RULE behavior="block">
<POLICY>
<STATEMENT>
<PURPOSE appel:connective="or">
<telemarketing/>
<contact/>
</PURPOSE>
</STATEMENT>
</POLICY>
</appel:RULE>
XQuery
if (document("policy")/POLICY/STATEMENT/PURPOSE[telemarketing or contact])
then
return <block/>
Performance Experiments
Experiment Setup
– Windows NT 4.0 Server with dual 600 MHz processors and 512 MB
memory
– DB2 UDB 7.1
– Public domain APPEL engine (from JRC)
– XTable (aka XPERANTO) prototype for the XQuery alternative
Datasets
– 29 P3P policies from Fortune 1000 company web sites
– 5 APPEL preferences from JRC test suite

Policy       # Statements   Size (KB)
Average      2              4.4
Max          5              11.9
Min          1              1.6

Preference   # Rules        Size (KB)
Very High    10             3.1
High         7              2.8
Medium       4              2.1
Low          2              0.9
Very Low     1              0.3
Average      4.8            1.9
Experiment Results
          APPEL     SQL       SQL      SQL      XQuery
          Engine    Convert   Query    Total
Average   2.63      0.08      0.08     0.16     1.65
Max       9.08      0.14      0.24     0.34     5.00
Time for matching a preference against a policy (seconds)
Experiment Results
Preference   APPEL     SQL       SQL      SQL      XQuery
             Engine    Convert   Query    Total
Very High    2.65      0.09      0.08     0.17     2.63
High         2.68      0.10      0.14     0.24     2.33
Medium       2.66      0.13      0.14     0.27     -
Low          2.60      0.06      0.03     0.09     1.51
Very Low     2.54      0.04      < 0.01   0.05     0.31
Matching times for different preferences (seconds)
Latency of the SQL implementation is more than
acceptable for practical deployment
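As a quick worked check, the per-preference speedup of the SQL implementation over the APPEL engine can be computed directly from the matching times reported above (APPEL engine time divided by SQL total time). The variable names are illustrative only.

```python
# Matching times in seconds, copied from the results above:
# preference level -> (APPEL engine, SQL total)
timings = {
    "Very High": (2.65, 0.17),
    "High": (2.68, 0.24),
    "Medium": (2.66, 0.27),
    "Low": (2.60, 0.09),
    "Very Low": (2.54, 0.05),
}

speedups = {name: appel / sql for name, (appel, sql) in timings.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

The ratios range from roughly 10x (Medium) to roughly 50x (Very Low).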
Why APPEL is Slow
Significant cost for augmenting data elements
appearing in a policy with categories predefined in P3P
base schema
The APPEL engine incurs this cost on every
preference check
The SQL implementation incurs this cost only when
shredding policies into the database, so it is amortized
over a large number of matchings against different
preferences
Why XQuery is Slow
Significant cost for the XML view to convert
XQueries into SQL against the relational database
Untapped optimization opportunities
Summary
P3P is an important application area for database
systems
Server-centric architecture reuses database
technology for implementing P3P
Performance is adequate for practical
deployment of P3P
Future Work
Checking policies against preferences before accessing
web sites is only a small aspect of enabling web users
to gain control over their private information
P3P will not succeed unless it provides a mechanism for
enforcing that a site acts according to its stated policy
To this end, we are implementing the Hippocratic
Database architecture (VLDB-02)
– XPref: XPath-based privacy preference language (WWW-03)
– Order preserving encryption
– Access control through query analysis and rewriting
– Nibbling at open problems outlined in the Hippocratic Database
vision
Backup
Proxy model
Imagine a site that holds the policies of all
companies and checks user preferences against them
– Individual companies can also adopt our
technology
[Figure: proxy model. Labels: Browser, Preferences, Internet, Preference Matching, DB2, IBM policy, ATT policy, Ford policy, ……]
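Policy-Preference Matching (Client-Centric)
– Browser → Web Server: request reference file
– Web Server → Browser: send reference file
– Browser → Reference File Cache: reference file and URI of a web page
– Reference File Cache → Browser: URI of the applicable policy
– Browser → Web Server: request policy
– Web Server → Browser: send policy
– Browser → APPEL Engine: policy and user preference
– APPEL Engine → Browser: result of matching
– Browser → Web Server: request web page if policy conforms to preference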
Policy-Preference Matching (Server-Centric)
1. Browser → Web Server: send preference and URI of a web page
2. Web Server → APPEL to SQL Converter: preference and web page URI
3. APPEL to SQL Converter → Database (policy metadata): SQL query
4. Database returns query results
5. Web Server → Browser: send result of matching preference against policy
6. Browser → Web Server: request web page if policy conforms to preference