Implementing P3P Using Database Technology
Rakesh Agrawal Jerry Kiernan Ramakrishnan Srikant Yirong Xu
Presented by Yajie Zhu
03/24/2005
Outline
• Introduction
• Overview of P3P
• Current P3P implementations
• Server-centric implementation
• Algorithms
• Results of performance experiments
• Conclusion and future work
Introduction
• Platform for Privacy Preferences (P3P)
– web users gain control over their private
information
– web site owners can express their privacy
policies in a standard format
– a user can programmatically check a site’s policy
against her privacy preferences to decide whether to
release her data to the web site
• P3P became a W3C Recommendation on
April 16, 2002
Overview of P3P
• Privacy Policies:
– An XML format in which a web site can
encode its data-collection and data-use
practices
• Privacy Preferences:
– A machine-readable specification of a user’s
preferences that can be programmatically
compared against a privacy policy
Detailed information: http://www.w3c.org/TR/P3P/
P3P Policy Description
• P3P policies are described as a sequence of
STATEMENT elements.
– CONSEQUENCE: the purpose for collecting
information in human-readable text
– PURPOSE: purposes for which information is
collected.
12 predefined values.
Ex: <current/>,<individual-decision/>, <contact/>
– RECIPIENT: the users of the collected information
6 predefined values.
Ex: <ours/>, <same/>, <unrelated/>
Opt-in or opt-out values can be assigned to the required
attribute of PURPOSE and RECIPIENT elements
P3P Policy Description (Cont.)
– RETENTION: the duration for which the collected
information will be kept
5 predefined values
Ex: <stated-purpose/>, <business-practice/>,
<indefinitely/>
– DATA-GROUP and DATA: the list of individual data
items that are collected for stated purposes in the
statement.
predefined types of data items
DATA can contain related category information.
– CATEGORIES: provide hints to users as to the
intended uses of the data.
Ex: <physical/>, <online/>, <purchase/>
An Example Policy
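The policy shown on the original slide did not survive in this transcript. As a stand-in, the sketch below embeds a small hypothetical policy (not the one from the slide) and reads back the STATEMENT parts described above. Namespace declarations are left out for brevity, and the `summarize` helper is an illustrative name, not part of any P3P tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy for illustration only; it is NOT the policy from the slide.
# XML namespaces are omitted to keep the sketch short.
POLICY_XML = """
<POLICY name="sample">
  <STATEMENT>
    <CONSEQUENCE>We use your address to ship your order.</CONSEQUENCE>
    <PURPOSE><current/><contact required="opt-in"/></PURPOSE>
    <RECIPIENT><ours/></RECIPIENT>
    <RETENTION><stated-purpose/></RETENTION>
    <DATA-GROUP>
      <DATA ref="#user.home-info.postal">
        <CATEGORIES><physical/></CATEGORIES>
      </DATA>
    </DATA-GROUP>
  </STATEMENT>
</POLICY>
"""

def summarize(policy_xml):
    """Return one dict per STATEMENT listing the values it declares."""
    root = ET.fromstring(policy_xml)
    summaries = []
    for stmt in root.iter("STATEMENT"):
        summaries.append({
            "consequence": stmt.findtext("CONSEQUENCE"),
            # A missing `required` attribute is treated as "always".
            "purposes": [(p.tag, p.get("required", "always"))
                         for p in stmt.find("PURPOSE")],
            "recipients": [r.tag for r in stmt.find("RECIPIENT")],
            "retention": stmt.find("RETENTION")[0].tag,
            "data": [d.get("ref") for d in stmt.iter("DATA")],
        })
    return summaries

for statement in summarize(POLICY_XML):
    print(statement)
```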
Privacy Preferences
• Privacy preferences are expressed in
APPEL as a list of RULEs
– Rule behavior: specifies the action to be taken
if the rule fires.
Ex: request, block
– Rule body: provides the pattern that is
matched against a policy.
Privacy Preferences (Cont.)
• Connective attribute: defines the logical
operators of the language.
– And (default): all of the contained expressions can be
found in the policy
– Or : one or more of the contained expressions can be
found in the policy
– And-exact
– Or-exact
– Non-and (negated and)
– Non-or (negated or)
Every element in an APPEL rule has a connective
associated with it.
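One way to read the six connectives is as set comparisons between the names listed in an APPEL expression and the names present in the matching policy element. The sketch below assumes that flat, set-based simplification (real APPEL expressions nest and carry attributes); `connective_matches` is a hypothetical helper name.

```python
def connective_matches(connective, wanted, present):
    """Evaluate one APPEL connective in a simplified, flat model.

    wanted:  set of element names listed in the APPEL expression
    present: set of element names found in the policy element
    """
    found = wanted & present            # listed names that the policy contains
    only_listed = present <= wanted     # policy contains nothing beyond the rule
    if connective == "and":
        return found == wanted
    if connective == "or":
        return bool(found)
    if connective == "non-and":
        return found != wanted
    if connective == "non-or":
        return not found
    if connective == "and-exact":
        return found == wanted and only_listed
    if connective == "or-exact":
        return bool(found) and only_listed
    raise ValueError(f"unknown connective: {connective}")

# A policy PURPOSE containing <current/> and <contact/>, checked against a
# rule that lists contact and telemarketing.
policy_purposes = {"current", "contact"}
rule_purposes = {"contact", "telemarketing"}
for name in ("and", "or", "non-and", "non-or", "and-exact", "or-exact"):
    print(name, connective_matches(name, rule_purposes, policy_purposes))
```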
An Example APPEL Preference
The Reference File
• A site may have multiple privacy policies
for different web pages, which may offer various
services.
• A site’s reference file assigns individual
policies to subsets of the URIs.
• In the reference file, each policy has a set of
INCLUDE/EXCLUDE declarations of the URIs.

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
  <POLICY-REFERENCES>
    <EXPIRY max-age="172800"/>
    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/catalog/*</EXCLUDE>
      <EXCLUDE>/cgi-bin/*</EXCLUDE>
      <EXCLUDE>/servlet/*</EXCLUDE>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/catalog/*</INCLUDE>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#third">
      <INCLUDE>/cgi-bin/*</INCLUDE>
      <INCLUDE>/servlet/*</INCLUDE>
      <EXCLUDE>/servlet/unknown</EXCLUDE>
    </POLICY-REF>
  </POLICY-REFERENCES>
</META>
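The INCLUDE/EXCLUDE lookup on this slide can be sketched as a small resolver. The patterns below are copied from the slide's reference file; the sketch assumes that `*` matches any run of characters and that the first POLICY-REF whose INCLUDE matches and whose EXCLUDE does not is the one that applies. `policy_for` is a hypothetical helper name.

```python
import re

# (policy URI, INCLUDE patterns, EXCLUDE patterns), taken from the slide.
POLICY_REFS = [
    ("/P3P/Policies.xml#first", ["/*"],
     ["/catalog/*", "/cgi-bin/*", "/servlet/*"]),
    ("/P3P/Policies.xml#second", ["/catalog/*"], []),
    ("/P3P/Policies.xml#third", ["/cgi-bin/*", "/servlet/*"],
     ["/servlet/unknown"]),
]

def _matches(pattern, path):
    """True if path matches pattern, where '*' stands for any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None

def policy_for(path):
    """Return the policy URI that covers path, or None if no policy applies."""
    for about, includes, excludes in POLICY_REFS:
        included = any(_matches(p, path) for p in includes)
        excluded = any(_matches(p, path) for p in excludes)
        if included and not excluded:
            return about
    return None

for path in ("/index.html", "/catalog/shoes", "/cgi-bin/search", "/servlet/unknown"):
    print(path, "->", policy_for(path))
```

Under these assumptions `/servlet/unknown` is covered by no policy: the first policy excludes `/servlet/*`, and the third excludes that URI explicitly.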
Current P3P Implementation
• Client-Centric Architecture
– Web sites create and install policy files at their
sites.
• P3PEdit: a web-based privacy policy generator
• IBM Tivoli Privacy Wizard: a web-based GUI tool to
define privacy policies
– As users browse a web site, their preferences
are checked against the site’s policy before they
access the site.
Client-Centric Architecture Implementation
• IE6 implementation of Compact P3P policies
– IE6 allows a user to specify her privacy preference for
handling cookies
• AT&T Privacy Bird
– It accepts user-defined APPEL privacy preferences
– An APPEL engine compares a user’s APPEL
preferences with a web site’s P3P policy
• Other Tools
– JRC APPEL Preference Editor: a Java-based editor
for preparing APPEL preferences.
– JRC P3P Proxy: a centralized proxy service that
conducts P3P privacy policy checking on behalf of
subscribed users
Server-Centric Architecture
• A website deploys P3P, and installs its privacy
policies in a database system
• Database querying at the server is used for
matching a user’s preferences against privacy
policies
– Convert privacy policies into relational tables and
convert an APPEL preference into an SQL query for
matching.
– Store privacy policies in relational tables, define an
XML view over them, and use an XQuery derived
from an APPEL preference for matching.
– Store privacy policies in a native XML store and use
an XQuery derived from an APPEL preference for
matching.
Server-Centric Architecture (Cont.)
• Advantages
– Preference checking at the server leads to lean
clients (e.g., mobile devices)
– An upgrade to the P3P specification requires upgrading
only the servers, not every client
– As new privacy-sensitive applications emerge, they
can reuse the checking done at the server
– Site owners can refine their policies once they learn
that the policies conflict with users’ privacy
preferences
– Using databases for preference matching yields
additional advantages
• The privacy data tables can serve as metadata for ensuring
that policies are followed
• Proven database technology can be reused for checking
preferences against policies.
• Versions of policies can be better managed
Server-Centric Architecture (Cont.)
• Disadvantages
– The user must place more trust in the server
• The user has to trust the server
• The user has to trust the database software used
by the server
– In the client-centric architecture, a client that caches a
reference file can avoid some checks when a user visits
many pages governed by the same policy; the
server-centric architecture gives up this saving
Algorithms for Server-Centric Implementation
• Database schema for P3P policy
• Populate the tables with the data
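The schema diagram from the slide is not reproduced in this transcript. The sketch below uses a reduced, hypothetical schema (one table per element kind, keyed by policy and statement ids) to show what "populate the tables" could look like. Table names, column names, and the `shred` helper are assumptions, not the schema from the paper.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, reduced schema; the schema on the slide may differ.
SCHEMA = """
CREATE TABLE Policy    (policy_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Statement (policy_id INTEGER, statement_id INTEGER);
CREATE TABLE Purpose   (policy_id INTEGER, statement_id INTEGER,
                        purpose TEXT, required TEXT);
CREATE TABLE Recipient (policy_id INTEGER, statement_id INTEGER,
                        recipient TEXT, required TEXT);
"""

# Hypothetical policy, namespaces omitted for brevity.
POLICY_XML = """
<POLICY name="sample">
  <STATEMENT>
    <PURPOSE><current/><contact required="opt-in"/></PURPOSE>
    <RECIPIENT><ours/></RECIPIENT>
  </STATEMENT>
</POLICY>
"""

def shred(conn, policy_id, policy_xml):
    """Insert one row per policy, statement, purpose value and recipient value."""
    root = ET.fromstring(policy_xml)
    conn.execute("INSERT INTO Policy VALUES (?, ?)", (policy_id, root.get("name")))
    for sid, stmt in enumerate(root.iter("STATEMENT"), start=1):
        conn.execute("INSERT INTO Statement VALUES (?, ?)", (policy_id, sid))
        for table, parent in (("Purpose", "PURPOSE"), ("Recipient", "RECIPIENT")):
            for child in stmt.findall(f"{parent}/*"):
                conn.execute(
                    f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
                    (policy_id, sid, child.tag, child.get("required", "always")),
                )

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
shred(conn, 1, POLICY_XML)
purposes = conn.execute(
    "SELECT purpose, required FROM Purpose ORDER BY rowid").fetchall()
print(purposes)
```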
Algorithms for Server-Centric Implementation (Cont.)
• Translating APPEL Preferences into SQL Queries
– The main() mirrors the structure
of the APPEL rule.
– The match() generates the SQL
code for matching an APPEL
expression
• Select elements in the P3P
policy from the table
• Ensure that the elements
belong to their parent elements
• Match any attributes specified
in the APPEL expression
• Recursively match any sub
expressions with the
appropriate connective.
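The four steps of match() listed above can be sketched as a recursive generator of nested EXISTS subqueries. This is an illustration under stated assumptions, not the algorithm from the paper: it uses a hypothetical generic schema in which every table has `id` and `parent_id` columns, handles only the and, or, non-and and non-or connectives, and the `Expr` class is an invented stand-in for a parsed APPEL expression.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Expr:
    """Hypothetical parsed APPEL expression: table name, attributes, children."""
    table: str
    attrs: dict = field(default_factory=dict)
    connective: str = "and"
    children: list = field(default_factory=list)

# connective -> (SQL joiner, whether to negate the combined condition)
JOINERS = {"and": (" AND ", False), "or": (" OR ", False),
           "non-and": (" AND ", True), "non-or": (" OR ", True)}

def match(expr, parent, params):
    """Build an EXISTS condition for expr nested under table `parent`."""
    conds = [f"{expr.table}.parent_id = {parent}.id"]   # element belongs to parent
    for column, value in expr.attrs.items():            # attribute comparisons
        conds.append(f"{expr.table}.{column} = ?")
        params.append(value)
    if expr.children:                                    # recurse into subexpressions
        conds.append(combine(expr, params))
    return f"EXISTS (SELECT 1 FROM {expr.table} WHERE {' AND '.join(conds)})"

def combine(expr, params):
    """Join the children's conditions with the connective of expr."""
    joiner, negate = JOINERS[expr.connective]
    inner = joiner.join(match(c, expr.table, params) for c in expr.children)
    return f"NOT ({inner})" if negate else f"({inner})"

def main(behavior, body, policy_id):
    """Return (sql, params); the query yields `behavior` if the rule fires."""
    params = [behavior, policy_id]
    sql = "SELECT ? FROM Policy WHERE Policy.id = ?"
    if body.children:
        sql += " AND " + combine(body, params)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policy    (id INTEGER PRIMARY KEY);
CREATE TABLE Statement (id INTEGER PRIMARY KEY, parent_id INTEGER);
CREATE TABLE Purpose   (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO Policy VALUES (1), (2);
INSERT INTO Statement VALUES (10, 1), (20, 2);
INSERT INTO Purpose VALUES (100, 10, 'current'), (101, 10, 'telemarketing'),
                           (200, 20, 'current');
""")

# Rule body: fire if any statement lists telemarketing as a purpose.
rule_body = Expr("Policy", children=[
    Expr("Statement", children=[Expr("Purpose", attrs={"name": "telemarketing"})])])

def fire(policy_id):
    sql, params = main("block", rule_body, policy_id)
    return conn.execute(sql, params).fetchall()

print(fire(1), fire(2))
```

Values are passed as query parameters rather than spliced into the SQL text, so preference contents never become part of the statement itself.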
Optimizations
• Reduce the number of tables in order to reduce
the number of joins in the generated SQL
queries
– Store P3P subelements in their parent table, not in
separate tables.
– Store the value of RETENTION in STATEMENT table,
since each STATEMENT can have only one
RETENTION element.
– Store the value of CONSEQUENCE in a nullable
column in STATEMENT table.
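As a rough illustration of the optimization above, the sketch below folds RETENTION and CONSEQUENCE into a hypothetical STATEMENT table, so a retention check reads a single table and needs no join. Table and column names are assumptions, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Statement (
    policy_id    INTEGER,
    statement_id INTEGER,
    retention    TEXT NOT NULL,  -- each STATEMENT has exactly one RETENTION
    consequence  TEXT,           -- NULL when the STATEMENT has no CONSEQUENCE
    PRIMARY KEY (policy_id, statement_id)
);
INSERT INTO Statement VALUES (1, 1, 'stated-purpose', 'We ship your order.');
INSERT INTO Statement VALUES (1, 2, 'indefinitely', NULL);
INSERT INTO Statement VALUES (2, 1, 'business-practice', NULL);
""")

def keeps_data_indefinitely(policy_id):
    """True if any statement of the policy declares indefinite retention."""
    row = conn.execute(
        "SELECT 1 FROM Statement "
        "WHERE policy_id = ? AND retention = 'indefinitely' LIMIT 1",
        (policy_id,),
    ).fetchone()
    return row is not None

print(keeps_data_indefinitely(1), keeps_data_indefinitely(2))
```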
Translation Example
• Simplified First Rule
from Jane’s APPEL
preference
• SQL Translation
Algorithms for Server-Centric Implementation (Cont.)
• Translating APPEL Preferences into XQuery
– The main() generates an
XQuery if statement
• Return the rule behavior if
the condition expressed
by the rule is met by the
application policy
– The match() translates
the body of the rule
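The shape of the generated query can be sketched with a small string generator. This is a simplified illustration, not the paper's translation: it assumes expressions are given as hypothetical `(tag, connective, children)` tuples, handles only the and/or connectives, ignores XML namespaces and attributes, and uses `policy.xml` as a placeholder document name.

```python
def match(expr):
    """Translate (tag, connective, children) into a path step with predicates."""
    tag, connective, children = expr
    if not children:
        return tag
    joiner = " and " if connective == "and" else " or "
    return f"{tag}[{joiner.join(match(child) for child in children)}]"

def main(behavior, body, policy_doc="policy.xml"):
    """Wrap the translated body in an XQuery if expression.

    The query returns an element named after the rule behavior when the
    policy satisfies the rule body, and the empty sequence otherwise.
    """
    return f'if (doc("{policy_doc}")/{match(body)}) then <{behavior}/> else ()'

# Rule body: block if some statement lists contact or telemarketing as a purpose.
rule_body = ("POLICY", "and", [
    ("STATEMENT", "and", [
        ("PURPOSE", "or", [("contact", "and", []), ("telemarketing", "and", [])]),
    ]),
])
query = main("block", rule_body)
print(query)
```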
Performance Experiments
• Measure the time to match a P3P policy against
an APPEL preference
– Experimental Setup
• A native APPEL engine from the Joint Research
Center
• DB2 UDB 7.2 as a database engine
• For the XQuery translation of APPEL preferences,
the XTABLE prototype is used
– Data Set
• 29 P3P policies (size from 1.6 to 11.9 Kbytes)
• 5 APPEL preferences with 5 different levels of
sensitivity
Performance Results
Conclusion and Future work
• Contributions of the paper
– Identification of P3P as an important application area for database
systems.
– Investigation of alternative architectures for implementing P3P.
– Proposal for a server-centric architecture based on database
querying technology.
– Mapping of a P3P policy schema into a relational schema for storing
policy data.
– Algorithms for translating privacy preferences expressed in APPEL
into SQL as well as XQuery.
– Performance experiments showing that the proposed architecture
has adequate performance for it to be used in practical deployments
of P3P.
• Future work
– Explore the use of database query languages for directly expressing
and representing privacy preferences
– Identify the minimal subset of SQL and XQuery needed for that purpose
– Develop and implement database mechanisms for ensuring that the
privacy policies are indeed being followed
Questions and Discussions