Lecture9 - The University of Texas at Dallas

Data and Applications Security
Developments and Directions
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Lecture #11
Inference Problem - I
September 19, 2011
Outline
 History
 Access Control and Inference
 Inference problem in MLS/DBMS
 Inference problem in emerging systems
 Semantic data model applications
 Confidentiality, Privacy and Trust
 Directions
History
 Statistical databases (1970s – present)
 Inference problem in databases (early 1980s - present)
 Inference problem in MLS/DBMS (late 1980s – present)
 Unsolvability results (1990)
 Logic for secure databases (1990)
 Semantic data model applications (late 1980s - present)
 Emerging applications (1990s – present)
 Privacy (2000 – present)
Statistical Databases
 Census Bureau has been focusing for decades on statistical
inference and statistical databases
 Collections of data such as sums and averages may be given out
but not the individual data elements
 Techniques include
- Perturbation where results are modified
- Randomization where random samples are used to compute
summaries
 Techniques are being used now for privacy preserving data mining
Access Control and Inference
 Access control in databases started with the work in System R and
Ingres Projects
- Access Control rules were defined for databases, relations,
tuples, attributes and elements
- SQL and QUEL languages were extended

GRANT and REVOKE Statements

Read access on EMP to User group A Where
EMP.Salary < 30K and EMP.Dept <> Security
- Query Modification:

Modify the query according to the access control rules

Retrieve all employee information where salary < 30K and
Dept is not Security
Query Modification Algorithm
 Inputs: Query, Access Control Rules
 Output: Modified Query
 Algorithm:
- Given a query Q, examine all the access control rules relevant to
the query
- Introduce a Where Clause to the query that negates access to
the relevant attributes in the access control rules

Example: rules are John does not have access to Salary in
EMP and Budget in DEPT

Query is to join the EMP and DEPT relations on Dept #

Modify the query to Join EMP and DEPT on Dept # and
project on all attributes except Salary and Budget
- Output is the resulting query
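The algorithm above can be sketched in Python. This is a minimal illustration, not the System R or Ingres implementation: the `Rule` class, the `modify_query` function, the column names, and the generated SQL text are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical access control rule for one user or user group."""
    user: str
    denied_columns: set = field(default_factory=set)  # columns to project away
    row_filter: str = ""                              # predicate added to WHERE


def modify_query(user, columns, tables, condition, rules):
    """Return SQL text rewritten to respect every rule that applies to `user`."""
    relevant = [r for r in rules if r.user == user]
    denied = set().union(*(r.denied_columns for r in relevant))
    visible = [c for c in columns if c not in denied]
    if not visible:
        raise PermissionError("no requested column is accessible")
    predicates = [condition] + [r.row_filter for r in relevant if r.row_filter]
    return (f"SELECT {', '.join(visible)} FROM {', '.join(tables)} "
            f"WHERE {' AND '.join(predicates)}")


rules = [
    # John has no access to Salary in EMP or Budget in DEPT.
    Rule("John", denied_columns={"EMP.Salary", "DEPT.Budget"}),
    # Group A may read EMP only where Salary < 30K and Dept is not Security.
    Rule("A", row_filter="EMP.Salary < 30000 AND EMP.Dept <> 'Security'"),
]

columns = ["EMP.Name", "EMP.Salary", "EMP.DeptNo", "DEPT.Dname", "DEPT.Budget"]
join_query = modify_query("John", columns, ["EMP", "DEPT"],
                          "EMP.DeptNo = DEPT.DeptNo", rules)
group_query = modify_query("A", ["EMP.Name", "EMP.Salary"], ["EMP"],
                           "1 = 1", rules)
print(join_query)
print(group_query)
```

For John the join is kept but Salary and Budget are projected away; for group A the content-based predicate is appended to the Where clause.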
Security Constraints / Access Control Rules
 Simple Constraint: John cannot access the attribute Salary of
relation EMP
 Content-based constraint: If relation MISS contains information
about missions in the Middle East, then John cannot access MISS
 Association-based Constraint: Ship’s location and mission taken
together cannot be accessed by John; individually each attribute can
be accessed by John
 Release constraint: After X is released Y cannot be accessed by
John
 Aggregate Constraint: Ten or more tuples taken together cannot be
accessed by John
 Dynamic Constraint: After the Mission, information about the
mission can be accessed by John
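Association-based and release constraints depend on what a user has already received, so enforcing them needs some record of past releases. The sketch below assumes a hypothetical `ReleaseMonitor` class that works at attribute granularity and applies the same constraints to every user; both are simplifications chosen for illustration.

```python
class ReleaseMonitor:
    """Toy release history: remembers which attributes each user has seen."""

    def __init__(self, association_sets, release_rules):
        # association_sets: attribute groups that must never be seen together
        # release_rules: {X: Y}, meaning Y is blocked once X has been released
        self.association_sets = [frozenset(s) for s in association_sets]
        self.release_rules = dict(release_rules)
        self.released = {}  # user -> set of attributes already released

    def request(self, user, attrs):
        """Return True and record the release if no constraint is violated."""
        seen = self.released.setdefault(user, set())
        combined = seen | set(attrs)
        # Association-based constraint: the full group may not be assembled.
        if any(group <= combined for group in self.association_sets):
            return False
        # Release constraint: after X is released, Y cannot be accessed.
        for x, y in self.release_rules.items():
            if x in combined and y in attrs:
                return False
        seen.update(attrs)
        return True


monitor = ReleaseMonitor([{"Location", "Mission"}], {"X": "Y"})
answers = [
    monitor.request("John", ["Location"]),  # allowed on its own
    monitor.request("John", ["Mission"]),   # denied: completes the association
    monitor.request("John", ["X"]),         # allowed
    monitor.request("John", ["Y"]),         # denied: X was already released
]
print(answers)
```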
Security Constraints for Healthcare
 Simple Constraint: Only doctors can access medical records
 Content-based constraint: If the patient has Aids then this
information is private
 Association-based Constraint: Names and medical records taken
together are private
 Release constraint: After medical records are released, names
cannot be released
 Aggregate Constraint: The collection of patients is private,
individually public
 Dynamic Constraint: After the patient dies, information about him
becomes public
Inference Problem in MLS/DBMS
 Inference is the process of forming conclusions from premises
 If the conclusions are unauthorized, it becomes a problem
 Inference problem in a multilevel environment
 Aggregation problem is a special case of the inference
problem: a collection of data elements is Secret but the
individual elements are Unclassified
 Association problem: attributes A and B taken together is
Secret - individually they are Unclassified
Revisiting Security Constraints
 Simple Constraint: Mission attribute of SHIP is Secret
 Content-based constraint: If relation MISSION contains information
about missions in Europe, then MISSION is Secret
 Association-based Constraint: Ship’s location and mission taken
together are Secret; individually each attribute is Unclassified
 Release constraint: After X is released Y is Secret
 Aggregate Constraint: Ten or more tuples taken together is Secret
 Dynamic Constraint: After the Mission, information about the
mission is Unclassified
 Logical Constraint: A Implies B; therefore if B is Secret then A must
be at least Secret
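One way to read these constraints is as rules that assign a security level to a query result. The following sketch assumes only two levels and hypothetical helper names (`propagate_logical`, `classify_result`); content-based, release and dynamic constraints are left out to keep it short.

```python
LEVELS = {"Unclassified": 0, "Secret": 1}


def propagate_logical(attr_levels, implications):
    """If A implies B, raise A to at least the level of B (to a fixed point)."""
    levels = dict(attr_levels)
    changed = True
    while changed:
        changed = False
        for a, b in implications:
            level_a = LEVELS[levels.get(a, "Unclassified")]
            level_b = LEVELS[levels.get(b, "Unclassified")]
            if level_a < level_b:
                levels[a] = levels[b]
                changed = True
    return levels


def classify_result(attrs, row_count, attr_levels, associations, aggregate_limit):
    """Level of a result under simple, association and aggregate constraints."""
    # Simple constraints: the result is at least as high as its highest attribute.
    level = max((LEVELS[attr_levels.get(a, "Unclassified")] for a in attrs),
                default=0)
    # Association-based constraint: a complete sensitive group is Secret.
    if any(set(group) <= set(attrs) for group in associations):
        level = LEVELS["Secret"]
    # Aggregate constraint: too many tuples together are Secret.
    if row_count >= aggregate_limit:
        level = LEVELS["Secret"]
    return next(name for name, rank in LEVELS.items() if rank == level)


logical = propagate_logical({"B": "Secret"}, [("A", "B")])
assoc = [("Location", "Mission")]
single = classify_result(["Location"], 3, {}, assoc, aggregate_limit=10)
paired = classify_result(["Location", "Mission"], 3, {}, assoc, aggregate_limit=10)
bulk = classify_result(["Sname"], 12, {}, assoc, aggregate_limit=10)
print(logical, single, paired, bulk)
```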
Enforcement of Security Constraints
[Figure: constraint enforcement architecture. Labeled components:
User Interface Manager; Security Constraints; Constraint Manager;
Query Processor (constraints during query and release operations);
Update Processor (constraints during update operation);
Database Design Tool (constraints during database design operation);
MLS/DBMS; MLS Database]
Query Algorithms
 Query is modified according to the constraints
 Release database is examined as to what has been released
 Query is processed and response assembled
 Release database is examined to determine whether the response
should be released
 Result is given to the user
 Portions of the query processor are trusted
Update Algorithms
 Certain constraints are examined during update operation
 Example: Content-based constraints
 The security level of the data is computed
 Data is entered at the appropriate level
 Certain parts of the Update Processor are trusted
Database Design Algorithms
 Certain constraints are examined during the database design time
- Example: Simple, Association and Logical Constraints
 Schema are assigned security levels
 Database is partitioned accordingly
 Example:
- If Ship’s location and mission taken together are Secret, then
SHIP (S#, Sname) is Unclassified,
LOC-MISS(S#, Location, Mission) is Secret,
LOC(Location) is Unclassified,
MISS(Mission) is Unclassified
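The partitioning in this example can be reproduced with a short Python sketch. The function name `partition_schema`, the `short_names` mapping used to label fragments, and the tuple layout of the result are assumptions for illustration; an attribute that appears in more than one Secret group is not handled.

```python
def partition_schema(relation, key, attributes, secret_groups, short_names):
    """Split a relation so each Secret attribute group gets its own fragment."""
    grouped = set().union(*map(set, secret_groups))
    # Attributes outside every sensitive group stay with the key, Unclassified.
    plain = [a for a in attributes if a != key and a not in grouped]
    fragments = [(relation, [key] + plain, "Unclassified")]
    for group in secret_groups:
        # The association itself is stored with the key at the Secret level.
        name = "-".join(short_names[a] for a in group)
        fragments.append((name, [key] + list(group), "Secret"))
        # Each attribute's values alone remain Unclassified.
        for a in group:
            fragments.append((short_names[a], [a], "Unclassified"))
    return fragments


fragments = partition_schema(
    "SHIP", "S#", ["S#", "Sname", "Location", "Mission"],
    secret_groups=[("Location", "Mission")],
    short_names={"Location": "LOC", "Mission": "MISS"},
)
for name, attrs, level in fragments:
    print(f"{name}({', '.join(attrs)}) is {level}")
```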
Data Warehousing and Inference
Challenge: Controlling access to the Warehouse and at the same time
enforcing the access control policies enforced by the back-end
Database systems
[Figure: Users query the Warehouse. The Data Warehouse holds data
correlating employees with travel patterns and projects. Back-end
systems (could be any DBMS, e.g., relational): Oracle DBMS for
employees data, Sybase DBMS for projects data, Informix DBMS for
travel data]
Data Mining as a Threat to Security
 Data mining gives us “facts” that are not obvious to human analysts
of the data
 Can general trends across individuals be determined without
revealing information about individuals?
 Possible threats:
- Combine collections of data and infer information that is private
 Disease information from prescription data
 Military action from pizza delivery to the Pentagon
 Need to protect the associations and correlations between the data
that are sensitive
Security Preserving Data Mining
 Prevent useful results from mining
- Introduce “cover stories” to give “false” results
- Only make a sample of data available so that the adversary is
unable to come up with useful rules and predictive functions
 Randomization
- Introduce random values into the data or results; Challenge is to
introduce random values without significantly affecting the data
mining results
- Give range of values for results instead of exact values
 Secure Multi-party Computation
- Each party knows its own inputs; encryption techniques used to
compute final results
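As a concrete case of multi-party computation, the sketch below computes a sum without any party revealing its own input. Note the substitution: it uses additive secret sharing rather than encryption, and it simulates all parties inside one process. The names `make_shares` and `secure_sum` and the choice of modulus are assumptions for illustration.

```python
import secrets

# Assumed public modulus; it must exceed any possible total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split `value` into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs):
    """Simulate a secure sum over non-negative integer inputs."""
    n = len(private_inputs)
    # Party i splits its input and sends the j-th share to party j.
    distributed = [make_shares(v, n) for v in private_inputs]
    # Each party adds the shares it received and publishes only that subtotal.
    subtotals = [sum(distributed[i][j] for i in range(n)) % MODULUS
                 for j in range(n)]
    return sum(subtotals) % MODULUS


total = secure_sum([120, 45, 300])
print(total)
```

Each individual share is a uniformly random number, so a single share or subtotal says nothing about one party's input, yet the subtotals still add up to the true sum.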
Inference problem for Multimedia Databases
 Access Control for Text, Images, Audio and Video
 Granularity of Protection
- Text

John has access to Chapters 1 and 2 but not to 3 and 4
- Images

John has access to portions of the image

Access control for pixels?
- Video and Audio

John has access to Frames 1000 to 2000

Jane has access only to scenes in US
- Security constraints

Association based constraints
E.g., collections of images are classified
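Frame-level granularity for video can be illustrated with a very small check. The `allowed_frames` function and the `grants` table are hypothetical names; only John's range of frames 1000 to 2000 comes from the example above.

```python
def allowed_frames(requested, granted_ranges):
    """Return the requested frame numbers that fall inside a granted range."""
    return [f for f in requested
            if any(lo <= f <= hi for lo, hi in granted_ranges)]


grants = {"John": [(1000, 2000)]}  # inclusive frame ranges per user
visible = allowed_frames(range(998, 1003), grants["John"])
print(visible)
```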
Inference Control for Semantic Web
 According to Tim Berners-Lee, the Semantic Web supports
- Machine readable and understandable web pages
 Layers for the semantic web: Security cuts across all layers
 Semantic web has reasoning capabilities
[Figure: semantic web layers, listed top to bottom: Logic, Proof and
Trust; Rules/Query; RDF, Ontologies; XML, XML Schemas; URI, UNICODE.
Security and Privacy are drawn vertically across all layers, next to
Other Services]
Inference Control for Semantic Web - II
 Semantic web has reasoning capabilities
 Based on several logics including description logics
 Inferencing is key to the operation of the semantic web
 Need to build inference controllers that can handle different
types of inferencing capability
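A very small inference controller can be sketched as forward chaining followed by a sensitivity check. This is only an illustration under strong assumptions: facts are ground triples, rules are simple if-then pairs rather than description logic axioms, and the names `forward_chain` and `safe_to_release` as well as the example facts are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until nothing new can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


def safe_to_release(released, candidate, rules, sensitive):
    """Allow `candidate` only if it does not make a sensitive fact derivable."""
    derivable = forward_chain(set(released) | {candidate}, rules)
    return not (derivable & set(sensitive))


# Hypothetical rule: a prescription for DrugX reveals DiseaseY.
rules = [([("John", "takes", "DrugX")], ("John", "hasDisease", "DiseaseY"))]
sensitive = {("John", "hasDisease", "DiseaseY")}

address_ok = safe_to_release(set(), ("John", "livesIn", "England"), rules, sensitive)
drug_ok = safe_to_release(set(), ("John", "takes", "DrugX"), rules, sensitive)
print(address_ok, drug_ok)
```

The prescription fact is not sensitive by itself, but the controller withholds it because releasing it would let a user infer the disease.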
Example Security-Enhanced Semantic Web
[Figure: Interface to the Security-Enhanced Semantic Web; Inference
Engine/Inference Controller (technology to be developed by project);
Security Policies, Ontologies, Rules; Semantic Web Engine; XML, RDF
Documents, Web Pages, Databases]
Security, Ontologies and XML
 Access control for Ontologies
- Who can access which parts of the ontologies
- E.g., a Professor can access all patents of the department while the
Secretary can access only the descriptions of the patents
 Ontologies for Security Applications
- Use ontologies for specifying security/privacy policies
- Integrating heterogeneous policies
 Access control for XML (also RDF)
- Protecting entire documents, parts of documents, propagation of
access control privileges; protecting DTDs vs. document instances;
secure XML Schemas
 Inference problem for XML documents
- Portions of documents taken together could be sensitive, individually
not sensitive
Semantic Model for Inference Control
[Figure: semantic net for Patient John. Node and link labels: Patient
John; Has disease; Cancer; Influenza; address; John’s address; England;
Travels frequently. Dark lines/boxes contain sensitive information]
Use Reasoning Strategies developed for Semantic Models such as
Semantic Nets and Conceptual Graphs to reason about the applications
and detect potential inference violations
Directions
 Inference problem is still being investigated
 Census Bureau still working on statistical databases
 Need to find real world examples in the Military world
 Inference problem with respect to medical records
 Much of the focus is now on the Privacy problem
 Privacy problem can be regarded as a special case of the
inference problem