Lecture9 - The University of Texas at Dallas
Data and Applications Security
Developments and Directions
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Inference Problem - I
September 2012
Outline
History
Access Control and Inference
Inference problem in MLS/DBMS
Inference problem in emerging systems
Semantic data model applications
Confidentiality, Privacy and Trust
Directions
History
Statistical databases (1970s – present)
Inference problem in databases (early 1980s - present)
Inference problem in MLS/DBMS (late 1980s – present)
Unsolvability results (1990)
Logic for secure databases (1990)
Semantic data model applications (late 1980s - present)
Emerging applications (1990s – present)
Privacy (2000 – present)
Statistical Databases
The Census Bureau has focused for decades on statistical
inference and statistical databases
Collections of data such as sums and averages may be given out
but not the individual data elements
Techniques include
- Perturbation where results are modified
- Randomization where random samples are used to compute
summaries
Techniques are being used now for privacy preserving data mining
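The two techniques listed above can be sketched as follows. This is a minimal illustration, not a method taken from the lecture: the salary figures, the noise scale, and the function names `perturbed_sum` and `sampled_mean` are all assumptions made for the example.

```python
import random

def perturbed_sum(values, noise_scale, rng):
    """Perturbation: return the true sum plus zero-mean random noise."""
    return sum(values) + rng.uniform(-noise_scale, noise_scale)

def sampled_mean(values, sample_size, rng):
    """Randomization: compute the mean from a random sample, not the full data."""
    sample = rng.sample(values, sample_size)
    return sum(sample) / len(sample)

salaries = [28_000, 31_000, 45_000, 52_000, 39_000]  # hypothetical data
rng = random.Random(0)  # fixed seed only so the sketch is repeatable

noisy_total = perturbed_sum(salaries, noise_scale=1_000, rng=rng)
approx_mean = sampled_mean(salaries, sample_size=3, rng=rng)
```

In both cases the user sees only a summary, and the summary itself is inexact, which makes it harder to isolate an individual value by differencing several queries.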
Access Control and Inference
Access control in databases started with the work in System R and
Ingres Projects
- Access Control rules were defined for databases, relations,
tuples, attributes and elements
- SQL and QUEL languages were extended
GRANT and REVOKE Statements
Read access on EMP to User group A Where
EMP.Salary < 30K and EMP.Dept <> Security
- Query Modification:
Modify the query according to the access control rules
Retrieve all employee information where salary < 30K and
Dept is not Security
Query Modification Algorithm
Inputs: Query, Access Control Rules
Output: Modified Query
Algorithm:
- Given a query Q, examine all the access control rules relevant to
the query
- Introduce a Where Clause to the query that negates access to
the relevant attributes in the access control rules
Example: rules are John does not have access to Salary in
EMP and Budget in DEPT
Query is to join the EMP and DEPT relations on Dept #
Modify the query to Join EMP and DEPT on Dept # and
project on all attributes except Salary and Budget
- Output is the resulting query
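The example in the algorithm above can be sketched in a few lines. This is an illustrative sketch only: the `Query` structure, the `DENIED` rule store, and the attribute names `Name`, `Dname` and `DeptNo` are assumptions, not part of System R, Ingres, or the lecture.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Query:
    relations: tuple   # e.g. ("EMP", "DEPT")
    attributes: tuple  # (relation, attribute) pairs to project
    join_on: str       # name of the join attribute

# Hypothetical rule store: user -> (relation, attribute) pairs the user may not see.
DENIED = {"John": {("EMP", "Salary"), ("DEPT", "Budget")}}

def modify_query(user, query):
    """Drop every projected attribute that a rule denies to this user."""
    denied = DENIED.get(user, set())
    allowed = tuple(a for a in query.attributes if a not in denied)
    return Query(query.relations, allowed, query.join_on)

def to_sql(query):
    """Render the (modified) query as a SQL string."""
    cols = ", ".join(f"{rel}.{attr}" for rel, attr in query.attributes)
    left, right = query.relations
    return (f"SELECT {cols} FROM {left} JOIN {right} "
            f"ON {left}.{query.join_on} = {right}.{query.join_on}")

original = Query(
    relations=("EMP", "DEPT"),
    attributes=(("EMP", "Name"), ("EMP", "Salary"),
                ("DEPT", "Dname"), ("DEPT", "Budget")),
    join_on="DeptNo",
)
modified = modify_query("John", original)
sql = to_sql(modified)
```

The user's query is never rejected outright; it is rewritten so that the protected attributes simply do not appear in the result.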
Security Constraints / Access Control Rules
Simple Constraint: John cannot access the attribute Salary of
relation EMP
Content-based constraint: If relation MISS contains information
about missions in the Middle East, then John cannot access MISS
Association-based Constraint: Ship’s location and mission taken
together cannot be accessed by John; individually each attribute can
be accessed by John
Release constraint: After X is released Y cannot be accessed by
John
Aggregate Constraint: Ten or more tuples taken together cannot be
accessed by John
Dynamic Constraint: After the Mission, information about the
mission can be accessed by John
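Two of the constraint types above, association-based and aggregate, reduce to simple checks on a request. The sketch below assumes a request is described by the attributes it asks for and the rows it would return; the function names are hypothetical.

```python
def violates_association(requested_attrs, protected_association):
    """True if the request contains every attribute of a protected association."""
    return set(protected_association) <= set(requested_attrs)

def violates_aggregate(rows, limit):
    """True if the response would contain `limit` or more tuples."""
    return len(rows) >= limit

# Ship's location and mission together are protected; each alone is not.
together = violates_association(["Location", "Mission"], {"Location", "Mission"})
alone = violates_association(["Location"], {"Location", "Mission"})

# Ten or more tuples taken together are protected.
too_many = violates_aggregate(list(range(10)), limit=10)
few_enough = violates_aggregate(list(range(9)), limit=10)
```

Release and dynamic constraints need state (what has been released, what events have occurred), so they cannot be checked from the request alone.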
Security Constraints for Healthcare
Simple Constraint: Only doctors can access medical records
Content-based constraint: If the patient has AIDS then this
information is private
Association-based Constraint: Names and medical records taken
together are private
Release constraint: After medical records are released, names
cannot be released
Aggregate Constraint: The collection of patients is private,
individually public
Dynamic Constraint: After the patient dies, information about him
becomes public
Inference Problem in MLS/DBMS
Inference is the process of forming conclusions from premises
If the conclusions are unauthorized, it becomes a problem
Inference problem in a multilevel environment
Aggregation problem is a special case of the inference
problem - a collection of data elements is Secret but the
individual elements are Unclassified
Association problem: attributes A and B taken together are
Secret - individually they are Unclassified
Revisiting Security Constraints
Simple Constraint: Mission attribute of SHIP is Secret
Content-based constraint: If relation MISSION contains information
about missions in Europe, then MISSION is Secret
Association-based Constraint: Ship’s location and mission taken
together are Secret; individually each attribute is Unclassified
Release constraint: After X is released, Y is Secret
Aggregate Constraint: Ten or more tuples taken together are Secret
Dynamic Constraint: After the Mission, information about the
mission is Unclassified
Logical Constraint: A Implies B; therefore if B is Secret then A must
be at least Secret
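The logical constraint can be read as an ordering requirement: for every rule "A implies B", the level of A must be at least the level of B. A minimal sketch of enforcing this by repeated propagation follows; the two-level ordering and the function name `propagate` are assumptions for illustration.

```python
# Assumed ordering of the two levels used in these slides.
LEVELS = {"Unclassified": 0, "Secret": 1}

def propagate(levels, implications):
    """Raise levels until level(A) >= level(B) for every rule 'A implies B'."""
    result = dict(levels)
    changed = True
    while changed:
        changed = False
        for a, b in implications:
            if LEVELS[result[a]] < LEVELS[result[b]]:
                result[a] = result[b]  # A can reveal B, so A inherits B's level
                changed = True
    return result

initial = {"A": "Unclassified", "B": "Secret", "C": "Unclassified"}
rules = [("C", "A"), ("A", "B")]  # C implies A, A implies B
final = propagate(initial, rules)
```

Because C implies A and A implies B, raising A to Secret forces C to Secret as well; the loop repeats until no rule is violated.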
Enforcement of Security Constraints
[Figure: architecture for enforcing security constraints. Components shown:
User Interface Manager; Constraint Manager, holding the Security Constraints;
Query Processor (constraints during query and release operations);
Update Processor (constraints during update operation);
Database Design Tool (constraints during database design operation);
MLS/DBMS; MLS Database.]
Query Algorithms
Query is modified according to the constraints
Release database is examined as to what has been released
Query is processed and response assembled
Release database is examined to determine whether the response
should be released
Result is given to the user
Portions of the query processor are trusted
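The steps above can be sketched with a small in-memory "release database". This is an assumption-laden illustration: the class name, the representation of a release constraint as a pair (X, Y), and the sample data are all hypothetical.

```python
class QueryProcessor:
    def __init__(self, data, release_constraints):
        self.data = data                                # attribute -> value
        self.release_constraints = release_constraints  # (x, y): after x is released, y is withheld
        self.released = set()                           # the release database

    def query(self, attributes):
        # Modify the query using the constraints and what was already released.
        blocked = {y for x, y in self.release_constraints if x in self.released}
        allowed = [a for a in attributes if a not in blocked]
        # Process the query and assemble the response, re-checking each item
        # against what this same response is about to release.
        response = {}
        for a in allowed:
            newly_blocked = {y for x, y in self.release_constraints if x in response}
            if a not in newly_blocked:
                response[a] = self.data[a]
        # Record the release, then give the result to the user.
        self.released.update(response)
        return response

qp = QueryProcessor(
    data={"Name": "Ship-1", "Location": "Port-A", "Mission": "Patrol"},
    release_constraints=[("Location", "Mission")],
)
first = qp.query(["Name", "Location"])
second = qp.query(["Mission"])
```

The second query returns nothing because Location has already been released; the history of releases, not the query alone, determines the answer.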
Update Algorithms
Certain constraints are examined during update operation
Example: Content-based constraints
The security level of the data is computed
Data is entered at the appropriate level
Certain parts of the Update Processor are trusted
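A content-based constraint can be applied at update time by computing the level from the incoming data and storing the data at that level. The sketch below assumes tuple-level classification and a hypothetical `Region` attribute; the slides state the constraint at the level of the whole relation.

```python
def classify(relation, row, content_constraints, default="Unclassified"):
    """Compute the security level of a row from content-based constraints."""
    for constrained_relation, predicate, level in content_constraints:
        if constrained_relation == relation and predicate(row):
            return level
    return default

def insert(database, relation, row, content_constraints):
    """Enter the row in the partition that matches its computed level."""
    level = classify(relation, row, content_constraints)
    database.setdefault(level, []).append((relation, row))
    return level

# Hypothetical constraint: missions in Europe are Secret.
constraints = [("MISSION", lambda row: row.get("Region") == "Europe", "Secret")]

db = {}
level_a = insert(db, "MISSION", {"Id": 1, "Region": "Europe"}, constraints)
level_b = insert(db, "MISSION", {"Id": 2, "Region": "Asia"}, constraints)
```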
Database Design Algorithms
Certain constraints are examined during the database design time
- Example: Simple, Association and Logical Constraints
Schemas are assigned security levels
Database is partitioned accordingly
Example:
- If Ship's location and mission taken together are Secret, then
SHIP(S#, Sname) is Unclassified
LOC-MISS(S#, Location, Mission) is Secret
LOC(Location) is Unclassified
MISS(Mission) is Unclassified
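The level assignment in this example follows one rule: a schema is Secret if it contains every attribute of a Secret association, and Unclassified otherwise. A minimal sketch, with the function name `schema_level` as an assumption:

```python
def schema_level(attributes, secret_associations):
    """A schema is Secret if it contains all attributes of any Secret association."""
    attrs = set(attributes)
    if any(set(assoc) <= attrs for assoc in secret_associations):
        return "Secret"
    return "Unclassified"

secret_associations = [{"Location", "Mission"}]

schemas = {
    "SHIP": ["S#", "Sname"],
    "LOC-MISS": ["S#", "Location", "Mission"],
    "LOC": ["Location"],
    "MISS": ["Mission"],
}
levels = {name: schema_level(attrs, secret_associations)
          for name, attrs in schemas.items()}
```

Only LOC-MISS holds both attributes of the association, so it is the only schema placed in the Secret partition.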
Data Warehousing and Inference
Challenge: Controlling access to the Warehouse while still
enforcing the access control policies of the back-end
database systems
[Figure: Users query the Warehouse. The Data Warehouse holds data correlating
Employees with Travel patterns and Projects, and could be any DBMS, e.g.,
relational. Back-end sources: Oracle DBMS for Employees data, Sybase DBMS for
Projects data, Informix DBMS for Travel data.]
Data Mining as a Threat to Security
Data mining gives us “facts” that are not obvious to human analysts
of the data
Can general trends across individuals be determined without
revealing information about individuals?
Possible threats:
Combine collections of data and infer information that is private
Disease information from prescription data
Military action from pizza deliveries to the Pentagon
Need to protect the associations and correlations between the data
that are sensitive
Security Preserving Data Mining
Prevent useful results from mining
- Introduce “cover stories” to give “false” results
- Make only a sample of the data available so that the adversary is
unable to come up with useful rules and predictive functions
Randomization
- Introduce random values into the data or results; Challenge is to
introduce random values without significantly affecting the data
mining results
- Give range of values for results instead of exact values
Secure Multi-party Computation
- Each party knows its own inputs; encryption techniques used to
compute final results
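As an illustration of the multi-party idea, the sketch below computes a sum without any party revealing its own input. Note the substitution: it uses additive secret sharing rather than the encryption techniques the slide mentions, and it simulates all parties in one process. The names `make_shares`, `secure_sum` and `MODULUS` are assumptions.

```python
import random

MODULUS = 2**61 - 1  # all arithmetic is done modulo this value

def make_shares(secret, n_parties, rng):
    """Split `secret` into n random-looking shares that sum to it (mod MODULUS)."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(inputs, rng):
    n = len(inputs)
    # shares[i][j] is the share that party i sends to party j.
    shares = [make_shares(x, n, rng) for x in inputs]
    # Each party j adds the shares it received and publishes only that partial sum.
    partials = [sum(shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The partial sums add up to the total of all inputs.
    return sum(partials) % MODULUS

# random.Random is used only to keep the sketch repeatable; a real protocol
# would need a cryptographically secure source of randomness.
total = secure_sum([12, 30, 7], random.Random(1))
```

No single share or partial sum reveals an individual input, yet the published partial sums combine to the correct total.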
Inference problem for Multimedia Databases
Access Control for Text, Images, Audio and Video
Granularity of Protection
- Text
John has access to Chapters 1 and 2 but not to 3 and 4
- Images
John has access to portions of the image
Access control for pixels?
- Video and Audio
John has access to Frames 1000 to 2000
Jane has access only to scenes in the US
- Security constraints
Association based constraints
E.g., collections of images are classified
Inference Control for Semantic Web
According to Tim Berners-Lee, the Semantic Web supports
machine-readable and understandable web pages
Layers for the semantic web: Security cuts across all layers
Semantic web has reasoning capabilities
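[Figure: semantic web layers, listed top to bottom: Logic, Proof and Trust;
Rules/Query; RDF, Ontologies; XML, XML Schemas; URI, UNICODE. Security and
Privacy are drawn vertically alongside the layers; a box labeled Other Services
also appears.]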
Inference Control for Semantic Web - II
Semantic web has reasoning capabilities
Based on several logics including description logics
Inferencing is key to the operation of the semantic web
Need to build inference controllers that can handle different
types of inferencing capability
Example Security-Enhanced Semantic Web
[Figure: components shown: Interface to the Security-Enhanced Semantic Web;
Inference Engine/Inference Controller (technology to be developed by project);
Security Policies, Ontologies, Rules; Semantic Web Engine; XML, RDF Documents,
Web Pages, Databases.]
Security, Ontologies and XML
Access control for Ontologies
- Who can access which parts of the Ontologies
- E.g., Professor can access all patents of the department while the
Secretary can access only the descriptions of the patents
Ontologies for Security Applications
- Use ontologies for specifying security/privacy policies
- Integrating heterogeneous policies
Access control for XML (also RDF)
- Protecting entire documents, parts of documents, propagation of
access control privileges; Protecting DTDs vs Document instances;
Secure XML Schemas
Inference problem for XML documents
- Portions of documents taken together could be sensitive, individually
not sensitive
Semantic Model for Inference Control
[Figure: semantic net. Nodes: Patient John, John’s address, England, Cancer,
Influenza. Link labels: Has disease, address, Travels frequently. Dark
lines/boxes contain sensitive information.]
Use Reasoning Strategies developed for Semantic Models such as
Semantic Nets and Conceptual Graphs to reason about the applications
and detect potential inference violations
Directions
Inference problem is still being investigated
Census bureau still working on statistical databases
Need to find real world examples in the Military world
Inference problem with respect to medical records
Much of the focus is now on the Privacy problem
Privacy problem can be regarded as a special case of the
inference problem