queries and results - The Stanford University InfoLab

Gio CS Forum Oct01-1
TIHI: Protecting Information when
Access is Granted for Collaboration
Gio Wiederhold
1. Stanford University CSD (mostly)
www-db.stanford.edu/people/gio.html
2. Symmetric Security Technologies
www.2ST.com
Gio CS Forum Oct01-2
Information for Collaboration
Medical Records Insurance Company
Medical Records Medical Researchers
Manufacturer’s Specs Subcontractor
Business Vendor Content Customer
Operational Data Logistics Provider
Intelligence Data Front-line soldier
Strategic Data Allied Forces
Gio CS Forum Oct01-3
Access Patterns versus Data:
[Diagram labels: Patient, Physician, Clinics, Laboratory, Laboratory staff, Pharmacy, Inpatient, Ward staff, Billing, Accounting, Insurance Carriers, Accreditation, CDC, etc.]
Gio Wiederhold TIHI Oct96 3
Gio CS Forum Oct01-4
Primitive and Safe: Isolation
• No communication among disjoint systems
• All sharing of information by data re-entry
Discretionary security
airgaps
Mandatory
security
Gio CS Forum Oct01-5
Automation of Sharing
• Multi-level secure (MLS) system
– Involves OS and DBMS
– Programmed read up – write down permitted
– Complex – hard and lengthy (1y+) to validate
Gio CS Forum Oct01-6
MLS problem: inconsistency
• Information at each level is incomplete
– Make up cover stories ?
• Ok for enemies
• Not acceptable for our own staff/soldiers
Secret
|
Secret
Gio CS Forum Oct01-7
Multi-computer system approach
• Uses more computers – are cheap now
• Secure communication
– typically manually monitored
• Avoids complexity, lags of MLS systems
– Validation in communication portals
Gio CS Forum Oct01-8
Security and Cryptography
• Encryption is essential
– Hides information from enemies
– Isolates layers from each other
– Allows shared use of communication paths
• Encryption is not the solution, only a tool
– Isolated data do not provide information
– Software processes clear data
– Software is too large, dynamic to validate timely
– 95% of failures are people failures
No obvious solution: new thinking needed
Gio CS Forum Oct01-9
False Assumption
Data in the files of an enterprise
are organized according
to external access rights
Inefficient and risky for
an enterprise
which uses information
mainly internally
and then
must serve external needs
Gio CS Forum Oct0110
The Gap: Assumption that
Access right = Retrievable data
• Access rights assume a certain partitioning of data
• Enterprise data are partitioned for internal needs
• Partitions only match in simple cases/artificial examples
[Diagram labels: customer, query, result, firewall, authentication, database access & authorization agent]
Data sources are rarely perfectly matched to all access rights
Gio CS Forum Oct0111
Technical Access Problems:
Military
More direct connectivity creates risks
`disintermediation’
Queries cannot specify the object precisely
`Causes for low unit readiness?’
(helpful database gets extra stuff)
Objects (N) are not organized according to all
possible access classifications (a) = (Na)
`Problems with ship propulsion, but not propellers’
Some objects cover multiple classes
`Units in Persian Gulf?’
Some objects are misfiled (happens easily to others),
costly/impossible to guarantee avoidance
Intel data in operational mission file
Gio CS Forum Oct0112
Technical Access Problems:
Health Care
Queries do not specify the object precisely
Relevant history for low-weight births
(helpful database gets extra stuff)
Objects (N) are not organized according to all
possible access classifications (a) = (Na)
Nursing hierarchy by bed and ward
Infectious disease hierarchy by risk
Some objects cover multiple classes
Patient with stroke and HIV
Some objects are misfiled (happens easily to others),
costly/impossible to guarantee avoidance
Psychiatric data in patient with alcoholism
Gio CS Forum Oct0113
Access Rights/Needs Overlap
[Diagram of overlapping access rights and needs. Labels: NCA, COTS, Logistics, Intel, JC, Warfighters, Allies, PR]
Gio CS Forum Oct0114
Security Objective in Collaboration?
Prevent Inappropriate Disclosure of Information!
differs from preventing access to computers and
information, as is needed to protect from invaders
and hackers
ACCESS CONTROL is based on Metadata
Descriptions and labels, set a priori, are checked
RELEASE CONTROL also sees contents
Works also when metadata cannot / does not
adequately describe content information
Gio CS Forum Oct0115
Dominant approach for Data
• Authenticate Customer in Firewall
• Validate query against database schema
• If both O.K., process query and ship results
[Diagram labels: customer, query, result, firewall, authentication, database access & authorization agent, sources]
Gio CS Forum Oct0116
Today: Many Coalitions
Foreign: NATO, +, British, French, Kosovo IFOR,
...
• Each has its own, intersecting requirement
• Discretionary access at lower levels
– Policies for dozens of countries
controlling release of Data and Metadata
• Many duplicated systems
– High rate of information transfer among them
– Excessive load creates high error rates
– Difficult to protect from hackers and enemies
Gio CS Forum Oct0117
Changing Security Protection
Yesterday
Internal Focus
Access is granted to
employees only
Today
External Focus
Payors, suppliers, customers and trusted
prospects all need some form of access
Centralized assets
Applications and data are
centralized in fortified IT bunkers
Distributed assets
Applications and data are
distributed across servers,
locations, and business units
Prevent losses
The goal of security is to protect
against confidentiality breaches
Generate revenue
The goal of security is to enable
e-Commerce & collaboration
IT control
DB/Network manager decides
who gets access
Local control
Functional units need the
authority to grant access
Gio CS Forum Oct0118
Access right = Retrievable data
• Access rights assume a certain partitioning of data
• Domain data are partitioned according to internal needs
• They only match in simple cases / artificial examples
[Diagram labels: customer, query, result, firewall, authentication, database access & authorization agent]
Data sources are rarely perfectly matched to all access rights
Gio CS Forum Oct0119
Symmetric Solution
Symmetric checking both access to data and
the subsequent release of data
• Access Control with authentication and
authorization of collaborators upon entry
• Content-based release filtering of data
when exiting the secure perimeter
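The two-sided check described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the role table, the blocked-term list, and every function name are hypothetical and are not taken from TIHI. It shows one check when the query enters and a second, content-based check before the result leaves.

```python
# Hypothetical illustration: roles, tables, and terms below are assumptions.
ROLE_TABLES = {"researcher": {"labs", "diagnoses"}, "insurer": {"billing"}}
BLOCKED_TERMS = {"hiv", "psychiatric"}


def access_allowed(role, table):
    """Entry check: is this role authorized to query this table at all?"""
    return table in ROLE_TABLES.get(role, set())


def release_allowed(rows):
    """Exit check: inspect the actual content of every result row."""
    for row in rows:
        words = {w.strip(".,;:").lower() for w in row.split()}
        if words & BLOCKED_TERMS:
            return False
    return True


def mediate(role, table, fetch):
    """Apply both checks; return rows only if both pass, else None."""
    if not access_allowed(role, table):
        return None
    rows = fetch(table)
    return rows if release_allowed(rows) else None


def fake_fetch(table):
    # Stand-in for a database; one row is misfiled into 'labs'.
    data = {"labs": ["glucose 5.4 normal", "note: HIV positive"],
            "billing": ["invoice 120 paid"]}
    return data[table]
```

In this sketch the researcher is authorized for `labs`, yet the misfiled row still blocks the release, which is the case that access control alone would miss.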
Gio CS Forum Oct0120
Filling the Gap
Check the content
of the result before
it leaves the firewall
result
Security mediator :
Human & software
agent module
query
firewall
Gio CS Forum Oct0121
Security Mediator
• Dedicated hardware plus software module,
intermediate between "customers" and
databases within firewall
• A modern tool for the security officer
accessed via firewall protection by customers
(or collaborators) with assigned roles
• Managed by the security officer,
via simple security-specific rules that
match filters to roles
• Performs symmetric screening
(queries and results)
Gio CS Forum Oct0122
Result Checking
is understood and performed today in many
non-computerized settings:
• Briefcases are inspected when leaving
secure facilities
• Computers cannot be taken into, nor out of,
SCIFs
• Vehicles are inspected also on exiting
warehouses with valuable contents
Computer security system requirements have
been modeled poorly wrt such practice
Gio CS Forum Oct0123
Overall Schematic
Firewall
External
Customer
Security
Officer's
Mediator
System
Database
Internal
Customer
Network
Gio CS Forum Oct0124
Hardware
• Computer workstation
– UNIX and NT implementation
– external access through firewall
• firewall can provide authentication
– internal access to database(s) that contain
releasable information
• multi (two)-level security provision
– internal storage, inside firewall:
• rules defining cliques - external roles
• log of accepted and denied requests
• mediator software
Gio CS Forum Oct0125
Software Components
C++ and Java implementations
service
maintenance
support
• Rule interpreter
• Primitives to support rule execution
• Rule maintenance tools
• Log analysis tool
• Firewall interface
• Domain database interface
• Logger
Gio CS Forum Oct0126
Rule Processing
Features:
• Paranoia: Every applicable rule must be enforced
for a query to be successful or a result to be
releasable, else process by the security officer (SO)
• Default: If no rule applies, then process by SO
• SO can pass, reject, or edit queries and results
• SO may inform customer, mediator software will not
• All queries and results, successful or not, are
logged for audit
• Rules are stored within the mediator, with exclusive
security access by the SO
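The "paranoia" and "default" features above can be sketched as follows. This is a minimal illustration, assuming a rule is a pair of predicates (does it apply, does the item pass); the class names, the verdict strings, and the example rules are hypothetical and not the TIHI rule language.

```python
from dataclasses import dataclass, field
from typing import Callable, List

PASS, REFER_TO_SO = "pass", "refer_to_SO"


@dataclass
class Rule:
    name: str
    applies: Callable[[dict], bool]   # does the rule cover this item?
    check: Callable[[dict], bool]     # is the item acceptable under the rule?


@dataclass
class Mediator:
    rules: List[Rule]
    log: list = field(default_factory=list)

    def decide(self, item: dict) -> str:
        applicable = [r for r in self.rules if r.applies(item)]
        if not applicable:
            verdict = REFER_TO_SO                 # default: no rule applies
        elif all(r.check(item) for r in applicable):
            verdict = PASS                        # every applicable rule holds
        else:
            verdict = REFER_TO_SO                 # paranoia: one failure suffices
        self.log.append((item, verdict))          # successful or not, log for audit
        return verdict


# Hypothetical example rules for results returned to a research clique.
EXAMPLE_RULES = [
    Rule("min_rows", lambda i: i["kind"] == "result", lambda i: i["rows"] >= 10),
    Rule("clique", lambda i: i["kind"] == "result", lambda i: i["clique"] == "research"),
]
```

The sketch never rejects on its own: anything not cleared by the rules is handed to the security officer, who may pass, reject, or edit it.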
Gio CS Forum Oct0127
The Rule Language
Goals:
• Simple and easy to formulate by the SO
• Easy to enter into and observe in the system
• Employs a collection of primitive functions to
provide comprehensive and adequate security
• Functions can exploit views in RDBMS
• Some rule functions provide text validation
• Some functions may need domain knowledge
– Functions to process manufacturing designs
– Functions to extract text from images
Gio CS Forum Oct0128
Rule Organization
• Rules are categorized as:
– SET-UP (Maintenance)
– PRE-QUERY
– POST-PROCESSING
• External, authenticated users are grouped
into Cliques to simplify rule management
• Tables and their columns are grouped into
segments to simplify access management
• Rules use primitives supplied by specialists
Gio CS Forum Oct0129
Primitives Selected by rule for various clique roles
• Allow / disallow values
• Allow / disallow value ranges
• Limit results to approved good-word lists
• Disallow output containing bad words
• Limit output to specified times, places
• Limit number of queries per period
• Can augment queries for result filtering
• Etc.
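Two of the primitives listed above, limiting the number of queries per period and limiting access to specified times, might look like this in Python. The class and function names, the sliding-window policy, and the use of whole hours are assumptions made for the sketch.

```python
from collections import defaultdict, deque


class QueryRateLimiter:
    """Allow at most `max_queries` per clique within a sliding `period` (seconds)."""

    def __init__(self, max_queries, period):
        self.max_queries = max_queries
        self.period = period
        self._history = defaultdict(deque)   # clique -> timestamps of accepted queries

    def allow(self, clique, now):
        times = self._history[clique]
        while times and now - times[0] >= self.period:
            times.popleft()                   # forget queries outside the window
        if len(times) >= self.max_queries:
            return False
        times.append(now)
        return True


def within_hours(hour, start, end):
    """True if `hour` lies in [start, end); handles windows that wrap past midnight."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end
```

Timestamps are passed in explicitly rather than read from the clock, so the limiter can be exercised deterministically.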
Gio CS Forum Oct0130
Content primitives tested in TIHI*
*NSF/NIH funded HPCC projects
• Check against good-word dictionary
– dictionary created by processing ok records
• Check against a bad word dictionary
– less paranoid, less secure, used by Net-nanny etc.
• Check for seeded entries in high value files
– password files,
• Check for patterns in personal data
– credit cards, email addresses
• Check cell count in statistical results
– at query time append COUNT request
• Extraction of text from images
– for further filtering
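As a rough illustration of the pattern and cell-count checks listed above, the following Python sketch flags email addresses and card-like digit runs, and tests a statistical result against a minimum cell size. The regular expressions, the use of the Luhn checksum to reduce false positives, and the function names are assumptions of this sketch, not the primitives tested in TIHI.

```python
import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def luhn_ok(digits):
    """Luhn checksum, used to cut false positives on long digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_personal_pattern(text):
    """True if the text contains an email address or a plausible card number."""
    if EMAIL_RE.search(text):
        return True
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            return True
    return False


def cells_large_enough(counts, minimum):
    """Release a statistical result only if every cell has at least `minimum` rows."""
    return all(c >= minimum for c in counts)
```

The counts passed to `cells_large_enough` would come from the COUNT request appended at query time.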
Gio CS Forum Oct0131
Creating Wordlists
TIHI is Paranoid
• Result filtering primarily based on Good-word lists
– Created by processing examples of O.K. responses
– Augmented dynamically by terms found objectionable
by system, but approved by security officer
• Current work
– Image filtering, to omit and extract text from images
• Possible future work
– use noun phrases to increase specificity
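A good-word list of the kind described above could be built and applied roughly as follows. This is a sketch under assumptions: the tokenizer (lowercase alphabetic words only) and all function names are hypothetical.

```python
import re

WORD_RE = re.compile(r"[a-z]+")


def tokenize(text):
    return WORD_RE.findall(text.lower())


def build_good_words(ok_records):
    """Collect every word seen in records already judged releasable."""
    good = set()
    for record in ok_records:
        good.update(tokenize(record))
    return good


def unknown_words(text, good_words):
    """Words in `text` that are not on the list; any hit blocks automatic release."""
    return sorted(set(tokenize(text)) - good_words)


def approve(words, good_words):
    """Security officer approval: add the words so the decision persists."""
    good_words.update(words)
```

For example, with a list built from "Left iris shows mild inflammation" and "Retina normal", the text "Iris Smith retina normal" is held because of the single word "smith", while "iris" slips through as a known word.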
Gio CS Forum Oct0132
Filtering of text
Not perfect:
• Words out-of-context can pass the filter
• ophthalmology: don’t pass names: Iris Smith
– Risk reduces rapidly with multiple words
• Can never have all good-words in list
– Load for security officer -- seek a balance
• Cost: all of contents must be processed
– Good technology from spell checkers
– Domain-specific word-lists are modest in size
Gio CS Forum Oct0133
Rules implement policy
• Tight security policy:
– simple rules
– many requests/responses referred to security officer
– much information output denied by security officer
– low risk
– poor public and community physician relations
• Liberal but careful security policy
– complex rules
– few requests/responses referred to security officer
– of remainder, much information output denied by security officer
– low risk
– good public and community physician relations
• Sloppy security policy
– simple rules
– few requests/responses referred to security officer
– little information output denied by security officer
– high risk
– unpredictable public and community physician relations
Gio CS Forum Oct0134
Security requires attention
• Security officer’s focus is security
:-(
– not for a computer system designer,
– nor database or network administrator,
– nor for management.
• Having and owning the tool enables the role
• Security mediator provides logging for
– focused audit trail
– system improvements
– accountability
• Must be able to deal effectively with exceptions,
else encourages bypassing security without logging.
Gio CS Forum Oct0135
Responsibility Assignment
:-)
• Database administrator
– Primary task: assure availability of data
– Provides helpful services – broaden search: risk
:-|
• Network administrator
– Primary task: keep network running: transparent
:-|
• System administrator
– Buys glossy product to escape responsibility
:-(
• Security officer
– Not in loop, no tools
– Investigates violations, takes blame for failures
Needs tools as well
Gio CS Forum Oct0136
Coverage of Access Paths
[Diagram labels: Security officer :-(; Authentication-based good/bad control (prior use, good guy); Security Mediator (security needs; ancillary information; history; validated to be ok; result is likely ok); Database administrator (DB schema-based control; good query, ok; processable query; performance, function requests); Database]
Gio CS Forum Oct0137
Rule system
• Optional: without rules every interaction
goes to the security officer (in & out)
• Creates efficiency: routine requests will
be covered by rules: 80% of instances / 20% of types
• Gives control to Security officer: rules
can be incrementally added/deleted/analyzed
• Primitives simplify rule specification:
source, transmit date/time, prior request, ...
Gio CS Forum Oct0138
Benign and ID areas in an X-ray
Integrated IDs are
crucial for practice
(40% of X-rays are lost)
Paranoid:
{
Benign is defined positively
a. value range
b. good-word list
else it is potentially bad
}
Gio CS Forum Oct0139
Application of Rules
[Flow diagram labels: Data Requestor (external), Firewall, authenticated ID, Query, Parse Query (error), Query Checking (rule: success / failure; else to SO), SO (edits, customer advice, ancillary information), Execute Query, results, Result checking (else to SO; edits), cleared results, Results]
Gio CS Forum Oct0140
:-(
Security Officer
• Profile
– Human responsible for database security/privacy policies
– Must balance data availability vs. data security/privacy
• Tasks (current)
– Advises staff on how to try to follow policy
– Investigates violations to find & correct staff failures
– Has currently no computer-aided tools
• Tasks (with mediators)
– Defines and enters policy rules in security mediator
– Monitors exceptions, especially violations
– Monitors operation, to obtain feedback for improvements
Gio CS Forum Oct0141
Roles
:-(
Security officer manages security policy,
not a computer specialist or database administrator.
-)
oo
Computer specialist provides tools
agent workstation program for security mediation
Enterprise / institution defines policies
its security officer (SO) uses the program as the tool
Tool formalizes system practices
rules, managed by the SO define the practice
Gio CS Forum Oct0142
Assigning the Responsibility
:-)
• Database Administrator
– Can create views limiting access in RDBMSs
– Prime role is to assure convenient data access
:-|
• Network Administrator
– Can restrict incoming and outgoing IP addresses
– Prime role is to keep network up and
connected to the Internet
:-(
• Specialist Security Officer
– Prime responsibility is security & privacy protection
– Implements security policy
– Interacts with database & network administrators
Gio CS Forum Oct0143
Hypothetical benefits: Prevents
1. Secure data are inadvertently shipped to
insecure backup by trusted user
2. HIV symptoms are shown to cardiac researcher
3. US managers obtain EU-restricted
personnel data
4. Misclassified data are released at low level
5. Credit card numbers are released when a
false customer appears to get an MP3 song
6. Passwords are transmitted to hacker when
access control has failed
Multiple Internal sources are covered
[Diagram labels: External Requestors, original request, Firewall, Security Mediator, S.O., Logs, certified query, certified result, Integrating Mediator, Internal Requestors, unfiltered result, Protected, Shared Databases]
Gio CS Forum Oct0145
Implementations
• UNIX prototype
• UNIX - Java at Incyte Corporation [SST]
– protect medical & genomic information
• NT - Java development system
• Primitives for Drawings, as Aircraft Specs
• Trusted Image Dissemination
• wavelet-based decomposition to locate texts,
• extract for OCR
• blank text frequency if not found in good rules
Gio CS Forum Oct0146
Effective Settings
• External access is a modest fraction of total use
collaboration, government oversight, safety monitoring
• Restructuring internal partitioning would induce
significant inefficiencies
for example: Hospital: MD/patients vs. research/insurance
• Errors are seriously embarrassing
in practice 2-5% of data are misfiled, doing better is costly
• Locus of control is needed
Security officer cannot trust/control DB / network admins
Gio CS Forum Oct0147
Intrusion detection – two-level
Model of
normal behavior
Observations,
initial, continuing
Compare
Events
Monitor
Assess
Stop
Stream of information →
Gio CS Forum Oct0148
TIHI Summary
Avoids the -- often false -- assumption that
access rights match data organization
Collaboration is an underemphasized issue
beyond encrypted transmits, firewalls, passwords, authentication
There is a need for flexible, selective access to data
without the risk of exposing related information in an enterprise
In TIHI service is provided by the Security Mediator:
a rule-based gateway processor of queries and results
under control of a security officer who implements enterprise policies
Our solution has been applied to Healthcare,
also relevant to Collaborating (virtual) enterprises
and in many Military situations.
Gio CS Forum Oct0149
Security Mediator Benefits
• Dedicated to security task (may be multi-level secure)
• Uses only its rules and relevant function, all directly,
avoids interaction with DB views and procedures
• Maintained by responsible authority: the security officer
• Policy setting independent of database(s) and DBA(s)
• Logs just those transactions that penetrate the firewall,
records attempted violations independent of DB logs*
• Systems behind firewall need not be multi-level secure
• Databases behind firewall need not be perfect
* also used for replication, recovery, warehousing
Gio CS Forum Oct0150
Backup
Gio CS Forum Oct0151
Security officer screen
Gio CS Forum Oct0152
Patient's own data screen
Gio CS Forum Oct0153
part of Patient result
Gio CS Forum Oct0154
Disallowed result
Gio CS Forum Oct0155
Security officer reaction
Choices:
1. Reject result
2. Edit result
3. Pass result
(& Update the
list of good-words,
making approval
persistent )
Gio CS Forum Oct0156
Security Table Definition...
(continued)
Security Function            Object Name    Object Value
Validate_text                table.column   invalid_words
Min_Rows_Retrieved           ALL/clique     integer
Num_Queries_Segment          ALL/segment    integer
Query_Intersection_Clique    ALL/clique     integer
Query_Intersection_Segment   ALL/segment    integer
Secure_Keyword_Clique        ALL/clique     keyword
Secure_Keyword_Segment       ALL/segment    keyword
Session_Time                 ALL/clique     TIME
User_Hours_Start             ALL/clique     start_time
User_Hours_End               ALL/clique     end_time
Segment_Hours_Start          ALL/segment    start_time
Segment_Hours_End            ALL/segment    end_time
Limit_Function_Clique        ALL/clique     function_name
Gio CS Forum Oct0157
Rule application - Overview
• Does customer belong to a clique? If yes, switch to it
• Does the customer clique satisfy all pre-query rules?
(e.g., Session_Start, Stat_Only, Queries_Per_session)
• Do the columns and tables belong to a segment?
• Does the query satisfy all pre-query rules?
(e.g., valid segments)
• Does query need re-phrasing or augmentation?
(e.g., Stat_Only to detailed Select)
• Send Query to appropriate Database (or mediator)
• Does query result satisfy all post-query rules?
(e.g. Min_Rows_Retrieved, Secure_Keyword_Clique)
• Apply any result transformation rules
(e.g. random falsification of data, aggregation)
• Update log and internal statistics
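The steps above can be read as a pipeline in which any failed check stops the automatic path. The Python sketch below is one hypothetical rendering: the data shapes (a query as a dict with `text` and `columns`), the rule signatures, and the choice to return `None` for anything referred to the security officer are all assumptions, and query augmentation and result transformation are left out for brevity.

```python
def run_session(customer, query, cliques, segments, pre_rules, post_rules, execute, log):
    """Walk one query through the listed steps; return rows, or None if referred."""

    def refer(reason):
        log.append((customer, query["text"], reason))   # every outcome is logged
        return None

    clique = cliques.get(customer)                       # clique membership
    if clique is None:
        return refer("no clique")
    if any(col not in segments for col in query["columns"]):
        return refer("column outside any segment")
    if not all(rule(clique, query) for rule in pre_rules):
        return refer("pre-query rule failed")
    rows = execute(query)                                # send to database
    if not all(rule(clique, rows) for rule in post_rules):
        return refer("post-query rule failed")
    log.append((customer, query["text"], "released"))
    return rows


# Hypothetical set-up for illustration only.
cliques = {"alice": "research"}
segments = {"age": "demographics", "diagnosis": "clinical"}
pre_rules = [lambda clique, q: clique == "research"]
post_rules = [lambda clique, rows: len(rows) >= 3]      # cf. Min_Rows_Retrieved
```

Passing `execute` in as a function keeps the sketch independent of any particular database.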
Gio CS Forum Oct0158
Implementation
Set-up
• Security Officer enters rules into a file
• Rule file is parsed to generate an SQL script to insert rows
into the security_rules table
• SQL script is executed against the database
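The set-up path (rule file, generated SQL script, execution against the database) can be sketched with Python's standard `sqlite3` module. The rule-file syntax shown here (three whitespace-separated fields with `#` comments) and the column names of `security_rules` are assumptions; the slides do not give the actual file format or schema.

```python
import sqlite3

# Hypothetical rule file: function, object name, value.
RULE_FILE = """\
# function            object       value
Min_Rows_Retrieved    research     10
Validate_text         notes.text   invalid_words
Session_Time          ALL          30
"""


def parse_rules(text):
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop comments and blank lines
        if not line:
            continue
        function, obj, value = line.split()
        rules.append((function, obj, value))
    return rules


def to_sql_script(rules):
    def quote(s):
        return "'" + s.replace("'", "''") + "'"   # escape single quotes for SQL
    lines = [
        "INSERT INTO security_rules VALUES ({}, {}, {});".format(*map(quote, r))
        for r in rules
    ]
    return "\n".join(lines)


def load_rules(conn, text):
    """Create the table if needed, run the generated script, return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS security_rules "
        "(security_function TEXT, object_name TEXT, object_value TEXT)"
    )
    conn.executescript(to_sql_script(parse_rules(text)))
    return conn.execute("SELECT COUNT(*) FROM security_rules").fetchone()[0]
```

Generating a script mirrors the slide; in new code, parameterized inserts via `executemany` would avoid the manual quoting.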
Gio CS Forum Oct0159
Implementation... (continued)
Customer Session Loop
• Security Mediator Workstation accepts the customer query, logs it,
and passes control to the Security Mediator Software (SMS)
• SMS reads the security_rules table and calls many different
modules (sub-routines) to validate the query (pre-query checks)
• If okay, SMS executes the query (Embedded SQL calls)
• Mediator Workstation gets results from the database and calls
other SMS modules to perform the post-query checks
• If all checks are passed, the Mediator Workstation logs and
returns results; awaits another invocation
• Result is accepted by customer and used or displayed
Gio CS Forum Oct0160
System Operations
• Customer connects remotely, via firewall for
authentication, to security officer's machine
• Clique membership is assessed
• System prompts customer for query
• Query is parsed and validated against rules
• Validated query is sent to database system
• Results are retrieved and validated against rules
• Validated results are made available to customer
Gio CS Forum Oct0161
Benign and ID areas in an X-ray
Integrated IDs are
crucial for practice
(40% of X-rays are lost)
Paranoid:
{
Benign is defined positively
a. value range
b. good-word list
else it is potentially bad
}
Gio CS Forum Oct0162
Processing Flow
Gio CS Forum Oct0163
Source X-ray image
Whitened to
protect privacy
for this
presentation
Gio CS Forum Oct0164
Wavelet decomposition
Gio CS Forum Oct0165
Candidate Text areas
Gio CS Forum Oct0166
Extracted textual fields
Blackened to
protect privacy
for this
presentation
Gio CS Forum Oct0167
OCR conversion & analysis
Name
Not in good-list
Not approved
Error in OCR
Not in good-list
Not approved
Gio CS Forum Oct0168
Reconstituted image
Identification
area blurred
by removing
high frequency
components
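To illustrate what "removing high frequency components" means, here is a deliberately small Python sketch on a single row of pixel values. It is not the method used for the images in these slides: a one-level Haar transform on one dimension is assumed purely for simplicity, and the function names are hypothetical.

```python
def haar_forward(signal):
    """One-level Haar transform of an even-length sequence: (averages, details)."""
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    det = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avg, det


def haar_inverse(avg, det):
    """Rebuild the sequence from averages and details."""
    out = []
    for a, d in zip(avg, det):
        out.extend([a + d, a - d])
    return out


def blur_region(row, start, stop):
    """Zero the detail (high-frequency) coefficients for pixels in [start, stop)."""
    avg, det = haar_forward(row)
    for k in range(start // 2, (stop + 1) // 2):
        det[k] = 0.0
    return haar_inverse(avg, det)
```

With `row = [10, 200, 10, 200, 50, 50, 80, 90]`, blurring pixels 0 to 3 flattens the sharp alternation (as text strokes would be) to the local average 105, while the rest of the row is reconstructed unchanged.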
Gio CS Forum Oct0169
Removal of Ident’s from an MRI Image
Gio CS Forum Oct0170
Chest X-ray
Gio CS Forum Oct0171