Transcript Chapter 06

Databases
Collections of data.
 Set of rules to organize data.
 Types
◦ Relational: use (rows) & columns to organize.
◦ Object oriented: complex data (audio, video)
◦ Hierarchical Model
 Tree structure.
◦ Network
 Hierarchical but elements have multiple
parent and child records.

Database Characteristics
Centralizes data.
 Allows for easier backups.
 Provides transaction persistence.
 Allows for more consistency
◦ All data at one location
 Provides recovery and fault tolerance.
 Allows sharing of data with many users.
 Provides security controls.
◦ Integrity checking.
◦ Access control
◦ Confidentiality

Database Jargon







Record: collection of related data items.
File: collection of records of the same type.
Database: cross-referenced collection of data.
DBMS: Manages and controls the database.
Tuple: a row in a 2 dimensional database.
Attribute: a column in a 2 dimensional database.
View: virtual relation defined by DBA in order to
keep subjects from viewing certain data.
Database Jargon continued






Primary Key: Columns that make each row unique
Foreign key: attribute of one table that is related
to the primary key of another table.
Cell: an intersection of a row and column.
Schema: defines the structure of the database.
Data Dictionary: central repository of data
elements and their relationships.
Normalization: minimize redundancy.
Database Table
ACID Rules
Atomicity: all modification take effect or
none take effect.
 Consistency: follow integrity policy for a
database
 Isolation: transactions complete in isolation
until completed without interacting with
other transactions.
 Durability: once a transaction is verified it is
committed and cannot be rolled back.

Schema
Database Replication




Integrity is primary concern.
◦ Update one table, update them all
Locking
◦ Prevent 2 processes from updating a record.
Replication
◦ Mirror a live database allowing simultaneous reads and
writes.
Shadowing
◦ Mirrors all changes made to the primary database but
no one accesses it.
Sample Network
Database APIs

Open Database Connectivity (ODBC)
◦ API to connect to databases.

Object Linking and Embedding Database
(OLE DB)
◦ Replacement for ODBC, extending it.

ActiveX Data Objects (ADO).
◦ Set of COM objects for accessing data sources.

Java Database Connectivity (JDBC)
◦ Functionality like ODBC for Java.
Select NAME, AIRPORT from Project:NAME-ZIP, Project:ZIPAIRPORT where NAME-ZIP:ZIP == ZIP-AIRPORT:ZIP
Figure 6-2 Results of Select-Project-Join Query.
Database Integrity

Concurrency: two people drop same file
on a data server with different
information.
◦ Databases lock tables to prevent this.
Semantic Integrity: ensure structural and
semantic rules are enforced.
 Referential Integrity: all foreign keys
reference a primary key.

Advantages of Databases
Shared Access to data centralized.
 Minimal redundancy: normalization.
 Data consistency: change of data affects all
users.
 Data integrity: protect data.
 Controlled Access: authorized users only.

Database Integrity
Entity Integrity: tuples are uniquely
identified by primary key values.
 Rollback: ends a current transaction and
cancels the changes to the database.
 Commit: complete a transaction and
executes all changes made by a user.

Database Security Issues
Aggregation: act of combining information
from separate less sensitive sources,
forming new information.
 Inference: a subject deduces the full story
from pieces learned from aggregation.

Figure 6-3 Security versus Precision.
Security (confidentiality): reject any query requesting a sensitive field.
Precision: protect all sensitive data but reveal as much non-sensitive data as possible.
Want perfect confidentiality with maximum precision.
Database Defense






Content-dependent access control
◦ Based on sensitivity of data.
Context-dependent access control
◦ Software knows what to do based upon what should be
allowed based upon state and sequence of request.
Cell Suppression: hide specific cells to prevent inference
attacks.
Partitioning: divide database into parts.
Noise and perturbation: insert bogus information to
misdirect and confuse.
Database View: limit what each user can see.
Inference Attacks

Direct Attack
◦ Attempt to find values by making a query that
will return very few results.
◦ List Name where SEX=M and DRUGS=1

Indirect Attack
◦ Infer a final result based upon statistical
results.
◦ Sum, Count, Mean, Median, Min, Max.
Inference (if allowed)
Inference Attacks

Tracker Attack
◦ Fool database manager into locating the
desired data by using additional queries that
produce small results.
 Uses logic & algebra to learn information.
 Count (sex=F) – ((sex=F)&(race!=Caucasian) or
(dorm!=holmes)) = 1
Inference Defense
Apply controls to queries or
 Apply controls to individual items.
 Suppression: sensitive values not
provided.

◦ Query is rejected without a response.

Concealing
◦ provide an answer close to the value but not
the actual value.
Inference Rules

1. Suppress obviously sensitive information.
◦ Error on the side of suppression.

2. Track what the user knows.
◦ Expensive
◦ Must maintain information on all user queries.
◦ What if two people working together?

3. Disguise the data.
◦ Random perturbation and rounding of data.
Aggregation

Related to inference.
◦ Build sensitive results from less sensitive
information.

Must track results which user has seen.
◦ Conceal anything that might lead a user to
derive a more sensitive result.

Difficult to counter.
◦ Collusion: 2 people could work together.
Multilevel Databases





Human Resource Databases contain sensitive information
◦ Salary.
◦ Last performance review.
Security of a single element is different from other
elements.
Two levels, sensitive & non-sensitive may not be enough
◦ There are ranges of allowable knowledge.
Security of combining elements may differ than that of an
individual element.
These 3 issues similar to military sensitivity levels.
Differentiated Security
limit who can see what
Partition Database
separate sensitive information
Database Encryption
Need to know key to decipher
information.
 Each sensitivity level has an appropriate
key strength.
 Users with access might be able to
decipher key and control database if
single key used.
 Defense: encrypt each record with a
different key.

Figure 6-5 Cryptographic Separation: Different Encryption Keys.
Use different keys for each record to prevent leaking information.
Figure 6-6 Cryptographic Separation: Block Chaining.
Encryption dependant upon previous blocks.
Figure 6-7 Integrity Lock.
U.S.A.F. stored in plaintext
Figure 6-8 Cryptographic Checksum.
Used for error detection.
Figure 6-9 Sensitivity Lock.
Unique identifier and the sensitivity level.
Each lock belongs to one record.
Figure 6-10 Trusted Database Manager.
Integrity lock database.
Only the access procedure needs to be trusted and
would grant access to sensitive data.
Inefficient.
Figure 6-11 Trusted Front End.
Front end authenticates and authorizes access to data
Issues Queries to DBMS.
Verifies sensitivity levels and who wants access.
Figure 6-12 Commutative Filters.
Screens and reformats user input and
verifies user has permissions to access information.
Database Security
Views: create different views for different
users to limit access to part of the DB.
 Windows: a subset of a database.

◦ Contains information user is allowed to see.

Polyinstantiation
◦ Create custom records made from full
records which depend upon a user’s access
level.
Figure 6-13 Secure Database Decomposition.
Reference monitor employs Bell-LaPuda access controls.
Second level provides indexing.
Third layer translates views into the base relations.
These 3 layers make up the TCB Perimeter.
Remaining layers typical DBMS functions and user interface.
Data Warehousing
Combine data from multiple databases.
 Can be petabytes of data.
 Used for data analysis and inference.
 Data Mining

◦ Process of analyzing data in search of common
patterns.
◦ Search for signs of fraud
 credit card companies