Database tracker


Query Size Restriction: The Database Tracker Problem
EECS710: Information Security and Assurance
Professor H. Saiedian
From: Denning et al., “The Tracker: A Threat to Statistical Database Security,” ACM TODS, 1978

A statistical database
• Construction of a characteristic formula C
  – A logical formula, operators: AND, OR, NOT (~)
• Common queries
  – count (C)
  – sum (C; j)
• Examples
  – count (M AND CS) = 3, short for count (Sex=‘M’ AND Dept=‘CS’)
  – sum (M OR ~CS; Salary) = $176K
  – sum (Salary <= 15K; Contributions) = $180
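
A minimal Python sketch of the two query types. The records, the field names, and the helpers `count` and `total` are hypothetical and chosen only for illustration; a characteristic formula is modeled as a predicate over one record.

```python
# Hypothetical records, invented for illustration (not taken from the slides).
RECORDS = [
    {"name": "A", "sex": "M", "dept": "CS", "salary": 20},
    {"name": "B", "sex": "F", "dept": "CS", "salary": 15},
    {"name": "C", "sex": "M", "dept": "EE", "salary": 12},
    {"name": "D", "sex": "F", "dept": "EE", "salary": 24},
]

def count(formula):
    """count(C): number of records that satisfy the characteristic formula C."""
    return sum(1 for r in RECORDS if formula(r))

def total(formula, field):
    """sum(C; j): sum of field j over the records that satisfy C."""
    return sum(r[field] for r in RECORDS if formula(r))

# Characteristic formulas are built from AND, OR, and NOT.
m_and_cs = lambda r: r["sex"] == "M" and r["dept"] == "CS"
m_or_not_cs = lambda r: r["sex"] == "M" or not r["dept"] == "CS"

print(count(m_and_cs))               # count(M AND CS)
print(total(m_or_not_cs, "salary"))  # sum(M OR ~CS; Salary)
```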

Compromise
• When confidential info is deduced
  – Positive: deduce a value
  – Negative: learn that a value is not in a given field (e.g., Baker did not contribute $200)
  – Secure: no compromise is possible
• Example: a person knows that Dodd is a female CS professor
  – count (F AND CS AND Prof) = 1
  – count (F AND CS AND Prof AND Salary <= 15K) = 1
  – If count = 0, Dodd’s salary is not <= $15K

Setting a lower bound?
• Setting a lower bound on the query-set size helps, but not always
  – We know count (~C) = n – count (C)
• Ask a tautology
  – count (Prof OR ~Prof) = 12
  – count (~(F AND CS AND Prof)) = 11 → 12 – 11 = 1 female prof
  – sum (Prof OR ~Prof; Salary) = $194K
  – sum (~(F AND CS AND Prof); Salary) = $179K
  – Dodd’s salary = $194K – $179K = $15K
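
The complement attack can be replayed in a few lines of Python. This is a sketch under stated assumptions: the 12 rows are invented so that their aggregates match the figures quoted above (12, 11, $194K, $179K), and `query_count` / `query_sum` are hypothetical helpers that enforce only a lower bound K.

```python
# Hypothetical table: (sex, dept, position, salary in $K). The rows are invented
# so that the aggregates match the slide's figures; they are not the real data.
SEX, DEPT, POS, SAL = range(4)
ROWS = [
    ("F", "CS", "Prof", 15),    # plays the role of Dodd
    ("F", "CS", "Stu", 3),
    ("F", "EE", "Prof", 24),
    ("F", "Math", "Admin", 22),
    ("F", "Bio", "Prof", 26),
    ("M", "CS", "Prof", 20),
    ("M", "CS", "Stu", 10),
    ("M", "CS", "Admin", 18),
    ("M", "EE", "Prof", 14),
    ("M", "Math", "Prof", 16),
    ("M", "Bio", "Stu", 12),
    ("M", "EE", "Admin", 14),
]
K = 2  # lower bound only: refuse any query whose query set has fewer than K rows

def query_count(c):
    rows = [r for r in ROWS if c(r)]
    return len(rows) if len(rows) >= K else None  # None means "refused"

def query_sum(c):
    rows = [r for r in ROWS if c(r)]
    return sum(r[SAL] for r in rows) if len(rows) >= K else None

dodd = lambda r: r[SEX] == "F" and r[DEPT] == "CS" and r[POS] == "Prof"
tautology = lambda r: r[POS] == "Prof" or not r[POS] == "Prof"
not_dodd = lambda r: not dodd(r)

direct = query_count(dodd)                                  # refused
size = query_count(tautology) - query_count(not_dodd)       # 12 - 11
salary = query_sum(tautology) - query_sum(not_dodd)         # 194 - 179
print(direct, size, salary)
```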

Need an upper bound also
• Respond to query (C) if k ≤ count (C) ≤ n – k; reject otherwise
• Note: k ≤ n/2 (otherwise all queries will be unanswerable)
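
As a small illustration of the two-sided restriction, the check can be written as a one-line predicate. The function name `answerable` is hypothetical; n = 12 and k = 2 are the values used elsewhere in these slides.

```python
def answerable(size, n, k):
    """Query-size restriction: answer only if k <= size <= n - k."""
    return k <= size <= n - k

n, k = 12, 2
print(answerable(1, n, k))    # query set too small
print(answerable(11, n, k))   # the complement of a 1-row set is now refused too
print(answerable(12, n, k))   # so is a tautology
print(answerable(5, n, k))    # an ordinary mid-sized query is answered

# With k > n/2 the window [k, n - k] is empty, so nothing is answerable.
print(any(answerable(s, n, 7) for s in range(n + 1)))
```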

What value for k?
• If a questioner knows (from external sources) that individual I is uniquely characterized by C, then the questioner will seek whether I has characteristic α
• Assume k = 2
• Because count (C AND α) ≤ count (C) = 1 < k, the questioner cannot use the above example
• Questioner may divide C into two parts to calculate count (C AND α)

The database tracker
• How? Divide C into C = C1 AND C2 such that count (C1 AND ~C2) and count (C1) are answerable
• T = C1 AND ~C2 is called a tracker of I
  – it tracks down additional characteristics of I

Calculating the tracker
• C = C1 AND C2
• T = C1 AND ~C2
• count (C) = count (C1) – count (T)
• count (C AND α) = count (T OR C1 AND α) – count (T)
• If count (C AND α) = 0 → negative compromise
• If count (C AND α) = count (C) → positive compromise (I has α)
• If count (C) = 1 → arbitrary stats about I can be computed from query (C) = query (C1) – query (T)
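
Both count identities follow from the fact that C and T split C1 into two disjoint parts. A short derivation in the slide's notation (the set-algebra steps are standard and are added here only as a reasoning aid):

```latex
\begin{align*}
C_1 &= (C_1 \land C_2) \lor (C_1 \land \lnot C_2) = C \lor T,
      \qquad C \land T = \text{false},\\
\mathrm{count}(C_1) &= \mathrm{count}(C) + \mathrm{count}(T)
      \;\Longrightarrow\; \mathrm{count}(C) = \mathrm{count}(C_1) - \mathrm{count}(T),\\
T \lor (C_1 \land \alpha) &= T \lor \bigl((C \lor T) \land \alpha\bigr)
      = T \lor (C \land \alpha),\\
\mathrm{count}\bigl(T \lor (C_1 \land \alpha)\bigr)
      &= \mathrm{count}(T) + \mathrm{count}(C \land \alpha)
      \;\Longrightarrow\;
      \mathrm{count}(C \land \alpha)
      = \mathrm{count}\bigl(T \lor (C_1 \land \alpha)\bigr) - \mathrm{count}(T).
\end{align*}
```

The last step uses the fact that T and C AND α are disjoint, because T requires ~C2 while C requires C2.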

A tracker example
• Suppose k = 2
• Query (C) is answerable if 2 <= count (C) <= 10
• Questioner believes C = F AND CS AND Prof uniquely identifies Dodd
• Constructs T = C1 AND ~C2 where
  – C1 = “F”
  – C2 = “CS AND Prof”

To verify the tracker
• count (F AND CS AND Prof) = count (F) – count (F AND ~(CS AND Prof)) = 5 – 4 = 1
• To find Dodd’s salary, apply query (C) = query (C1) – query (T):
  sum (F AND CS AND Prof; Salary) = sum (F; Salary) – sum (F AND ~(CS AND Prof); Salary) = $90K – $75K = $15K

Negative compromise also possible
• count (F AND CS AND Prof AND Salary > $15K) = count (F AND ~(CS AND Prof) OR F AND Salary > $15K) – count (F AND ~(CS AND Prof)) = 4 – 4 = 0
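
The whole tracker example can be replayed against a toy database that enforces k ≤ count (C) ≤ n – k. This is a sketch, not the original data: the 12 rows are invented so that the aggregates match the slide figures (5, 4, $90K, $75K), and the names `Refused`, `matching`, `count`, and `total` are hypothetical.

```python
# Hypothetical table: (sex, dept, position, salary in $K). The rows are invented
# so that the aggregates match the slide's figures; they are not the real data.
SEX, DEPT, POS, SAL = range(4)
ROWS = [
    ("F", "CS", "Prof", 15),    # plays the role of Dodd
    ("F", "CS", "Stu", 3),
    ("F", "EE", "Prof", 24),
    ("F", "Math", "Admin", 22),
    ("F", "Bio", "Prof", 26),
    ("M", "CS", "Prof", 20),
    ("M", "CS", "Stu", 10),
    ("M", "CS", "Admin", 18),
    ("M", "EE", "Prof", 14),
    ("M", "Math", "Prof", 16),
    ("M", "Bio", "Stu", 12),
    ("M", "EE", "Admin", 14),
]
N, K = len(ROWS), 2

class Refused(Exception):
    """Raised when a query set violates K <= size <= N - K."""

def matching(c):
    rows = [r for r in ROWS if c(r)]
    if not K <= len(rows) <= N - K:
        raise Refused
    return rows

def count(c):
    return len(matching(c))

def total(c):
    return sum(r[SAL] for r in matching(c))

c1 = lambda r: r[SEX] == "F"                        # C1 = F
c2 = lambda r: r[DEPT] == "CS" and r[POS] == "Prof"  # C2 = CS AND Prof
t = lambda r: c1(r) and not c2(r)                   # tracker T = C1 AND ~C2
alpha = lambda r: r[SAL] > 15                       # characteristic being probed

# The direct query on C = C1 AND C2 is refused (its query set has one row).
try:
    count(lambda r: c1(r) and c2(r))
    direct = "answered"
except Refused:
    direct = "refused"

count_c = count(c1) - count(t)    # 5 - 4: C identifies exactly one person
salary = total(c1) - total(t)     # 90 - 75: positive compromise
# count(C AND alpha) = count(T OR C1 AND alpha) - count(T): negative compromise
count_c_alpha = count(lambda r: t(r) or (c1(r) and alpha(r))) - count(t)
print(direct, count_c, salary, count_c_alpha)
```

Every query the attacker actually submits has a query set of 4 or 5 rows, so each one passes the size check even though the combined result describes a single individual.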