Fair Information Practices in a World of Data Mining

“Inside the MATRIX: Fair
Information Practices in a
World of Data Mining”
Professor Peter Swire
Ohio State University
DePaul Symposium on Privacy and Identity
October 15, 2004
The Challenge
A federal official involved in funding information-
sharing systems recently asked me:
“What can we do to address the concerns of
privacy proponents so that they will stop
complaining about MATRIX and other needed
systems?”
– Today’s talk in that national security context.
– This was a good-faith question from an honorable
person.
– He was sobered by my answer.
Overview
• Pattern analysis and link analysis
• Current MATRIX as link analysis system
• Open questions on effectiveness of
MATRIX and overall lawfulness
• This talk: the hard privacy issues that exist
even if we assume MATRIX is effective and
lawful
Pattern & Link Analysis
• Pattern analysis as “data mining”
• Seek statistical correlations, then act
• DeRosa/CSIS describes pattern analysis
issues
• Policy approaches:
– Original MATRIX system: use data mining
– Dempsey & Rosenzweig as D.C. policy
compromise
– ACLU and others oppose it entirely
Pattern & Link Analysis
• Link analysis: learn more about one
suspect
– More traditional police work
– Warrants, subpoenas, and public records,
depending on type of information
– Current MATRIX system and focus of this talk
MATRIX
• Multi-State Anti-Terrorism Information Exchange
(MATRIX)
– $12 million from DHS & DOJ
– Project security and access in Florida
• First proposed after 9/11
• At the peak, 12 states had agreed to participate
– Currently FL, CT, MI, OH, PA are in program
– States that have left or decided not to join after
actively considering it: AL, CA, CO, GA, LA, KY, OR,
SC, TX, UT, WV
– Privacy and cost cited as reasons not to do it
The Current MATRIX
“Information accessible includes criminal history records,
driver’s license data, vehicle registration records, and
incarceration/corrections records, including digitized
photographs, with significant amounts of public records
data. This capability will save countless investigative
hours and drastically improve the opportunity to
successfully resolve investigations. The ultimate goal is
to expand this capability to all states.”
Official site: www.matrix-at.org
2 Early Objections
• System was created and pushed by admitted
drug smuggler Hank Asher of Seisint
– This is not relevant to how we should view the current
system
– It made it harder to say “Trust Us” on MATRIX
• After 9/11, 120,000 names sent to law
enforcement for “high terrorism factor”
– This is data mining, without individualized suspicion,
with no transparency or known checks against abuse
– Today, “MATRIX is not a data mining application.”
Jan. 2003 Seisint Documents
“High terrorism factor” (HTF) based on factors including:
• Age, gender & ethnicity
• “What they did with their driver’s licenses”
• Pilots or associations to pilots
• Proximity to “dirty addresses/phone numbers”
• Investigational data
• SSN anomalies
• Credit histories
Seisint Documents
• “The associative links, historical residential
information, and other information, such as
an individual’s possible relatives and
associates, are deeper and more
comprehensive than other commercially
available database systems presently on
the market.”
Answering the Federal Official
• Privacy experts (not necessarily
“advocates”) will have a list of questions:
– About current configuration of system and its
compliance with fair information practices
– About system as designed (it had original,
broader functions)
– How system could easily evolve over time
(mission creep)
[Diagram: MATRIX at the center. Inputs: Florida and other states, more states supplying data, “public” records, “private” records (?). Outputs: police & other state subscribers, intel (?), feds (?).]
The Inputs
[Diagram: the inputs to MATRIX are Florida and other states, more states supplying data, “public” records, and “private” records (?).]
Questions on Inputs:
Data quality: 2003 FBI
announcement that NCIC data
would no longer be subject to the
“accuracy” requirements of the
Privacy Act
Are state criminal, prison, and
similar records more accurate?
If records are fixed in one place, is
that correction spread to all the
other databases?
Questions on Inputs:
Sensitive data:
Sources of identity theft: SSNs
are listed in many public
records; bank account records
in bankruptcy “public” records
Known privacy concerns of
American people on medical,
financial, children’s, & other
“sensitive” records
Questions on Inputs:
Private sector data:
Was there notice & consent for
these uses? For medical, credit
history, and other sensitive
data? Are these “secondary”
uses appropriate?
Federal data is under the Privacy Act,
with public oversight. What
similar checks and balances exist for
how private data is gathered and
used?
Questions on Outputs:
[Diagram: outputs go to police & other state subscribers, intel (?), and feds (?).]
For secret/confidential data, assume
good security in data center.
How many people have access to the
outputs of MATRIX?
800,000 uniformed police,
for traffic stops, etc.
Non-uniformed? Firefighters? Others?
Questions on Outputs:
How to secure outputs to 1 million
people?
• Assume few/no secrets for what the
million can see about the system –
Swire paper on security/obscurity
• Training
• Audit trails
• Anti-browsing laws & enforcement
But what can a terrorist or organized
crime group learn by bribing one
out of the million?
Questions on the Data Center/System:
A principle: the more important the
decisions made, the more important it is to
have due process and fair information
practices. E.g., a person can be denied a
mortgage or job, so we have FCRA.
Decisions here might include:
• Arrest the person (my student Greg Smith)
• Deny ability to travel, enter secured spaces
• Deny job, on a background check
• Suspicion on a person’s “associates”?
• Other uses over time?
Questions on the Data Center/System:
Access and correction as key fair
information practices.
Currently no access by individual to data
held in MATRIX. Instead, individual is told to
go to every data source and get access
there.
Problems include:
• Burdensome to go to numerous sources
• Data sources not all publicly listed
• Even if a mistake is corrected once, it often
reappears
Questions on the Data Center/System:
Transparency & Governance
• No privacy policy posted until recently
• No individual identified as CPO
• Perhaps have outside experts or advisory
board?
• Most generally, how to provide public
oversight, accountability, assurance?
The Sobering List of Privacy Issues for
the Federal Official
• Inputs: data quality
• Inputs: sensitive data
• Inputs: private-sector data
• Outputs: secrets when thousands or a million receive
data
• Outputs: anti-browsing and good security at the edges
• Important decisions by government require due process
• Access and correction (when secrecy unlikely to work)
• Transparency and governance, to reduce mistakes and
improve public acceptance
Is It Worth Answering
Those Questions?
• To the Homeland Security official:
– If the privacy homework assignment seems
too burdensome, then temptation is to
minimize or ignore privacy issues
– But the privacy homework is good policy and
good government
– Markle report and the need to do the privacy
homework or else watch public opposition
undermine the potential benefits of a system
– Transparent, good governance as the
touchstone
Conclusion
• The official who questioned me was
surprised and sobered by the number of
significant and difficult privacy issues in
MATRIX
• Should be sobering to all of us how little
the funders of MATRIX had worked
through these issues
• This conference, and ongoing vigilance,
are needed on these issues
Contact Information
• Professor Peter Swire
• Moritz College of Law of the Ohio State
University
• Phone: (240) 994-4142
• Web: www.peterswire.net