JFC's notes for lecture 20
Design of Health Technologies
lecture 20
John Canny
11/21/05
Health Care Privacy
Health care privacy is an obvious concern for most people
as we inch toward computerization and out-sourcing.
HIPAA (the Health Insurance Portability and Accountability
Act) was enacted in 1996.
There are actually several related HIPAA regulations:
HIPAA Privacy rule (1996)
HIPAA Security rule (1998)
HIPAA Title II “Administrative Simplification”, including
transactions and code sets rules (part of 1996 Act).
HIPAA data
HIPAA protects “Protected Health Information” (PHI):
Info created or received by a provider, employer etc.
Information that relates to the past, present or future
physical or mental health or condition of an individual.
Information about the health care of an individual.
Information about payments relating to an individual.
TPO is (Treatment, Payment, healthcare Operations):
normal uses of PHI in the course of providing care.
HIPAA Aug. 2002 changes (NPRM)
Under the Bush administration in 2002, a number of
changes were made which are known as NPRM (Notice
of Proposed RuleMaking).
In general, NPRM reduces the burdens on care providers to
protect privacy and narrows the situations in which
privacy regulations apply.
In particular, written patient consent was required in the
original HIPAA regulations for TPO uses. Post-NPRM,
written consent is not required for TPO use of PHI.
HIPAA allowed uses
Disclosure to the individual about whom the PHI applies
With individual consent or legal agreement related to
carrying out TPO.
Without individual authorization for TPO, with some
exceptions.
PHI can usually be shared with partner organizations in the
course of providing TPO, without separate disclosure to the individual.
PHI can generally be used for other purposes if it is “de-identified”, i.e. information that would allow tracing to
the individual has been removed.
HIPAA exceptions
It is OK to use PHI for these purposes without individual
consent, and without an opportunity to agree or object:
Quality assurance
Emergencies or concerns about public health & safety
Suspected abuse of the individual
Research**
Judicial and administrative proceedings
Law enforcement
Next-of-kin information
Government health data and specialized functions
Workers compensation
Organ donation
HIPAA exceptions
Identification of the deceased, or for cause of death.
Financial institution payment processing
Utilization review
Credentialing
When mandated by other laws
Other activities that are part of ensuring appropriate
treatment and payment
Privacy requests
An individual may request restriction of use of their PHI
in the course of TPO.
However, a provider is not required to accept those
restrictions.
If it does agree, it is legally bound by that agreement.
Compliance Dates
Healthcare providers, April 14, 2003
Health Plans:
– Large: April 14, 2003
– Small: April 14, 2004
Healthcare clearinghouses, April 14, 2003
Privacy Technology
Protecting medical privacy involves the usual set of
access control problems – i.e. most of traditional
cryptography and secure systems design.
It also poses some new challenges that are relatively
well-defined:
– “Private” data mining of PHI: i.e. analysis of PHI records
without actual access to, or risk to, the personally-identifying
information.
– Analysis of identifiability and de-identification: It’s non-trivial
to determine how identifiable an individual is from particular
information about them.
Techniques - Anonymization
Anonymization is an umbrella term for a set of
techniques that separate data records from individual
users.
For PHI, an important concept is “linkability”, the
potential linkage between a person and their
information.
[Figure: the name “Joe Smith” linked to a record with fields Age: 35, Weight: 170, Alcohol use, Diabetes?, Allergies.]
Techniques - Anonymization
One can break the link and replace it with a pseudonym,
which still allows 2nd-order statistical analyses.
[Figure: “Joe Smith” maps to Pseudo: 3143231; a separate record keyed by Pseudo: 3143231 holds Age: 35, Weight: 170, Alcohol use, Diabetes?, Allergies.]
But unfortunately, even a modest amount of personal
medical history may be unique to a patient, i.e. in many
cases you can figure out who owns the pseudonymized record.
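The pseudonym idea and its weakness can be sketched in a few lines of Python. This is only an illustration under assumptions: the records, field names, and the helper names `pseudonymize` and `unique_rows` are made up for the example and are not from the lecture. The sketch replaces names with random pseudonyms, then counts how many rows are still unique on the remaining fields, which is one simple way to gauge linkability.

```python
import secrets
from collections import Counter

# Hypothetical toy records; all names and values are made up.
records = [
    {"name": "Joe Smith", "age": 35, "weight": 170, "diabetes": False},
    {"name": "Ann Lee", "age": 35, "weight": 170, "diabetes": False},
    {"name": "Raj Patel", "age": 62, "weight": 201, "diabetes": True},
]

def pseudonymize(records):
    """Split records into a name->pseudonym table and name-free data rows."""
    link_table = {}
    data_rows = []
    for rec in records:
        pseudo = secrets.token_hex(8)  # random, so it reveals nothing about the name
        link_table[rec["name"]] = pseudo
        row = {k: v for k, v in rec.items() if k != "name"}
        row["pseudo"] = pseudo
        data_rows.append(row)
    return link_table, data_rows

def unique_rows(data_rows, fields):
    """Return rows whose combination of `fields` appears exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in data_rows)
    return [row for row in data_rows if counts[tuple(row[f] for f in fields)] == 1]

link_table, data_rows = pseudonymize(records)
exposed = unique_rows(data_rows, ["age", "weight", "diabetes"])
print(len(exposed), "of", len(data_rows), "rows are unique on (age, weight, diabetes)")
```

A row that is unique on its remaining fields can be re-linked by anyone who already knows those facts about the patient, even though the name itself is gone.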
Techniques – Statistical perturbation
One can also change the actual values in a personal
patient record by adding random “noise” so that the
actual values are hard to determine:
Advantages:
– Easy to do
– Relatively easy to analyze the effects of perturbation
Disadvantages:
– Trades privacy for accuracy, and may do a poor job at both.
– Requires large populations for accurate statistical aggregates.
– Doesn’t work for all types of aggregates.
– If fresh random offsets are drawn each time a patient’s record is
released, the average of the releases converges to the patient’s
actual data. How should patient data be updated?
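A small simulation may make the trade-offs concrete. It is a sketch under assumptions: the population, the Gaussian noise model, the noise level `NOISE_SD`, and the helper name `perturb` are all chosen for illustration and do not come from the lecture. It compares the error of a population mean, of a single released value, and of the average of many fresh releases of the same record.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of true weights (made-up values).
true_weights = [rng.gauss(170, 25) for _ in range(10_000)]
NOISE_SD = 50.0  # assumed noise level, larger than the spread of the data

def perturb(value):
    """Release the value plus zero-mean Gaussian noise."""
    return value + rng.gauss(0, NOISE_SD)

released = [perturb(w) for w in true_weights]

# Aggregate over a large population: the noise largely cancels out.
mean_error = abs(statistics.fmean(released) - statistics.fmean(true_weights))

# One individual, one release: the error is on the order of NOISE_SD.
one_patient = true_weights[0]
single_error = abs(perturb(one_patient) - one_patient)

# The same individual released many times with fresh noise:
# averaging the releases recovers the true value.
repeated = [perturb(one_patient) for _ in range(10_000)]
repeat_error = abs(statistics.fmean(repeated) - one_patient)

print(f"population mean error:          {mean_error:.2f}")
print(f"single release error:           {single_error:.2f}")
print(f"error after averaging releases: {repeat_error:.2f}")
```

With a small population the first error grows, which is the "requires large populations" point; the last figure shows why repeated releases with fresh noise undo the protection.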
Private Computation
[Figure: boundary-aware private computation. Within all user data, the information that the service provider needs is separated from the information the user wants to keep private.]
Private Computation
User data is obfuscated before going to server.
[Figure: users U1, U2, U3, U4 each send obfuscated data to server S1.]
Server can compute the final result only.
Private Arithmetic
There are two approaches:
– Homomorphism: User data is encrypted with a public
key cryptosystem. Arithmetic on this data mirrors
arithmetic on the original data, but the server cannot
decrypt partial results.
– Secret-sharing: User sends shares of their data to
several servers, so that no small group of servers
gains any information about it.
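The secret-sharing approach can be illustrated with additive shares. This is a minimal sketch, assuming shares are integers modulo 2^32 and that every server holds one share per user; the names `make_shares` and `reconstruct` and the sample values are hypothetical. Each share on its own is a uniformly random number, so any group of servers smaller than the full set learns nothing about a user's value.

```python
import secrets

MODULUS = 2**32  # shares live in the integers mod 2^32 (an assumed choice)

def make_shares(value, n_servers):
    """Split value into n_servers random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def reconstruct(shares):
    """Recombine shares (or per-server totals) into the underlying value."""
    return sum(shares) % MODULUS

# Three hypothetical users each split a private value across two servers.
user_values = [35, 62, 41]
n_servers = 2
per_server = [[] for _ in range(n_servers)]
for v in user_values:
    for server_inbox, share in zip(per_server, make_shares(v, n_servers)):
        server_inbox.append(share)

# Each server adds up the shares it holds; it never sees a raw value.
server_totals = [sum(inbox) % MODULUS for inbox in per_server]

# Only combining the servers' totals reveals the sum.
total = reconstruct(server_totals)
print(total)  # 138
```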
Challenges
Addition is easy with either method; multiplication is
possible but very tricky in practice.
Homomorphism is expensive (10,000x more than normal
arithmetic).
Secret-sharing is essentially free, but requires several
servers.
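To show what "addition is easy" means for the homomorphic approach, here is a toy additively homomorphic scheme. The lecture does not name a cryptosystem; this sketch assumes the Paillier scheme purely as a well-known example, with tiny primes that offer no security. The function names are illustrative. Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, and the big-integer exponentiations hint at why this costs far more than ordinary arithmetic.

```python
import math
import secrets

# Toy key: tiny primes for readability only; real keys use far larger primes.
p, q = 1009, 1013
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
mu = pow(lam, -1, n)  # modular inverse of lam; this shortcut relies on g = n + 1

def encrypt(m):
    """Encrypt 0 <= m < n under the public key (n, g)."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    """Decrypt with the private key (lam, mu)."""
    return ((pow(c, lam, n_sq) - 1) // n * mu) % n

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts mod n."""
    return (c1 * c2) % n_sq

c_sum = add_encrypted(encrypt(35), encrypt(62))
print(decrypt(c_sum))  # 97
```

The party doing the addition needs only the public values; without the private key it cannot read either input or the running total.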
A hybrid approach
We proposed to use secret-sharing for privacy, and
homomorphic computation to validate user data.
Secret-sharing works over normal (32- or 64-bit) integers
and is very fast.
Homomorphism uses large integers (1024 bits), but with
randomization we only need to do O(log n) operations.
The result is a method for vector addition with validation
that runs in a few seconds.
It’s not obvious, but vector addition is the building block
for many (perhaps most) numerical data mining tasks.
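The secret-sharing half of this scheme can be sketched as vector addition across two servers. This is an assumption-laden illustration, not the authors' implementation: the 2^32 modulus, the vector layout, and the function names are hypothetical, and the homomorphic validation step is left out entirely. Column sums are enough to get counts and means, and a user who also submits locally computed products can contribute to second-order statistics in the same way.

```python
import secrets

MODULUS = 2**32  # assumed share modulus, matching "normal" 32-bit integers

def share_vector(vec):
    """Split an integer vector into two share vectors that sum to it mod MODULUS."""
    share_a = [secrets.randbelow(MODULUS) for _ in vec]
    share_b = [(x - a) % MODULUS for x, a in zip(vec, share_a)]
    return share_a, share_b

def add_vectors(u, v):
    """Element-wise addition mod MODULUS."""
    return [(x + y) % MODULUS for x, y in zip(u, v)]

# Hypothetical user vectors: [age, weight, has_diabetes (0/1)].
user_vectors = [[35, 170, 0], [62, 201, 1], [41, 155, 0], [58, 188, 1]]

sum_a = [0, 0, 0]  # running total held by server A
sum_b = [0, 0, 0]  # running total held by server B
for vec in user_vectors:
    a, b = share_vector(vec)
    sum_a = add_vectors(sum_a, a)
    sum_b = add_vectors(sum_b, b)

# Combining the two servers' totals yields the column sums only.
totals = add_vectors(sum_a, sum_b)
n_users = len(user_vectors)
means = [t / n_users for t in totals]
print(totals)  # [196, 714, 2]
print(means)   # [49.0, 178.5, 0.5]
```

Without validation, a malicious user could submit an arbitrary vector and skew the totals, which is the gap the homomorphic check is meant to close.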
Secret-sharing
For secret-sharing, you need at least two servers that
will not collude. Where do these come from?
P4P: Peers for Privacy
In P4P, a group of users elects some “privacy providers”
within the group.
Privacy providers provide privacy when they are
available, but can’t access data themselves.
[Figure: a peer group of users (U), some of them acting as privacy providers (P), connected to a server (S).]
P4P
The server provides data archival and synchronizes the
protocol.
Server only communicates with privacy peers
occasionally (e.g. once per day at 2AM).
Discussion Questions
HIPAA still gets mixed reactions from providers and
patients. Discuss it from both perspectives:
How privacy regulations can impede care-givers,
researchers, HMOs etc.
How the regulations don’t go far enough in some
scenarios.
Discuss the two approaches to privacy protection, i.e.
perturbation and private computation. What are some
trade-offs you can see that were not mentioned in the
papers?