Privacy-Preserving Support Vector Machines via Random Kernels


Privacy-Preserving Support Vector Machines via Random Kernels
The 2008 International Conference on Data Mining
Olvi Mangasarian, UW Madison & UCSD La Jolla
Edward Wild, UW Madison
Horizontally Partitioned Data
[Figure: the data matrix A, with rows 1, …, m (examples) and columns 1, …, n (features), partitioned row-wise into blocks A1, A2, A3]
Problem Statement
• Entities with related data wish to learn a classifier based on all the data
• The entities are unwilling to reveal their data to each other
• If each entity holds a different set of examples with all features, the data is said to be horizontally partitioned
• Our approach: a privacy-preserving support vector machine (PPSVM) using random kernels
– Provides accurate classification
– Does not reveal private information
Outline
• Support vector machines (SVMs)
• Reduced and random kernel SVMs
• Privacy-preserving SVM for horizontally partitioned data
• Summary
• Linear kernel: (K(A, B′))ij = (AB′)ij = AiB′j = K(Ai, B′j)
• Gaussian kernel, parameter μ: (K(A, B′))ij = exp(−μ‖Ai′ − B′j‖²)
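
To make these two definitions concrete, here is a minimal NumPy sketch (the function names and the default value of μ are illustrative, not from the talk):

import numpy as np

def linear_kernel(A, Bt):
    # (K(A, B'))ij = (AB')ij: a plain matrix product.
    return A @ Bt

def gaussian_kernel(A, Bt, mu=0.1):
    # (K(A, B'))ij = exp(-mu * ||Ai' - B'j||^2), where Ai is row i of A
    # and B'j is column j of B'.
    diff = A[:, None, :] - Bt.T[None, :, :]        # pairwise differences
    return np.exp(-mu * (diff ** 2).sum(axis=2))
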
Support Vector Machines (SVMs)
• x ∈ R^n
• The SVM is defined by parameters u and threshold γ of the nonlinear separating surface K(x′, A′)u = γ
• A contains all data points: {+…+} ⊂ A+ and {−…−} ⊂ A−
• e is a vector of ones
• Bounding constraints: K(A+, A′)u ≥ eγ + e and K(A−, A′)u ≤ eγ − e
• Minimize e′y (hinge loss, the plus function max{•, 0}) to fit the data
• Minimize e′s (equal to ‖u‖₁ at the solution) to reduce overfitting
• The slack variable y ≥ 0 allows points to lie on the wrong side of the bounding surfaces K(x′, A′)u = γ + 1 and K(x′, A′)u = γ − 1

[Figure: + and − points separated by the surface K(x′, A′)u = γ, flanked by the bounding surfaces K(x′, A′)u = γ ± 1]
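
Collecting the pieces on the slide above, the training problem can be written as the 1-norm SVM linear program used in this line of work; a sketch, where the tradeoff weight ν and the diagonal ±1 label matrix D are standard notation not spelled out on the slide:

\[
\min_{u,\gamma,y,s}\ \nu e'y + e's
\quad\text{s.t.}\quad
D\big(K(A,A')u - e\gamma\big) + y \ge e,\qquad y \ge 0,\qquad -s \le u \le s,
\]

where D has +1 on the diagonal for rows of A+ and −1 for rows of A−; at the solution e′s = ‖u‖₁, so the two objectives are exactly the fitting and overfitting terms listed above.
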
Random Reduced Support Vector Machine
• Reduced Support Vector Machine (L&M, 2001): replace the kernel matrix K(A, A′) with K(A, Ā′), where Ā′ consists of a randomly selected subset of the rows of A
• Random Reduced Support Vector Machine (M&T, 2006): replace the kernel matrix K(A, A′) with K(A, B′), where B′ is a completely random matrix
• Using the random kernel K(A, B′) is the key result for generating a simple and accurate privacy-preserving SVM
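
A minimal NumPy sketch of the two kernel replacements for a linear kernel (the sizes and the 10% reduction rate are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 30
A = rng.standard_normal((m, n))
k = m // 10                                    # reduced dimension, e.g. 10% of m

# Reduced SVM (L&M, 2001): Abar' is built from a random subset of A's rows.
Abar = A[rng.choice(m, size=k, replace=False)]
K_reduced = A @ Abar.T                         # m x k instead of m x m

# Random kernel SVM (M&T, 2006): B' is a completely random matrix.
B = rng.standard_normal((k, n))
K_random = A @ B.T                             # also m x k
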
Error of Random Kernels is Comparable to Full Kernels: Linear Kernels
• B is a random matrix with the same number of columns as A and either 10% as many rows or one fewer row than columns
[Figure: random kernel AB′ error vs. full kernel AA′ error; each point represents one of 7 datasets from the UCI repository; the diagonal marks equal error for random and full kernels]
Error of Random Kernels is Comparable to Full Kernels: Gaussian Kernels
[Figure: random kernel K(A, B′) error vs. full kernel K(A, A′) error for the same 7 UCI datasets]
Horizontally Partitioned Data
• Each entity holds different examples with the same features
[Figure: the matrix A split row-wise into blocks A1, A2, A3, one block per entity]
Privacy-Preserving SVMs for Horizontally Partitioned Data via Random Kernels
• Each of q entities privately owns a block of data A1, …, Aq that it is unwilling to share with the other q − 1 entities
• The entities all agree on the same random basis matrix B, and each distributes its block K(Aj, B′) to all entities
• K(A, B′) = [K(A1, B′); K(A2, B′); …; K(Aq, B′)], the row-wise stack of the shared blocks
• Aj cannot be recovered uniquely from K(Aj, B′)
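
A minimal sketch of the protocol, assuming a shared random seed is how the entities agree on the same random matrix B (the function names, the seed mechanism, and the Gaussian kernel choice are illustrative):

import numpy as np

def shared_B(seed, k, n):
    # Every entity derives the identical random matrix B from an agreed seed.
    return np.random.default_rng(seed).standard_normal((k, n))

def entity_share(Aj, B, mu=0.1):
    # Entity j publishes only K(Aj, B'); its private block Aj never leaves.
    diff = Aj[:, None, :] - B[None, :, :]
    return np.exp(-mu * (diff ** 2).sum(axis=2))

# q entities, each holding a private block Aj with the same n features:
# B = shared_B(seed=42, k=10, n=30)
# K_public = np.vstack([entity_share(Aj, B) for Aj in private_blocks])
# Any SVM solver can then be trained on (K_public, labels).
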
Privacy Preservation:
Infinite Number of Solutions for Ai Given AiB′
• Given
– the random matrix B ∈ R^{k×n} with k < n
– the product Pi = AiB′
• Feng and Zhang, 2007: every submatrix of a random matrix has full rank
• Consider solving for row r of Ai, 1 ≤ r ≤ mi, from the equation
– BAir′ = Pir′, Air′ ∈ R^n
– Every square submatrix of the random matrix B is nonsingular
– There are at least (n choose k) solutions: fix any n − k components of Air′ at zero and solve the remaining nonsingular k × k system for the rest
• Thus there are at least (n choose k)^mi solutions Ai to the equation BAi′ = Pi′
• If each entity has 20 points in R^30 and B has k = 29 rows, there are (30 choose 29)^20 = 30^20 solutions
• Furthermore, each of the infinite number of matrices in the affine hull of these matrices is a solution
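
This non-uniqueness is easy to verify numerically; a minimal sketch with sizes matching the example above (20 points in R^30, B with 29 rows; variable names are mine):

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
mi, n, k = 20, 30, 29
Ai = rng.standard_normal((mi, n))              # the private block
B = rng.standard_normal((k, n))                # the shared random matrix
Pi = Ai @ B.T                                  # the only quantity revealed

# Perturbing the rows of Ai by any elements of the null space of B
# leaves the revealed product Pi unchanged.
N = null_space(B)                              # n x (n - k) basis of {x : Bx = 0}
Ai_alt = Ai + rng.standard_normal((mi, n - k)) @ N.T
print(np.allclose(Ai_alt @ B.T, Pi))           # True: same revealed data
print(np.allclose(Ai_alt, Ai))                 # False: different private data
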
Results for PPSVM on Horizontally Partitioned Data
• Compare classifiers that share examples with classifiers that do not
– Seven datasets from the UCI repository
• Simulate a situation in which each entity has only a subset of about 25 examples
Error Rate of Sharing Data is Better than not Sharing: Linear Kernels
[Figure: error rate with sharing vs. error rate without sharing; 7 datasets represented by one point each]
Error Rate of Sharing Data is Better than not Sharing: Gaussian Kernels
[Figure: error rate with sharing vs. error rate without sharing for the same 7 datasets]
Summary
• A privacy-preserving SVM for horizontally partitioned data
– Based on the random kernel K(A, B′)
– Learns a classifier using all the data, but without revealing the privately held data
– Classification accuracy is better than that of an SVM without sharing, and comparable to an SVM where all data is shared
• Related work
– A similar approach for vertically partitioned data, to appear in ACM TKDD
– Liu et al., 2006: properties of multiplicative data perturbation based on random projection
– Yu et al., 2006: secure computation of K(A, A′)
Questions
• Websites with links to papers and talks:
http://www.cs.wisc.edu/~olvi
http://www.cs.wisc.edu/~wildt