Multiple-Instance Learning via
Embedded Instance Selection
Yixin Chen
Department of Computer Science
University of New Orleans
http://www.cs.uno.edu/~yixin
Wayne State University, 1/31/2006
Outline
An overview of multiple-instance learning
Multiple-instance learning via embedded
instance selection
Experimental results and discussions
Conclusions and future work
Supervised Learning
Fixed-length vector of attribute values
(usually called a “feature vector”)
Pairs (Sample_i, Result_i) are generated by an unknown process; the task is to learn the mapping Result = f(Sample)
Classification problem: if Result is discrete or categorical
Regression problem: if Result is continuous
Multiple-Instance Learning Problem
Each instance is a fixed-length feature vector
A sample is a bag of instances: Sample_i = Bag_i = {Instance_1, Instance_2, Instance_3, ..., Instance_n}
An unknown process maps Bag_i to a label Result_i, which may be discrete or continuous
A label is associated with a bag, not the instances in the bag
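A minimal sketch (my own illustration, not from the talk) of how such bag-labeled data might be laid out in Python; the bag sizes and labels below are made up, and the 166-dimensional instances simply anticipate the MUSK data described later:

    import numpy as np

    rng = np.random.default_rng(0)

    # Each bag is a variable-size set of fixed-length instance vectors;
    # the single 0/1 label is attached to the bag, not to its instances.
    bags = [rng.normal(size=(int(rng.integers(2, 6)), 166)) for _ in range(10)]
    labels = rng.integers(0, 2, size=len(bags))

    for i, (bag, y) in enumerate(zip(bags, labels)):
        print(f"bag {i}: {bag.shape[0]} instances of dimension {bag.shape[1]}, label {y}")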
Drug Activity Prediction
Bag: a molecule
Instance: a shape conformation of the molecule
Goal: predict whether a molecule binds to a protein of interest, and find the conformation that binds
Predict whether a candidate drug molecule will bind strongly to a target protein; binding strength is largely determined by the shape of drug molecules
[Figure: different conformations a butane molecule (C4H10) can take on; the molecule can rotate about the bond between the two central carbon atoms. © 1998 by Oded Maron]
Object Recognition
Bag: an image
Instance: a salient region in an image
Goal: predict whether an image contains
an object, and identify the regions
corresponding to the object
Multiple-Instance Learning Models
A bag is positive if and only if it contains
at least one positive instance
Axis-Parallel Rectangles (APR) algorithm [Dietterich et al., AI 1997]
[Figure: instances from positive (*) and negative (o) bags in the x1-x2 plane, with a minimal APR drawn around candidate positive instances]
But there may not exist an APR that contains at least one instance from each positive bag and no instance from any negative bag
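To make the APR requirement concrete, here is a sketch (my own helper, not the actual APR algorithm, which grows and shrinks the rectangle iteratively) of the consistency condition an ideal APR would satisfy under the assumption above:

    import numpy as np

    def apr_consistent(lower, upper, pos_bags, neg_bags):
        # An instance lies inside the axis-parallel rectangle if it is within
        # [lower, upper] in every feature dimension; each bag is an array of
        # shape (n_instances, n_features).
        def covered(bag):
            return np.all((bag >= lower) & (bag <= upper), axis=1).any()
        return (all(covered(b) for b in pos_bags)
                and not any(covered(b) for b in neg_bags))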
Multiple-Instance Learning Models
Diverse Density Algorithm (DD) [Maron and Lozano-Pérez, NIPS 1998]
[Figure: instances from positive (*) and negative (o) bags in the x1-x2 plane, with the point of maximum diverse density marked]
The diverse density at a location is high if the location is close to instances from different positive bags and far away from all instances in negative bags
Searching for an “axis-parallel
ellipse” with high diverse
density
Sensitive to noise
High computational cost
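For reference, the noisy-OR form of diverse density from the cited paper, written here from memory and worth checking against the original (the per-feature scaling weights that yield the "axis-parallel ellipse" are omitted): for a candidate point t,

    DD(t) \propto \prod_i \Pr\bigl(t \mid B_i^{+}\bigr) \prod_i \Pr\bigl(t \mid B_i^{-}\bigr),
    \qquad
    \Pr\bigl(t \mid B_i^{+}\bigr) = 1 - \prod_j \Bigl(1 - e^{-\|B_{ij}^{+} - t\|^{2}}\Bigr),
    \qquad
    \Pr\bigl(t \mid B_i^{-}\bigr) = \prod_j \Bigl(1 - e^{-\|B_{ij}^{-} - t\|^{2}}\Bigr).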
Multiple-Instance Learning Models
EM-DD Algorithm [Zhang and Goldman, NIPS 2001]
[Figure: instances from positive (*) and negative (o) bags in the x1-x2 plane]
The diverse density is
approximated by the “most
likely” instance in each bag
Finding an “axis-parallel
ellipse” with high diverse
density
Sensitive to noise
Cannot learn complex
concepts
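The approximation named above, as I understand it from the EM-DD paper: the noisy-OR over a bag's instances is replaced by the bag's single most likely instance,

    \Pr\bigl(t \mid B_i\bigr) \approx \max_j e^{-\|B_{ij} - t\|^{2}},

and the algorithm alternates between picking that instance for every bag (E-step) and maximizing diverse density with the picks held fixed (M-step).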
Multiple-Instance Learning Models
DD-SVM Algorithm [Chen and Wang, JMLR 2004]
[Figure: instances from positive (*) and negative (o) bags in the x1-x2 plane with two learned instance prototypes z1 and z2; bags are re-plotted in the (z1, z2) feature space defined by the prototypes]
Sensitive to noise
Computational cost
Instance classification
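As I recall the DD-SVM construction (an assumption to be checked against the cited paper): instance prototypes z_1, ..., z_k are learned as local maximizers of diverse density, each bag is embedded as a fixed-length vector of its closest-instance distances to those prototypes, and a standard SVM is trained on the embedded bags:

    \phi(B_i) = \Bigl[\ \min_j \|x_{ij} - z_1\|,\ \ldots,\ \min_j \|x_{ij} - z_k\|\ \Bigr]^{\top}.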
Outline
An overview of multiple-instance learning
Multiple-instance learning via embedded
instance selection
Motivation
[Figure: instances x_i, x_j, x_k drawn from three distributions N1, N2, and N3]
A bag is positive if it contains instances from at least two different distributions among N1, N2, and N3
20 positive bags, 20 negative bags
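A sketch of how such synthetic bags could be generated (my own illustration; the exact distributions, bag sizes, and noise level behind the talk's example are not given here):

    import numpy as np

    rng = np.random.default_rng(0)
    # Three 2-D Gaussian components standing in for N1, N2, N3.
    centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])

    def make_bag(positive, n_instances=8):
        if positive:
            comps = rng.integers(0, 3, size=n_instances)
            comps[0], comps[1] = 0, 1          # force at least two distinct distributions
        else:
            comps = np.full(n_instances, rng.integers(0, 3))   # all from one distribution
        return centers[comps] + rng.normal(scale=0.5, size=(n_instances, 2))

    pos_bags = [make_bag(True) for _ in range(20)]
    neg_bags = [make_bag(False) for _ in range(20)]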
Motivation
Embedding of bags
Bags can be
separated by a
hyperplane
Find the “right”
embedding and the
classifier
20 positive bags and 20 negative bags in the new feature space
MILES: Multiple-Instance Learning
via Embedded Instance Selection
Instance-based feature mapping
Joint feature selection and classification by minimizing a regularized training error with a 1-norm SVM: the regularizer is the 1-norm of w and the training error is measured by the hinge loss function (sketched below)
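One common way to write the two ingredients named above, following the published MILES formulation as I recall it (the notation is mine and should be checked against the paper): a bag B_i is mapped to its similarities against every instance x^k in the training set,

    m(B_i) = \bigl[\, s(x^1, B_i),\ s(x^2, B_i),\ \ldots,\ s(x^n, B_i) \,\bigr]^{\top},
    \qquad
    s(x^k, B_i) = \max_j \exp\!\Bigl(-\tfrac{\|x_{ij} - x^k\|^{2}}{\sigma^{2}}\Bigr),

and a 1-norm SVM is trained on the embedded bags,

    \min_{w,\, b}\ \ \lambda \|w\|_1 \; + \; \sum_i \max\bigl(0,\ 1 - y_i\,(w^{\top} m(B_i) + b)\bigr).

Because the 1-norm penalty drives most coefficients of w to exactly zero, the surviving coordinates identify the selected instances, which is the sense in which instance selection is embedded in the classifier.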
MILES: Multiple-Instance Learning
via Embedded Instance Selection
Instance classification
Outline
An overview of multiple-instance learning
Multiple-instance learning via embedded
instance selection
Experimental results and discussions
Drug Activity Prediction
MUSK1 and MUSK2 benchmark data
sets
A bag represents a molecule
An instance represents a low-energy
conformation of the molecule (166 features)
            # of bags    # of instances/bag    # of positive bags
MUSK1           92              5.17                  47
MUSK2          102             64.69                  39
Prediction Accuracy
Region-Based Image Categorization
COREL data set, 20 image categories,
each containing 100 images
Africa
Buildings
Dinosaurs
Flowers
Mountains
Beach
Buses
Elephants
Horses
Food
Dogs
Lizard
Fashion
Sunsets
Cars
Waterfall
Antiques
Battle ships
Skiing
Dessert
Sample Images
Confusion Matrix
Misclassified Images
Performance Comparison
Average classification accuracy
Sensitivity to Labeling Noise
Object Class Recognition
Caltech data set
Airplanes (800 images)
Cars (800 images)
Faces (435 images)
Motorbikes (800 images)
Background (2270 images)
Salient region detector [Kadir and Brady, IJCV, 2001]
Sample Images
Performance Comparison
True positive rate at the equal-error-rate point on the ROC curve
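For clarity, the equal-error-rate point is the operating point at which the false positive rate equals the false negative rate, i.e. FPR = 1 - TPR. A rough way to read such a number off a sampled ROC curve (hypothetical arrays, not the code behind these results):

    import numpy as np

    def tpr_at_eer(fpr, tpr):
        # fpr, tpr: the ROC curve sampled over a sweep of the decision threshold.
        # Pick the sample where FPR is closest to 1 - TPR (the equal-error point).
        idx = np.argmin(np.abs(fpr - (1.0 - tpr)))
        return tpr[idx]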
Selected Features
Patches related to positive features for object class 'Airplanes': among 6821 features, 96 were selected as positive, 97 were selected as negative, and 16 features were false positives.
Patches related to positive features for object class 'Cars': among 10441 features, 97 were selected as positive, 98 were selected as negative, and 9 features were false positives.
Selected Features
Patches related to positive features for object class 'Faces': among 6997 features, 42 were selected as positive, 27 were selected as negative, and 14 features were false positives.
Patches related to positive features for object class 'Motorbikes': among 9995 features, 101 were selected as positive, 90 were selected as negative, and 3 features were false positives.
Instance Classification
Instance Classification
Computation Time
Training time over 10 folds (500 images, 'Cars' category)
Outline
An overview of multiple-instance learning
Multiple-instance learning via embedded
instance selection
Experimental results and discussions
Conclusions and future work
Summary
MILES
Instance-based feature mapping
Joint feature selection and classification
using 1-norm SVM
Instance classification
Competitive performance in terms of
accuracy, speed, and robustness to labeling
noise
Future Work
Storage requirement
A data matrix of size (number of bags) × (total number of training instances)
Sparseness
Constraints on instances
Model the spatial relationship among
instances
Learning parts in a 1-class setting
Supported by
Louisiana Board of Regents RCS Grant
NASA EPSCoR DART Grant
NSF EPSCoR Pilot Fund
University of New Orleans
The Research Institute for Children
Acknowledgement
Jinbo Bi, Siemens Medical Solutions
Ya Zhang, University of Kansas
Timor Kadir
Rob Fergus
More Information
Papers in PDF, demonstrations, data
sets, etc.
http://www.cs.uno.edu/~yixin
[email protected]