Multiple-Instance Learning via
Embedded Instance Selection
Yixin Chen
Department of Computer Science
University of New Orleans
http://www.cs.uno.edu/~yixin
Wayne State University, 1/31/2006
Outline

An overview of multiple-instance learning

Multiple-instance learning via embedded
instance selection

Experimental results and discussions

Conclusions and future work
Supervised Learning
Sample: a fixed-length vector of attribute values
(usually called a “feature vector”)
Sample_i -> Unknown Process -> Result_i
Result = f(Sample)
Classification problem: if Result is discrete or categorical
Regression problem: if Result is continuous
Multiple-Instance Learning Problem
Each instance is a fixed-length feature vector
Sample_i = Bag_i = {Instance_1, Instance_2, Instance_3, ..., Instance_n}
Result_i = Label_i (discrete or continuous)
Bag_i -> Unknown Process -> Label_i
A label is associated with a bag, not the instances in the bag
Drug Activity Prediction

Predict whether a candidate drug molecule will bind strongly to a target protein
Binding strength is largely determined by the shape of drug molecules

Bag: a molecule
Instance: a shape conformation
Goal: predict whether a molecule binds to a protein of interest, and find the conformation that binds

[Figure: Different conformations a butane molecule (C4H10) can take on. The molecule can rotate about the bond between the two central carbon atoms. (© 1998 by Oded Maron)]
Object Recognition

Bag: an image

Instance: a salient region in an image

Goal: predict whether an image contains
an object, and identify the regions
corresponding to the object
Multiple-Instance Learning Models

A bag is positive if and only if it contains
at least one positive instance
Axis-Parallel Rectangles Algorithm (APR) [Dietterich, et al., AI 1997]
[Figure: instances of numbered bags (marked * and o) plotted on axes x1 and x2, with the minimal APR]
But there may not exist an APR that contains at least one instance from each positive bag and no instance from any negative bags
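The bag-labeling rule and the APR condition stated above can be sketched in a few lines of Python. This is only an illustration of the two definitions, not the APR search procedure of Dietterich et al.; the function names and the toy bags are assumptions made for this example.

```python
# Hypothetical helpers; names and toy data are illustrative assumptions.

def bag_label(instance_labels):
    """Standard MI assumption: a bag is positive iff any instance is positive."""
    return any(instance_labels)

def in_apr(x, lower, upper):
    """True if point x lies inside the axis-parallel rectangle [lower, upper]."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))

def apr_is_consistent(lower, upper, positive_bags, negative_bags):
    """Check that the APR covers at least one instance of every positive bag
    and no instance of any negative bag."""
    covers_pos = all(
        any(in_apr(x, lower, upper) for x in bag) for bag in positive_bags
    )
    avoids_neg = not any(
        in_apr(x, lower, upper) for bag in negative_bags for x in bag
    )
    return covers_pos and avoids_neg

# Toy 2-D bags (assumed data): each bag is a list of (x1, x2) instances.
positive_bags = [[(1.0, 1.0), (4.0, 4.0)], [(1.2, 0.8), (0.0, 5.0)]]
negative_bags = [[(4.1, 3.9), (3.0, 0.0)]]

ok = apr_is_consistent((0.5, 0.5), (1.5, 1.5), positive_bags, negative_bags)
too_big = apr_is_consistent((0.0, 0.0), (5.0, 5.0), positive_bags, negative_bags)
```

Here the small rectangle satisfies the condition, while the large one fails because it also covers negative instances. With noisy labels, no rectangle may pass this check at all, which is the limitation noted on the slide.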
Multiple-Instance Learning Models
Diverse Density Algorithm (DD) [Maron and Lozano-Pérez, NIPS 1998]
[Figure: instances of numbered bags (marked * and o) plotted on axes x1 and x2]
The diverse density at a location is high if the location is close to instances from different positive bags and is far away from all instances in negative bags
Searching for an “axis-parallel ellipse” with high diverse density
Sensitive to noise
High computational cost
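A minimal sketch of how a diverse density value might be computed at a candidate location. It assumes the noisy-or formulation from the Maron and Lozano-Pérez paper with unit feature scaling; the slide does not show the formula, so the exact form, the function names, and the toy bags are assumptions.

```python
import math

def instance_prob(x, t):
    """Assumed model: probability that instance x belongs to the concept at t."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)))

def diverse_density(t, positive_bags, negative_bags):
    """Noisy-or diverse density at location t (assumed formulation)."""
    dd = 1.0
    for bag in positive_bags:
        # A positive bag supports t if at least one of its instances is near t.
        dd *= 1.0 - math.prod(1.0 - instance_prob(x, t) for x in bag)
    for bag in negative_bags:
        # A negative bag supports t only if all of its instances are far from t.
        dd *= math.prod(1.0 - instance_prob(x, t) for x in bag)
    return dd

# Toy 2-D bags (assumed data).
positive_bags = [[(1.0, 1.0), (4.0, 0.0)], [(1.1, 0.9), (0.0, 4.0)]]
negative_bags = [[(4.0, 0.1), (3.0, 3.0)]]

dd_near = diverse_density((1.0, 1.0), positive_bags, negative_bags)
dd_far = diverse_density((4.0, 0.0), positive_bags, negative_bags)
```

The location (1, 1) is close to an instance from each positive bag and far from every negative instance, so its value is much larger than at (4, 0). The product form also shows where the noise sensitivity comes from: a single negative instance near t drives the whole product toward zero.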
Multiple-Instance Learning Models
EM-DD Algorithm [Zhang and Goldman, NIPS 2001]
[Figure: instances of numbered bags (marked * and o) plotted on axes x1 and x2, with the labels “eyes”, “nose”, and “mouth”]
The diverse density is approximated by the “most likely” instance in each bag
Finding an “axis-parallel ellipse” with high diverse density
Sensitive to noise
Cannot learn complex concepts
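The “most likely instance” approximation can be sketched as follows. This shows only the selection step and the resulting single-instance density, not the full EM loop of EM-DD; the Gaussian-like instance model, the function names, and the toy bags are assumptions.

```python
import math

def instance_prob(x, t):
    """Assumed model: probability that instance x belongs to the concept at t."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, t)))

def most_likely_instances(t, bags):
    """In each bag, keep the single instance most likely to match the concept at t."""
    return [max(bag, key=lambda x: instance_prob(x, t)) for bag in bags]

def approx_diverse_density(t, positive_bags, negative_bags):
    """Diverse density with each bag replaced by its most likely instance."""
    dd = 1.0
    for x in most_likely_instances(t, positive_bags):
        dd *= instance_prob(x, t)
    for x in most_likely_instances(t, negative_bags):
        dd *= 1.0 - instance_prob(x, t)
    return dd

# Toy 2-D bags (assumed data).
positive_bags = [[(1.0, 1.0), (4.0, 0.0)], [(1.1, 0.9), (0.0, 4.0)]]
negative_bags = [[(4.0, 0.1), (3.0, 3.0)]]

chosen = most_likely_instances((1.0, 1.0), positive_bags)
dd_near = approx_diverse_density((1.0, 1.0), positive_bags, negative_bags)
dd_far = approx_diverse_density((4.0, 0.0), positive_bags, negative_bags)
```

Each bag contributes one factor instead of a product over all of its instances, which makes each evaluation cheaper but still ties the result to a single target location.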
Multiple-Instance Learning Models
DD-SVM Algorithm [Chen and Wang, JMLR 2004]
[Figure: instances of numbered bags (marked * and o) plotted on axes x1 and x2, with Instance Prototype 1 and Instance Prototype 2 marked; a second plot with axes z1 and z2]
Sensitive to noise
Computational cost
Instance classification
Outline

An overview of multiple-instance learning

Multiple-instance learning via embedded
instance selection
Motivation
[Figure: instances x_i, x_j, x_k and three distributions N1, N2, and N3]
A bag is positive if it contains instances from at least two different distributions among N1, N2, and N3
20 positive bags, 20 negative bags
Motivation

Embedding of bags

Bags can be
separated by a
hyperplane

Find the “right”
embedding and the
classifier
20 positive bags and 20 negative bags in the new feature space
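The embedding idea can be sketched on toy data. The example assumes a Gaussian similarity in which the closest instance of a bag determines the bag's coordinate for each reference instance; the slide shows only the resulting plot, so this similarity, the three reference points, the value of sigma, and all names are assumptions.

```python
import math
import random

def similarity(x, bag, sigma=1.0):
    """Similarity between reference instance x and a bag: the closest instance wins."""
    return max(
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, inst)) / sigma ** 2)
        for inst in bag
    )

def embed(bag, references, sigma=1.0):
    """Map a bag to one coordinate per reference instance."""
    return [similarity(x, bag, sigma) for x in references]

# Assumed toy setup: three well-separated 2-D Gaussians standing in for N1, N2, N3,
# with their centers used as the reference instances.
rng = random.Random(0)
centers = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]

def sample_bag(component_ids, per_component=3):
    """Draw a few instances from each listed component."""
    return [
        (rng.gauss(centers[c][0], 0.3), rng.gauss(centers[c][1], 0.3))
        for c in component_ids
        for _ in range(per_component)
    ]

positive_bag = sample_bag([0, 1])  # instances from two distributions
negative_bag = sample_bag([2])     # instances from a single distribution

pos_vec = embed(positive_bag, centers)
neg_vec = embed(negative_bag, centers)
```

In this space a positive bag has at least two large coordinates and a negative bag has at most one, so a rule such as "the sum of the coordinates exceeds a threshold" separates them, and that rule is a hyperplane.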
MILES: Multiple-Instance Learning
via Embedded Instance Selection

Instance-based feature mapping

Joint feature selection and classification
Minimizing a regularized training error
1-norm SVM
1-norm of w
Hinge loss function
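The ingredients named above (a 1-norm penalty on w plus a hinge loss) combine into a regularized training error of the generic form lam * ||w||_1 + sum_i max(0, 1 - y_i (w . m_i + b)). The sketch below only evaluates that objective; it does not solve the optimization problem. The exact weighting used in MILES is not shown in this transcript, so the single parameter lam, the function names, and the toy feature vectors are assumptions.

```python
def hinge_loss(margin):
    """Hinge loss for a signed margin y * (w . m + b)."""
    return max(0.0, 1.0 - margin)

def one_norm_svm_objective(w, b, features, labels, lam=1.0):
    """lam * ||w||_1 plus the sum of hinge losses over the embedded bags.

    features: one embedded vector per bag; labels: +1 or -1 per bag.
    """
    penalty = lam * sum(abs(wk) for wk in w)
    error = sum(
        hinge_loss(y * (sum(wk * mk for wk, mk in zip(w, m)) + b))
        for m, y in zip(features, labels)
    )
    return penalty + error

def selected_features(w, tol=1e-8):
    """Indices of the features that the classifier keeps: nonzero weights."""
    return [k for k, wk in enumerate(w) if abs(wk) > tol]

# Toy embedded bags (assumed data): one positive, one negative.
features = [[1.0, 0.0, 0.9], [0.1, 0.0, 0.0]]
labels = [1, -1]
w = [2.0, 0.0, 0.0]
b = -1.0

objective = one_norm_svm_objective(w, b, features, labels, lam=0.5)
kept = selected_features(w)
```

Because the penalty is the 1-norm rather than the squared 2-norm, minimizers tend to set many weights exactly to zero. Since each feature corresponds to an instance, the nonzero weights act as the selected instances, which is what makes the feature selection and the classification joint.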
MILES: Multiple-Instance Learning
via Embedded Instance Selection

Instance classification
Outline

An overview of multiple-instance learning

Multiple-instance learning via embedded
instance selection

Experimental results and discussions
Drug Activity Prediction

MUSK1 and MUSK2 benchmark data
sets
A bag represents a molecule
 An instance represents a low-energy
conformation of the molecule (166 features)

Data set   # of bags   # of instances / bag   # of positive bags
Musk 1     92          5.17                   47
Musk 2     102         64.69                  39
Prediction Accuracy
Region-Based Image Categorization

COREL data set, 20 image categories,
each containing 100 images
Africa
Buildings
Dinosaurs
Flowers
Mountains
Beach
Buses
Elephants
Horses
Food
Dogs
Lizard
Fashion
Sunsets
Cars
Waterfall
Antiques
Battle ships
Skiing
Dessert
Sample Images
Confusion Matrix
Misclassified Images
Performance Comparison
Average classification accuracy
Sensitivity to Labeling Noise
Object Class Recognition

Caltech data set
Airplanes (800 images)
 Cars (800 images)
 Faces (435 images)
 Motorbikes (800 images)
 Background (2270 images)

Salient region detector [Kadir and Brady, IJCV, 2001]
Sample Images
Performance Comparison
True positive rate at the equal-error-rates point on the
ROC curve
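The equal-error-rate point is the operating point on the ROC curve where the false positive rate equals the false negative rate (1 minus the true positive rate). A minimal sketch of how the reported quantity could be computed from classifier scores, assuming a simple threshold sweep; the function name and the toy scores are hypothetical and not taken from the experiments.

```python
def tpr_at_equal_error_rate(scores, labels):
    """Return the true positive rate at the threshold where the false positive
    rate is closest to the false negative rate (labels are 1 or 0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_gap, best_tpr = None, None
    for threshold in sorted(set(scores)):
        # Predict positive when the score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        tpr = tp / n_pos
        fpr = fp / n_neg
        gap = abs(fpr - (1.0 - tpr))
        if best_gap is None or gap < best_gap:
            best_gap, best_tpr = gap, tpr
    return best_tpr

# Toy scores (assumed data): higher score means "more likely to contain the object".
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
eer_tpr = tpr_at_equal_error_rate(scores, labels)
```

On this toy data the rates balance at threshold 0.6, where one of four positives is missed and one of four negatives is accepted, so the function returns 0.75.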
Selected Features
Patches related to positive features for object class ‘Airplanes’. Among 6821
features, 96 were selected as positive; 97 were selected as negative; 16
features were false positive.
Patches related to positive features for object class ‘Cars’. Among 10441
features, 97 were selected as positive; 98 were selected as negative; 9
features were false positive.
Selected Features
Patches related to positive features for object class ‘Faces’. Among 6997
features, 42 were selected as positive; 27 were selected as negative; 14
features were false positive.
Patches related to positive features for object class ‘Motorbikes’. Among 9995
features, 101 were selected as positive; 90 were selected as negative; 3
features were false positive.
Instance Classification
Instance Classification
Computation Time
Training time
Over 10 folds
500 images
Cars
Outline

An overview of multiple-instance learning

Multiple-instance learning via embedded
instance selection

Experimental results and discussions

Conclusions and future work
Summary

MILES
Instance-based feature mapping
 Joint feature selection and classification
using 1-norm SVM
 Instance classification


Competitive performance in terms of
accuracy, speed, and robustness to labeling
noise
Future Work

Storage requirement
A data matrix of size
 Sparseness


Constraints on instances


Model the spatial relationship among
instances
Learning parts in a 1-class setting
Supported by
Louisiana Board of Regents RCS Grant
 NASA EPSCoR DART Grant
 NSF EPSCoR Pilot Fund
 University of New Orleans
 The Research Institute for Children

Acknowledgement
Jinbo Bi, Siemens Medical Solutions
 Ya Zhang, University of Kansas
 Timor Kadir
 Rob Fergus

More Information

Papers in PDF, demonstrations, data
sets, etc.
http://www.cs.uno.edu/~yixin