Transcript HTL Slides

Towards Heterogeneous Transfer
Learning
Qiang Yang
Hong Kong University of Science and Technology
Hong Kong, China
http://www.cse.ust.hk/~qyang
1
TL Resources

http://www.cse.ust.hk/TL
2
Learning by Analogy

Learning by Analogy: an important branch of AI

Using knowledge learned in one domain to help improve the learning of another domain
3
Learning by Analogy

Gentner 1983: Structural Correspondence

Mapping between source and target:
• mapping between objects in different domains, e.g., between computers and humans
• mapping can also be between relations, e.g., anti-virus software vs. medicine

Falkenhainer, Forbus, and Gentner (1989)
Structure-Mapping Engine: incremental transfer of knowledge via comparison of two domains

Case-based Reasoning (CBR)
• e.g., CHEF [Hammond, 1986], AI planning of recipes for cooking; HYPO [Ashley, 1991], …
4
Challenges with LBA

Access, Matching and Evaluation:

ACCESS: find similar case candidates
• How to tell similar cases?
• Meaning of 'similarity'?

MATCHING: between source and target domains
• Many possible mappings?
• To map objects, or relations?
• How to decide on the objective functions?

EVALUATION: test transferred knowledge
• How to create an objective hypothesis for the target domain?
• How to evaluate it?

Our problem: in classical LBA, similarity is decided via prior knowledge and the mapping is fixed. How to learn the similarity automatically?
5
Heterogeneous Transfer Learning

[Diagram: given multiple domain data (source domain and target domain, e.g., "Apple is a fruit that can be found …" vs. "Banana is the common name for …"), if the feature spaces are different the setting is heterogeneous (HTL); if they are the same, it is homogeneous.]
6
Cross-language Classification

WWW 2008: X. Ling et al.
Can Chinese Web Pages be Classified with English Data Sources?

Learn a classifier from labeled English Web pages, then use it to classify unlabeled Chinese Web pages.
7
Heterogeneous Transfer Learning:
with a Dictionary

[Bel et al., ECDL 2003]
[Zhu and Wang, ACL 2006]
[Gliozzo and Strapparava, ACL 2006]

TASK: classifying documents in Chinese. Labeled documents in English are abundant; labeled documents in Chinese are scarce. A dictionary bridges the two languages, at the cost of translation error and topic drift.
8
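The dictionary-based pipeline can be sketched in a few lines. The mini-dictionary, vocabulary, and nearest-centroid classifier below are toy stand-ins (not the actual method of Bel et al. or Ling et al.), chosen only to show the translate-then-classify flow and where translation error enters:

```python
import numpy as np

# Toy vocabulary and word-for-word dictionary (hypothetical, for illustration only)
EN_VOCAB = ["apple", "banana", "fruit", "computer", "software"]
ZH_TO_EN = {"苹果": "apple", "香蕉": "banana", "水果": "fruit",
            "电脑": "computer", "软件": "software"}

def bow(words):
    """Bag-of-words vector over EN_VOCAB."""
    v = np.zeros(len(EN_VOCAB))
    for w in words:
        if w in EN_VOCAB:
            v[EN_VOCAB.index(w)] += 1
    return v

# Labeled English documents (abundant): class 0 = food, class 1 = tech
en_docs = [(["apple", "banana", "fruit"], 0),
           (["fruit", "apple"], 0),
           (["computer", "software"], 1),
           (["software", "computer", "computer"], 1)]

# Nearest-centroid classifier trained on the English bags of words
centroids = {}
for label in (0, 1):
    X = np.array([bow(d) for d, y in en_docs if y == label])
    centroids[label] = X.mean(axis=0)

def classify_zh(zh_words):
    """Translate Chinese words via the dictionary, then assign nearest centroid.
    Out-of-dictionary words are simply dropped -- one source of translation
    error and topic drift."""
    translated = [ZH_TO_EN[w] for w in zh_words if w in ZH_TO_EN]
    v = bow(translated)
    return min(centroids, key=lambda c: np.linalg.norm(v - centroids[c]))

print(classify_zh(["苹果", "水果"]))   # → 0 (food-like document)
print(classify_zh(["电脑", "软件"]))   # → 1 (tech-like document)
```

A real system would use a full bilingual lexicon and a stronger classifier; the point is only that all learning happens on the abundant English side.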
Information Bottleneck approach: improvements of over 15% [Ling, Xue, Yang et al., WWW 2008]

Domain Adaptation
9
HTL Setting: Text to Images

Source data: labeled or unlabeled
Target training data: labeled

Training: Text
• Apple: "The apple is the pomaceous fruit of the apple tree, species Malus domestica in the rose family Rosaceae ..."
• Banana: "Banana is the common name for a type of fruit and also the herbaceous plants of the genus Musa which produce this commonly eaten fruit ..."

Testing: Images
10
HTL for Images: 3 Cases

• Source data unlabeled, target data unlabeled → Clustering
• Source data unlabeled, target training data labeled → HTL for Image Classification
• Source data labeled, target training data labeled → Translated Learning: classification

Annotated PLSA Model for Clustering

[Diagram: latent topics Z link image instances in the target data (Caltech 256 data, SIFT image features) with words from the source data and tags from Flickr.com (e.g., Lion, Animal, Simba, Hakuna Matata, FlickrBigCats, …). Heterogeneous Transfer Learning yields an average entropy improvement of 5.7%.]
12
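To illustrate the latent-topic machinery, here is a minimal EM loop for plain PLSA. This is the standard model, not the annotated variant on the slide, and the matrix names and sizes are made up for the sketch:

```python
import numpy as np

def plsa(N, k=2, iters=50, seed=0):
    """Plain PLSA fitted by EM. N is a document-by-word count matrix.
    Returns P(z|d) (docs x topics) and P(w|z) (topics x words)."""
    rng = np.random.default_rng(seed)
    d, w = N.shape
    Pz_d = rng.random((d, k)); Pz_d /= Pz_d.sum(axis=1, keepdims=True)
    Pw_z = rng.random((k, w)); Pw_z /= Pw_z.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities P(z|d,w), shape (d, k, w)
        R = Pz_d[:, :, None] * Pw_z[None, :, :]
        R /= R.sum(axis=1, keepdims=True)
        # M-step: reweight the counts by the responsibilities
        C = N[:, None, :] * R
        Pw_z = C.sum(axis=0)
        Pw_z /= Pw_z.sum(axis=1, keepdims=True)
        Pz_d = C.sum(axis=2)
        Pz_d /= Pz_d.sum(axis=1, keepdims=True)
    return Pz_d, Pw_z
```

Each EM iteration is guaranteed not to decrease the log-likelihood sum(N * log(P(z|d) P(w|z))); the annotated model adds the tag/word observations on top of this machinery.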
"Heterogeneous Transfer Learning for Image Classification"
Y. Zhu, G. Xue, Q. Yang et al., AAAI 2011
13
HTL Setting: Text to Images

Source data: labeled or unlabeled
Target training data: labeled

Training: Text
• Apple: "The apple is the pomaceous fruit of the apple tree, species Malus domestica in the rose family Rosaceae ..."
• Banana: "Banana is the common name for a type of fruit and also the herbaceous plants of the genus Musa which produce this commonly eaten fruit ..."

Testing: Images
14
A Picture is Worth ? Words?
15

Y. Zhu, G. Xue, Q. Yang et al. Heterogeneous transfer learning for image classification. AAAI 2011

Target data: a few labeled images as training samples; testing samples are not available during training.
Unlabeled source data.
16
Social Media Data as a Bridge

The Heterogeneous Transfer Learning Framework:
• Learn a latent representation for auxiliary images using all source data
• Project target images into this latent representation
18
Latent Feature Learning by Collective Matrix Factorization

[Diagram: an image-tag co-occurrence matrix (images × tags such as Olympic, track, country, road, blue, gym) and a document-tag matrix (documents × the same tags) are co-factorized. The latent factors for the tags are shared across the two factorizations. After co-factorization, each image is represented by its latent factors, and image-image similarity is the cosine similarity of these latent vectors (e.g., 0.34 and 0.32 in the illustration).]
19
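The similarity computation in the diagram is just cosine similarity in the latent space; a minimal sketch, with made-up latent vectors rather than values recovered from the figure:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two latent factor vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical latent factors for two images after co-factorization
img_a = np.array([0.3, 0.1, 0.5])
img_b = np.array([0.4, 0.2, 0.4])
print(round(cosine(img_a, img_b), 2))  # → 0.96
```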
Optimization:
Collective Matrix Factorization (CMF)

• G1 – 'image-features'-tag matrix
• G2 – document-tag matrix
• W – words-latent matrix
• U – 'image-features'-latent matrix
• V – tag-latent matrix
• R(U, V, W) – regularization to avoid over-fitting

U gives the latent semantic view of images; V gives the latent semantic view of tags.
20
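The co-factorization step can be sketched as a generic CMF solved by gradient descent, with the shared tag-latent matrix V tying the two factorizations together. This assumes a squared loss and a quadratic regularizer for R(U, V, W); it is a common formulation, not necessarily the paper's exact objective or optimizer:

```python
import numpy as np

def collective_mf(G1, G2, k=3, lam=1.0, reg=0.1, iters=500, lr=0.02, seed=0):
    """Sketch of collective matrix factorization with a shared tag-latent
    matrix V: G1 (image-features x tags) ~ U V^T, G2 (documents x tags) ~ W V^T.
    Minimizes ||G1 - U V^T||^2 + lam ||G2 - W V^T||^2
              + reg (||U||^2 + ||V||^2 + ||W||^2)
    by plain gradient descent."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(G1.shape[0], k))   # latent view of images
    W = rng.normal(scale=0.1, size=(G2.shape[0], k))   # latent view of documents
    V = rng.normal(scale=0.1, size=(G1.shape[1], k))   # shared latent view of tags
    for _ in range(iters):
        E1 = U @ V.T - G1                  # residual of the image-tag fit
        E2 = W @ V.T - G2                  # residual of the document-tag fit
        U -= lr * (E1 @ V + reg * U)
        W -= lr * (lam * E2 @ V + reg * W)
        V -= lr * (E1.T @ U + lam * E2.T @ W + reg * V)
    return U, V, W
```

After fitting, a target image's row of U is its projected latent representation, on which cosine similarity or a standard classifier can be used.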
HTL Algorithm
21
Experiment: # documents

[Plot: accuracy vs. # documents]

To reach 75% accuracy, we need about 100 labeled images; the same accuracy is achieved with 200 text documents. Thus each image ≈ 2 text documents ≈ 1,000 words.
Yes: one image is indeed worth 1,000 words!
22
Experiment: # documents

[Plot: accuracy vs. # documents] When more text documents are used in learning, the accuracy increases.
23
Experiment: # Tagged Images

[Plot: accuracy vs. # tagged images]
24
Experiment: Noise

[Plot: accuracy vs. amount of noise]

We varied the "noise" of the tagged images. When the tagged images are totally irrelevant, our method reduces to PCA, and the Tag baseline, which depends on tagged images, reduces to a pure SVM.
25
Structural Transfer Learning
26
Structural Transfer

Transfer Learning from Minimal Target Data by Mapping across Relational Domains
• Lilyana Mihalkova and Raymond Mooney
• In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), 1163–1168, Pasadena, CA, July 2009.
• "use the short-range clauses in order to find mappings between the relations in the two domains, which are then used to translate the long-range clauses"

Transfer Learning by Structural Analogy
• Huayan Wang and Qiang Yang
• In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, USA, August 2011.
• Find the structural mappings that maximize structural similarity
Structural Transfer [H. Wang and Q. Yang, AAAI 2011]

Goal:
• Learn a correspondence structure between domains
• Use the correspondence to transfer knowledge

[Diagram: mapping kinship relations between English and Chinese (汉语): mother ↔ 母亲, father ↔ 父亲, son ↔ 儿子, daughter ↔ 女儿]
28
Transfer Learning by Structural Analogy

Algorithm Overview
1. Select top W features from both domains respectively (Song 2007).
2. Find the permutation (analogy) that maximizes their structural dependency.
   • Iteratively solve a linear assignment problem (Quadrianto 2009).
   • Structural dependency is maximal when structural similarity is largest by some dependence criterion (e.g., HSIC, see next…).
3. Transfer the learned classifier from the source domain to the target domain via the analogous features.

Structural Dependency: ?
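Step 2 can be illustrated by a tiny brute-force version of the matching: instead of the iterative linear-assignment solver (Quadrianto 2009), we enumerate all permutations of a handful of features and score each with a centered kernel-alignment (HSIC-style) objective. The function name and the exhaustive search are illustrative only:

```python
import numpy as np
from itertools import permutations

def match_features(Ks, Kt):
    """Find the permutation of target features that maximizes an HSIC-style
    dependency tr(H Ks H * Kt_perm) between the two feature kernel matrices.
    Brute force over permutations: only feasible for a tiny number of features."""
    n = Ks.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    A = H @ Ks @ H                          # centered source kernel
    best_score, best_pi = -np.inf, None
    for pi in permutations(range(n)):
        p = list(pi)
        score = np.trace(A @ Kt[np.ix_(p, p)])
        if score > best_score:
            best_score, best_pi = score, p
    return best_pi, best_score

# Sanity check: shuffling a kernel and matching should recover the alignment
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                 # 5 features, each with an 8-sample "profile"
Ks = X @ X.T                                # source feature kernel
p0 = [2, 0, 4, 1, 3]
Kt = Ks[np.ix_(p0, p0)]                     # target = shuffled copy of the source
pi, _ = match_features(Ks, Kt)
print([p0[i] for i in pi])                  # → [0, 1, 2, 3, 4]
```

The real algorithm replaces the enumeration with an iteratively solved linear assignment problem, which scales to many features.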
Transfer Learning by Structural Analogy

Hilbert-Schmidt Independence Criterion (HSIC) (Gretton 2005, 2007; Smola 2007)
• Estimates the "structural" dependency between two sets of features.
• The estimator (Song 2007) takes only kernel matrices as input; intuitively, it only cares about the mutual relations (structure) among the objects (features, in our case).
• We compute the kernel matrix by taking the inner product between the "profiles" of two features over the dataset.

[Diagram: cross-domain feature correspondence along the feature dimension]
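A minimal version of the (biased) empirical HSIC computed from two kernel matrices, with kernels built as inner products of feature "profiles". The data here are synthetic and only meant to show that dependent features score higher than independent ones:

```python
import numpy as np

def hsic(K, L):
    """Biased empirical HSIC from two n x n kernel matrices (Gretton 2005)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

# Kernels from feature "profiles" over the dataset, via inner products
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = 2 * x + 0.1 * rng.normal(size=(200, 1))  # strongly dependent on x
y_ind = rng.normal(size=(200, 1))                # independent of x
K = x @ x.T
print(hsic(K, y_dep @ y_dep.T) > hsic(K, y_ind @ y_ind.T))  # → True
```

With linear kernels, as above, HSIC reduces to a (squared) cross-covariance measure; richer kernels capture nonlinear dependency as well.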
Transfer Learning by Structural Analogy

Ohsumed Dataset
• Source: 2 classes from the dataset; no labels in the target dataset
• A linear SVM classifier trained on the source domain achieves 80.5% accuracy on the target domain
• More tests in the table (and paper)
Conclusions and Future Work

Transfer Learning
• Instance based
• Feature based
• Model based

Heterogeneous Transfer Learning
• Translator: Translated Learning
• No Translator: Structural Transfer Learning

Challenges
32
References

http://www.cse.ust.hk/~qyang/publications.html

• Huayan Wang and Qiang Yang. Transfer Learning by Structural Analogy. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, USA, August 2011.
• Yin Zhu, Yuqiang Chen, Zhongqi Lu, Sinno J. Pan, Gui-Rong Xue, Yong Yu and Qiang Yang. Heterogeneous Transfer Learning for Image Classification. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, USA, August 2011.
• Qiang Yang, Yuqiang Chen, Gui-Rong Xue, Wenyuan Dai and Yong Yu. Heterogeneous Transfer Learning for Image Clustering via the Social Web. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP (ACL-IJCNLP'09), Singapore, August 2009, pages 1–9. Invited Paper.
• Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. Translated Learning. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008), December 2008, Vancouver, British Columbia, Canada.

Harbin 2011
33