
Selected Applications of Transfer
Learning
Qiang Yang (杨强)
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Hong Kong
http://www.cse.ust.hk/~qyang
1
Case 1: 目标变化 目标迁移

Target Class Changes  Target
Transfer Learning




Training: 2-class problem
Testing: 10-class problem
Traditional methods fail
Solution: find out what does not change
between training and testing
2
Our Work

Cross-Domain Learning
- TrAdaBoost (ICML 2007)
- Co-Clustering based Classification (SIGKDD 2007)
- TPLSA (SIGIR 2008)
- NBTC (AAAI 2007)

Translated Learning
- Cross-lingual classification (WWW 2008)
- Cross-media classification (NIPS 2008)

Unsupervised Transfer Learning
- Self-taught clustering (ICML 2008)
3
Our Work (cont.)

Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. Translated Learning. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008), Vancouver, British Columbia, Canada, December 8, 2008.
Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Cross-Domain Spectral Learning. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM KDD 2008), Las Vegas, Nevada, USA, August 24-27, 2008, pages 488-496.
Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Self-taught Clustering. In Proceedings of the 25th International Conference on Machine Learning (ICML 2008), Helsinki, Finland, July 5-9, 2008, pages 200-207.
Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Boosting for Transfer Learning. In Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007, pages 193-200.
Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Co-clustering based Classification for Out-of-domain Documents. In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM KDD 2007), San Jose, California, USA, August 12-15, 2007, pages 210-219.
Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen. Building Bridges for Web Query Classification. In Proceedings of the 29th ACM International Conference on Research and Development in Information Retrieval (ACM SIGIR 2006), Seattle, USA, August 6-11, 2006, pages 131-138.
4
Query Classification and
Online Advertisement

- ACM KDDCUP 05 Winner
- SIGIR 06
- ACM Transactions on Information Systems Journal, 2006

Joint work with Dou Shen, Jian-Tao Sun, and Zheng Chen
5
QC as Machine Learning
Inspired by the KDDCUP'05 competition

- Classify a query into a ranked list of categories
- Queries are collected from real search engines
- Target categories are organized in a tree, with each node being a category
6
Related Works

Document/Query Expansion
- Borrow text from an extra data source:
  - Using hyperlinks [Glover 2002]
  - Using implicit links from query logs [Shen 2006]
  - Using existing taxonomies [Gabrilovich 2005]
  - Query expansion [Manning 2007]
- Global methods: independent of the queries
- Local methods: using relevance feedback or pseudo-relevance feedback

Query Classification/Clustering
- Classify Web queries by geographical locality [Gravano 2003]
- Classify queries according to their functional types [Kang 2003]
- Beitzel et al. studied topical classification as we do, but with manually classified data [Beitzel 2005]
- Beeferman and Wen each worked on query clustering using clickthrough data [Beeferman 2000; Wen 2001]
7
Target-transfer Learning in QC

Classifier, once trained, stays constant

Target classes before:
- Sports, Politics (European, US, China)

Target classes now:
- Sports (Olympics, Football, NBA), Stock Market (Asian, Dow, Nasdaq), History (Chinese, World)

How to allow the target to change?

Application:
- Advertisements come and go,
- but our query→target mapping need not be retrained!

We call this the target-transfer learning problem
8
Solutions: Query Enrichment
+ Staged Classification

Solution: Bridging classifier

[Flowchart, Phase I (the training phase): queries and target categories are sent to a search engine; the labels of the returned pages drive the construction of synonym-based classifiers, and the text of the returned pages drives the construction of a statistical classifier. Phase II (the testing phase): a query is passed through both classifiers, and their classified results are combined into the final result.]
9
Step 1: Query enrichment

- Textual information: title, snippet, full text of the returned pages
- Category information: category of the returned pages
10
Step 2: Bridging Classifier

Wish to avoid:
- When the target is changed, training needs to be repeated!

Solution:
- Connect the target taxonomy and queries by taking an intermediate taxonomy as a bridge
11
Bridging Classifier (Cont.)

How to connect? Score each target category C_i^T for a query q through the intermediate categories C_j^I:

  p(C_i^T | q) ∝ Σ_j p(C_i^T | C_j^I) · p(q | C_j^I) · p(C_j^I)

where p(C_i^T | C_j^I) captures the relation between C_i^T and C_j^I, p(q | C_j^I) the relation between q and C_j^I, and p(C_j^I) the prior probability of C_j^I.
12
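The bridging score can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the probability tables are made-up numbers, and the function names are ours. The point is that the query only ever touches the intermediate taxonomy, so swapping in a new target taxonomy only changes the p(C_T | C_I) table.

```python
def bridging_score(p_t_given_i, p_q_given_i, p_i):
    """p(C_T | q) up to a constant: sum_j p(C_T|C_I_j) * p(q|C_I_j) * p(C_I_j)."""
    return sum(t * q * pi for t, q, pi in zip(p_t_given_i, p_q_given_i, p_i))

def classify(p_q_given_i, target_given_intermediate, prior_i):
    """Rank target categories for one query; normalize scores to sum to 1."""
    scores = {t: bridging_score(rel, p_q_given_i, prior_i)
              for t, rel in target_given_intermediate.items()}
    z = sum(scores.values()) or 1.0
    return sorted(((t, s / z) for t, s in scores.items()), key=lambda ts: -ts[1])

# Toy example: 3 intermediate categories, 2 target categories.
p_q_given_i = [0.6, 0.3, 0.1]                 # relation between q and C_I_j
prior_i = [0.5, 0.3, 0.2]                     # prior prob. of C_I_j
p_t_given_i = {"Sports":  [0.8, 0.1, 0.1],    # relation between C_T_i and C_I_j
               "Finance": [0.1, 0.7, 0.2]}
ranking = classify(p_q_given_i, p_t_given_i, prior_i)
```

Retraining for a new target taxonomy amounts to rebuilding `p_t_given_i` only; the query-side quantities are untouched.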
Category Selection for
Intermediate Taxonomy

Category selection for reducing complexity:
- Total Probability (TP)
- Mutual Information (MI)
13
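A minimal sketch of mutual-information-based selection, assuming the standard MI definition between an intermediate category and the target labels (the slide does not give the exact formula, so the scoring function and the toy joint distribution below are our assumptions): keep only the intermediate categories that carry the most information about the target classes.

```python
import math

def mi_score(joint_row, p_i_j, p_t):
    """MI contribution of one intermediate category C_I_j:
    sum_i p(C_T_i, C_I_j) * log( p(C_T_i, C_I_j) / (p(C_T_i) * p(C_I_j)) )."""
    return sum(pj * math.log(pj / (pt * p_i_j))
               for pj, pt in zip(joint_row, p_t) if pj > 0)

def select_top_k(joint, p_i, p_t, k):
    """joint[j][i] = p(C_T_i, C_I_j); keep the k intermediate categories
    with the highest MI against the target taxonomy."""
    ranked = sorted(range(len(p_i)),
                    key=lambda j: mi_score(joint[j], p_i[j], p_t), reverse=True)
    return ranked[:k]

# Toy joint distribution over 2 target and 3 intermediate categories.
p_t = [0.5, 0.5]
joint = [[0.40, 0.10],   # C_I_0: informative about C_T_0
         [0.05, 0.05],   # C_I_1: carries no information about the target
         [0.05, 0.35]]   # C_I_2: informative about C_T_1
p_i = [sum(row) for row in joint]
kept = select_top_k(joint, p_i, p_t, k=2)
```

The uninformative category (whose joint distribution factorizes) scores zero and is pruned, which is exactly the complexity reduction the slide is after.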
Experiment
─ Data Sets & Evaluation

ACM KDDCUP
- Started in 1997, ACM KDDCup is the leading Data Mining and Knowledge Discovery competition in the world, organized by ACM SIGKDD.

ACM KDDCUP 2005
- Task: categorize 800K search queries into 67 categories
- Three awards: (1) Performance Award; (2) Precision Award; (3) Creativity Award
- Participation: 142 registered groups; 37 solutions submitted from 32 teams
- Evaluation data: 800 queries randomly selected from the 800K query set; 3 human labelers labeled the entire evaluation query set
- Evaluation measurements: Precision and Performance (F1)

We won all three. Overall F1 = (1/3) Σ_{i=1}^{3} (F1 against human labeler i)
14
Result of Bridging Classifiers

- Performance of the bridging classifier with different granularity of the intermediate taxonomy
- Using the bridging classifier allows the target classes to change freely → no need to retrain the classifier!
15
Summary: Target-Transfer Learning

[Diagram: Query → (classify to) → Intermediate Class → (similarity) → Target Class]
16
Cross-Domain Learning

[Diagram: Input → Learning → Output]
17
Case 1

Source:
- Many labeled instances

Target:
- Few labeled instances

Target and source domains:
- Same feature representation
- Same classes Y (binary classes)
- Different P(X,Y) distribution
18
TrAdaBoost = Transfer AdaBoost

Given:
- Insufficient labeled data from the target domain (primary data)
- Labeled data following a different distribution (auxiliary data)

The auxiliary data are weaker evidence for building the classifier.

[Diagram: uniform initial weights over the source + target training data]
19
TrAdaBoost = Transfer AdaBoost (cont.)

Misclassified examples:
- increase the weights of the misclassified target data
- decrease the weights of the misclassified source data
20
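One round of this weight update can be sketched as below. The two multipliers are assumed from the TrAdaBoost formulation in Dai et al. (ICML 2007): misclassified source examples are shrunk by a fixed β < 1, while misclassified target examples are boosted by the usual AdaBoost factor; the function and variable names are ours.

```python
import math

def update_weights(w_src, w_tgt, err_src, err_tgt, eps_t, n_src, n_rounds):
    """One TrAdaBoost round. err_* are 0/1 misclassification indicators per
    example; eps_t is the weighted error on the target data, assumed in (0, 0.5)."""
    # Fixed shrink factor for auxiliary (source) data, beta < 1.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_src) / n_rounds))
    # AdaBoost-style factor for target data, beta_t < 1 when eps_t < 0.5.
    beta_tgt = eps_t / (1.0 - eps_t)
    new_src = [w * (beta_src ** e) for w, e in zip(w_src, err_src)]   # shrink
    new_tgt = [w * (beta_tgt ** -e) for w, e in zip(w_tgt, err_tgt)]  # grow
    return new_src, new_tgt

# Two source and two target examples; the first of each is misclassified.
w_src, w_tgt = [1.0, 1.0], [1.0, 1.0]
ns, nt = update_weights(w_src, w_tgt, [1, 0], [1, 0],
                        eps_t=0.2, n_src=2, n_rounds=10)
```

After the update, the misclassified source example carries less weight and the misclassified target example carries more, which is how the auxiliary data's influence decays over rounds.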
TrAdaBoost = Transfer AdaBoost (cont.)

- Performance
21
Transfer Learning in Sensor Network
Tracking

Received-Signal-Strength (RSS) based localization in an indoor WiFi environment.

[Diagram: a mobile device hears three access points at -30dBm, -70dBm, and -40dBm. Where is the mobile device? (location_x, location_y)]
22
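As a toy illustration of RSS-based localization (not the paper's model): match the observed RSS vector against an offline "radio map" of fingerprints by nearest neighbour. The coordinates and dBm values below are invented.

```python
def nearest_location(radio_map, rss):
    """radio_map: {(x, y): [rss_ap1, rss_ap2, rss_ap3]} built in the offline
    phase; return the grid location whose fingerprint is closest to rss."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(radio_map, key=lambda loc: dist2(radio_map[loc], rss))

# Tiny offline radio map over three grid points.
radio_map = {(0, 0): [-30, -70, -40],
             (5, 0): [-55, -50, -60],
             (0, 5): [-45, -35, -75]}
loc = nearest_location(radio_map, [-32, -68, -42])  # device near (0, 0)
```

This also makes the next slide's problem concrete: if signal propagation drifts over time, the stored fingerprints no longer match fresh readings, and the radio map goes stale.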
Distribution Changes

- The mapping function f learned in the offline phase can be out of date.
- Recollecting the WiFi data is very expensive.
- How to adapt the model?

[Timeline: night-time period t0 → day-time period t1]
23
Transfer Learning in Wireless
Sensor Networks

- Transfer across time
- Transfer across space
- Transfer across devices
24
Latent Space based Transfer Learning
(Spatial Transfer)
Transfer Localization Models across Space [Pan, Yang et al. AAAI 08]

Given:
- Some labeled data collected in Area A, and unlabeled data in Area B
- Only a few labeled data collected in Area B

Want to:
- Construct a localization model for the whole area (Area A and Area B)
25
Transfer across time

Compared methods:
- LeMan: static mapping function learnt from offline data
- LeMan2: relearn the mapping function from a few online data
- LeMan3: combine offline and online data into one training set to learn the mapping function

Setup:
- Area: 30 x 40 (81 grids)
- Six time periods: 12:30am-01:30am, 08:30am-09:30am, 12:30pm-01:30pm, 04:30pm-05:30pm, 08:30pm-09:30pm, 10:30pm-11:30pm
26
Transfer knowledge via latent
manifold learning

[Diagram: labeled WiFi data from two settings are mapped into a shared latent manifold, through which knowledge is propagated.]
27
VIP Recommendation in Tencent Weibo

Knowledge transfer from friendship relations in Tencent QQ, which is the largest instant-messenger network.

Properties:
1. Data sparsity: limited neighbors for most users
2. Heterogeneous links: symmetric friendship vs. asymmetric following
3. Large data: 1 billion users and tens of billions of links
28
Social Relation based Transfer (SORT)

VIP recommendation based on one's:
1. X: friendship on QQ
2. S1: user following relations on Tencent Weibo
3. S2: VIP following relations on Tencent Weibo

Other Applications

Social App Recommendation in Tencent Qzone
- Qzone (http://qzone.qq.com) is the largest social network in China.

Video Recommendation in Tencent Video
- Rating prediction with four types of auxiliary data:
  1. binary ratings
  2. social networks
  3. context
  4. video content
30
Activity Recognition

With sensor data collected on mobile devices:
- Location: from GPS, WiFi, RFID, Bluetooth, etc.
- Context: location, weather, etc.

Various models can be used:
- Non-sequential models: Naïve Bayes, SVM, ...
- Sequential models: HMM, CRF, ...
Activity Recognition: Input &
Output (Vincent Zheng, A*STAR Singapore)

Input:
- Context and locations: time, history, current/previous locations, duration, speed, object usage information
- Trained AR model: training data from calibration (calibration tool: VTrack, http://www.cse.ust.hk/~vincentz/Vtrack.html)

Output:
- Predicted activity labels: running? walking? tooth brushing? having lunch?
32
Datasets: MIT PlaceLab
http://architecture.mit.edu/house_n/placelab.html

- MIT PlaceLab Dataset (PLIA2) [Intille et al. Pervasive 2005]
- Activities: common household activities
33
Cross Domain Activity Recognition
[Zheng, Hu, Yang, Ubicomp 2009]

Challenges:
- A new domain of activities (e.g. indoor cleaning, laundry, dishwashing) without labeled data

Cross-domain activity recognition:
- Transfer some available labeled data from source activities to help train the recognizer for the target activities.
34
How to use the similarities?

[Pipeline: source-domain labeled data <sensor reading, activity name>, e.g. <SS, "Make Coffee">, plus a similarity measure mined from the Web, e.g. sim("Make Coffee", "Make Tea") = 0.6, yield target-domain pseudo-labeled data, e.g. <SS, "Make Tea", 0.6>, which train a weighted SVM classifier.]
35
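The pseudo-labelling step of this pipeline can be sketched as follows. This is an illustrative reading of the slide, with invented names: each labelled source example <features, source activity> is turned into weighted target examples <features, target activity, sim(source, target)>, and the weights would then feed a weighted SVM.

```python
def make_pseudo_data(source_data, target_activities, sim):
    """Turn labelled source examples into weighted pseudo-labelled target
    examples using mined activity-name similarities."""
    pseudo = []
    for features, src_act in source_data:
        for tgt_act in target_activities:
            w = sim.get((src_act, tgt_act), 0.0)
            if w > 0:
                # The weight w later serves as the instance weight in a
                # weighted SVM (not implemented here).
                pseudo.append((features, tgt_act, w))
    return pseudo

# Toy example mirroring the slide: sim("make coffee", "make tea") = 0.6.
sim = {("make coffee", "make tea"): 0.6}
source = [([0.2, 0.9], "make coffee")]
pseudo = make_pseudo_data(source, ["make tea", "vacuuming"], sim)
```

Target activities with zero mined similarity contribute no pseudo examples, so only plausibly related source activities influence the target recognizer.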
Calculating Activity Similarities

How similar are two activities?
- Use Web search results
- TF-IDF: traditional IR similarity metric (cosine similarity)

Example:
- Mined similarity between the activity "sweeping" and the activities "vacuuming", "making the bed", and "gardening"
36
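A minimal sketch of the similarity computation: cosine similarity between term-frequency vectors of the text retrieved for two activity names. The snippets below are invented stand-ins for Web search results, and a real system would use TF-IDF over many result pages rather than raw term counts over one snippet.

```python
import math
from collections import Counter

def cosine(text_a, text_b):
    """Cosine similarity between bag-of-words term-frequency vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Imagined search snippets for three activity names.
snippet_sweep = "sweep the floor with a broom to remove dust"
snippet_vacuum = "vacuum the floor to remove dust and dirt"
snippet_garden = "plant flowers and water the garden"

s_vacuum = cosine(snippet_sweep, snippet_vacuum)   # related activities
s_garden = cosine(snippet_sweep, snippet_garden)   # unrelated activities
```

As the slide suggests, "sweeping" ends up closer to "vacuuming" than to "gardening", which is exactly the signal the pseudo-labelling step relies on.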