Types of Cost in Inductive Concept Learning

Troy Schrader
CIS526
What types of cost are involved in inductive learning?

• Real-world applications involve many different types of cost.
• Most machine learning (ML) literature ignores nearly all of them.
• The one exception is constant error cost.
Why is this important?

• Many ML papers ignore most of these cost types.
• When costs are ignored, ML methods are less effective in real-life situations.
• A taxonomy may help organize the literature on cost-sensitive learning.
• The motivation is to inspire researchers to investigate all types of cost in inductive concept learning in more depth.
Taxonomy

• Cost of Misclassification Errors
• Cost of Tests
• Cost of Teacher
• Cost of Intervention
• Cost of Unwanted Achievements
• Cost of Computation
• Cost of Cases
• Human-Computer Interaction Cost
• Cost of Instability
Constant Error Cost

Error rate (cost matrix: 0 on the diagonal, 1 elsewhere)

        1    2    …    i
   1    0    1    …    1
   2    1    0    …    1
   …    …    …    …    …
   j    1    1    …    0

Accuracy (benefit matrix: 1 on the diagonal, 0 elsewhere)

        1    2    …    i
   1    1    0    …    0
   2    0    1    …    0
   …    …    …    …    …
   j    0    0    …    1
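The two matrices above can be read as weights applied to a confusion matrix: averaging with the 0/1 cost matrix gives the error rate, and averaging with the identity benefit matrix gives the accuracy. The following is a minimal sketch of that reading; the function names and the confusion-matrix counts are made up for illustration and do not come from the paper.

```python
def zero_one_cost(n):
    """Cost matrix with 0 on the diagonal and 1 elsewhere (error rate)."""
    return [[0 if r == c else 1 for c in range(n)] for r in range(n)]


def identity_benefit(n):
    """Benefit matrix with 1 on the diagonal and 0 elsewhere (accuracy)."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def average_weight(confusion, weights):
    """Average the weights over all cases counted in the confusion matrix."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    weighted = sum(
        confusion[r][c] * weights[r][c] for r in range(n) for c in range(n)
    )
    return weighted / total


# Hypothetical counts for a 3-class problem (illustrative only).
confusion = [
    [50, 3, 2],
    [4, 40, 1],
    [1, 2, 47],
]

error_rate = average_weight(confusion, zero_one_cost(3))
accuracy = average_weight(confusion, identity_benefit(3))
print(error_rate, accuracy)
```

Because every case is either on or off the diagonal, the two numbers always sum to 1, which is why minimizing error rate and maximizing accuracy are the same objective.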
Cost of Misclassification Errors

• Constant Error Cost
• Conditional Error Cost, conditional on:
– Individual Case
– Time of Classification
– Classification of Other Cases
– Feature Value
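To make the "Individual Case" variant concrete, here is a small sketch in which the cost of an error depends on the case itself rather than only on the pair of classes. The fraud-detection setting, the `error_cost` function, the fixed false-alarm fee, and all amounts are assumptions chosen for illustration, not material from the paper.

```python
FALSE_ALARM_COST = 5.0  # hypothetical fixed cost of investigating a false alarm


def error_cost(predicted, actual, amount):
    """Cost of one classification, conditional on the individual case.

    Assumption for illustration: a missed fraud costs the full
    transaction amount, while a false alarm costs a fixed fee.
    """
    if predicted == actual:
        return 0.0
    if actual == "fraud":
        return amount
    return FALSE_ALARM_COST


# (predicted class, actual class, transaction amount), all hypothetical
cases = [
    ("legit", "legit", 120.0),
    ("legit", "fraud", 900.0),
    ("fraud", "legit", 40.0),
    ("legit", "fraud", 15.0),
]

total_cost = sum(error_cost(p, a, amt) for p, a, amt in cases)
error_count = sum(1 for p, a, _ in cases if p != a)
print(error_count, total_cost)
```

The three errors contribute very different amounts to the total, which is exactly the information a constant error cost would discard.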
Cost of Tests

• Constant Test Cost
• Conditional Test Cost, conditional on:
– Prior Test Selection
– Prior Test Results
– True Class of Case
– Test Side-Effects
– Individual Case
– Time of Test
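A test cost that is conditional on prior test selection can be sketched as a shared fee that is paid only once per group of tests, for example a single blood draw serving several blood tests. The test names, prices, and the `total_test_cost` helper below are hypothetical and only illustrate the idea.

```python
# Hypothetical per-test costs and one shared cost (illustrative values).
TEST_COST = {"glucose": 4.0, "cholesterol": 6.0, "xray": 30.0}
SHARED_COST = {
    # fee name: (fee, tests that share it)
    "blood_draw": (10.0, {"glucose", "cholesterol"}),
}


def total_test_cost(selected):
    """Total cost of a set of tests, charging each shared fee at most once."""
    chosen = set(selected)
    cost = sum(TEST_COST[test] for test in chosen)
    for fee, members in SHARED_COST.values():
        if chosen & members:
            cost += fee
    return cost


alone = total_test_cost(["cholesterol"])
after_glucose = (
    total_test_cost(["glucose", "cholesterol"]) - total_test_cost(["glucose"])
)
print(alone, after_glucose)
```

Under these assumptions the cholesterol test costs 16.0 when ordered alone but only 6.0 more once the glucose test has been selected, so the cost of a test cannot be stated without knowing which tests came before it.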
Other Costs

• Cost of Teacher
– Constant
– Conditional
• Cost of Intervention
– Constant
– Conditional
• Cost of Unwanted Achievements
– Constant
– Conditional
• Cost of Instability
Cost of Computation

• Static Complexity
– Size Complexity
– Structural Complexity
• Dynamic Complexity
– Time Complexity
– Space Complexity
• Training Complexity
• Testing Complexity
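As a rough illustration of how some of these complexities might be measured, the sketch below uses a toy decision tree: node count stands in for size complexity, depth for structural complexity, and the number of feature tests evaluated per case for testing-time cost. These proxies, the tree encoding, and the example tree are all assumptions made for this sketch, not definitions taken from the paper.

```python
# A tree is either a leaf (a class label string) or a tuple
# (feature_name, {feature_value: subtree}).
tree = (
    "outlook",
    {
        "sunny": ("humidity", {"high": "no", "normal": "yes"}),
        "overcast": "yes",
        "rain": ("wind", {"strong": "no", "weak": "yes"}),
    },
)


def count_nodes(node):
    """Size proxy: total number of internal nodes and leaves."""
    if isinstance(node, str):
        return 1
    _, branches = node
    return 1 + sum(count_nodes(child) for child in branches.values())


def depth(node):
    """Structural proxy: longest path from the root to a leaf."""
    if isinstance(node, str):
        return 0
    _, branches = node
    return 1 + max(depth(child) for child in branches.values())


def classify(node, case):
    """Return the predicted label and how many feature tests were evaluated."""
    tests = 0
    while not isinstance(node, str):
        feature, branches = node
        node = branches[case[feature]]
        tests += 1
    return node, tests


print(count_nodes(tree), depth(tree))
print(classify(tree, {"outlook": "sunny", "humidity": "high"}))
```

The first two measures are properties of the learned model itself, while the third varies from case to case, which matches the static versus dynamic split on the slide.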
Cost of Cases

• Batch Learner
• Incremental Learner
Human-Computer Interaction Cost (HCIC)

• Data Engineering
• Parameter Setting
• Analysis of Learned Models
• Incorporating Domain Knowledge
Results

• The paper presents a taxonomy of cost types.
• The taxonomy serves as a platform for organizing the literature on cost-sensitive learning.
• It aims to inspire research into under-investigated types of cost.
Weak/Strong Points

• STRONG – Interesting idea for incorporating different, mostly unconsidered costs into classification methods.
• STRONG – May be more pragmatic in real-world scenarios.
• STRONG – Good domain examples.
• WEAK – Lacks formal support for the points made in the paper.
• WEAK – Sections of the paper are unevenly developed.
• WEAK – No empirical evidence to support the methods.
Suggestions for Improvements

• Gather empirical data to support the costing methods.
• Recommend better ways to use the costing methods (rather than adding more classes).
• Perhaps weight costs differently depending on the feature?
• Incorporate a weighted cost matrix into predictions.
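One way the cost-matrix suggestion could work is the expected-cost rule discussed by Elkan (2001): given class probability estimates, predict the class whose expected cost is lowest rather than the most probable class. The sketch below assumes `cost[i][j]` is the cost of predicting class i when the true class is j; the probabilities and cost values are invented for illustration.

```python
def min_expected_cost_class(probs, cost):
    """Return the class index with the lowest expected cost, plus all expected costs.

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j
    """
    n = len(probs)
    expected = [
        sum(cost[i][j] * probs[j] for j in range(n)) for i in range(n)
    ]
    best = min(range(n), key=lambda i: expected[i])
    return best, expected


# Hypothetical two-class case: class 0 = healthy, class 1 = sick.
probs = [0.8, 0.2]

# Missing a sick patient (predict 0, actual 1) is assumed 10 times as
# costly as a false alarm (predict 1, actual 0).
weighted_cost = [[0, 10], [1, 0]]
uniform_cost = [[0, 1], [1, 0]]

weighted_choice, weighted_expected = min_expected_cost_class(probs, weighted_cost)
uniform_choice, _ = min_expected_cost_class(probs, uniform_cost)
print(weighted_choice, uniform_choice)
```

With uniform costs the rule picks the most probable class (healthy), but with the weighted matrix it picks "sick" even at 20% probability, because the expected cost of a missed case outweighs the expected cost of a false alarm.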
Conclusions

• Turney presents some interesting ideas for various costing methods.
• Although these methods are not yet well supported, the ideas behind them will hopefully drive research on costing methods for inductive concept learning.
• That research may, in turn, produce the support the methods currently lack.
References

Turney, P. (2000). Types of Cost in Inductive Concept Learning. In Proceedings of the Workshop on Cost-Sensitive Learning at the Seventeenth International Conference on Machine Learning (WCSL at ICML-2000), pp. 15-21, Stanford University, California.
Elkan, C. (2001). The Foundations of Cost-Sensitive Learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01), pp. 973-978.
Zadrozny, B. and Elkan, C. (2001). Learning and Making Decisions When Costs and Probabilities are Both Unknown. In Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (KDD'01), pp. 204-213.