2010 Teradata Universe, Sydney



Teradata Universe
Fuchun Luan
1
Sequence for Today
Today's CoP will present
• 1 universe
• 2 questions
• 3 edges/scenarios
2
Teradata Universe
http://www.teradatauniverse.com.au/presentationLogin.aspx?returnURL=/uploads/presentations/sydney-universe
:: Presentations now available online ::
I am delighted you were able to join us at Teradata Universe 2010. I hope you found
the presentations informative and the experience hall and industry sessions a
valuable source of information and networking that will help you intelligently direct
your business activities. Presentations are now available online. Click here and enter
the code TDU2010 to access the presentations. If I can be of any assistance, please
do not hesitate to contact me. Yours sincerely,
Alan Ernst
Account Executive
ph: +61 2 6129 3570
3
Q1: Is zeroifNULL() the best?
Load dataset
Transform Missing
Transform Using Nolan’s
Kmeans & Clustering
Repeat multiple times
Generate Hotspots
Decision Tree Rules
Decision Tree Plot & Log Files
Risk Plot
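A minimal sketch of why the "Transform Missing" step deserves the question, assuming Python rather than Teradata SQL. ZEROIFNULL replaces NULL with 0; the helper names `zero_if_null` and `mean_if_null` and the income figures are made up for illustration and are not from the presentation.

```python
from statistics import mean

def zero_if_null(values):
    """Mimic ZEROIFNULL: replace missing (None) entries with 0."""
    return [0 if v is None else v for v in values]

def mean_if_null(values):
    """Alternative for comparison: replace missing entries with the observed mean."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical income column with two missing values.
incomes = [52000, None, 61000, 58000, None, 49000]

zero_filled = zero_if_null(incomes)
mean_filled = mean_if_null(incomes)

# Zero-filling drags the column mean well below the observed mean,
# which can shift any distance-based step such as k-means.
print("observed mean:   ", mean(v for v in incomes if v is not None))
print("zero-filled mean:", round(mean(zero_filled), 2))
print("mean-filled mean:", mean(mean_filled))
```

Whether zero is acceptable depends on the column: a missing transaction count may genuinely mean 0, while a missing income usually does not.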
4
Q2: Is over-fitting all evil?
• Bad: missing cases (will not generalise to new data)!
• What about reducing false-alarm cases?
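Do the opposite
Bad: “Better to wrongly kill three thousand than to let a single one escape”
Good: “Where there is a sliver of hope, make a hundredfold effort and never give up”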
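The trade-off between missed cases and false alarms can be made concrete by counting both at different score thresholds. This is a rough sketch only: the scores, labels, thresholds and the helper `confusion_counts` are invented for illustration and do not come from the presentation.

```python
def confusion_counts(scores, labels, threshold):
    """Count hits, false alarms and missed cases when flagging scores >= threshold."""
    flagged = [s >= threshold for s in scores]
    hits = sum(f and y for f, y in zip(flagged, labels))
    false_alarms = sum(f and not y for f, y in zip(flagged, labels))
    missed = sum((not f) and y for f, y in zip(flagged, labels))
    return hits, false_alarms, missed

# Hypothetical model scores and true outcomes (True = real case).
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [True, True, False, True, False, False, True, False]

for threshold in (0.15, 0.5, 0.9):
    hits, false_alarms, missed = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: hits={hits}, "
          f"false alarms={false_alarms}, missed={missed}")
```

On this toy data, a low threshold misses nothing but raises the most false alarms, while a high threshold raises none but misses most real cases.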
5
6
The BIIA Hierarchy (OCKO)
• “The more I read, the smaller I feel”
7
Ways to run
• Three ways to deal with business
– Can only tell me what is wanted, with no training data
(data-driven exploratory, e.g. hotspots)
deliverable = some interesting groupings to be examined
– Tell me what is wanted and provide business knowledge
(data mining + expert rules/models)
deliverable = actionable cases
– Tell me what is wanted and give me some examples
(predictive analytics)
deliverable = actionable cases
8