Data Mining with Oracle using Classification and Clustering Algorithms
Presented by Nhamo Mdzingwa
Supervisor: John Ebden
Overview of Presentation
Recap of Proposal
Classification of Data Mining & DM Algorithms
Oracle Data Mining
Data Mining Process
Evaluation of Results
Progress so far
Updated Timeline
Plans
Objective
Investigate two types of algorithms available in Oracle10g for data mining (ODM).
Apply the two algorithms to actual data.
Analyse & evaluate results in terms of performance.
Classification of Data Mining
Directed data mining (supervised learning) builds a model that describes one particular attribute in terms of the rest of the data.
Undirected DM (unsupervised learning) builds a model that establishes the relationships amongst all the input attributes by grouping.
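As an illustration of directed data mining, the sketch below builds a small Naive Bayes model that describes one attribute (the target) in terms of the others. It is a minimal pure-Python sketch, not Oracle's implementation; the function names, the toy rows, and the attribute names (`income`, `student`, `buys`) are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values seen
    for row in rows:
        for attr, value in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][value] += 1
            values_seen[attr].add(value)
    return class_counts, value_counts, values_seen

def predict(model, row):
    """Return the class with the highest (unnormalised) posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_class, best_score = None, 0.0
    for cls, n in class_counts.items():
        score = n / total  # prior P(class)
        for attr, value in row.items():
            if attr not in values_seen:
                continue  # ignore attributes never seen during training
            count = value_counts[(attr, cls)][value]
            # Laplace smoothing keeps unseen values from zeroing the score
            score *= (count + 1) / (n + len(values_seen[attr]))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class

# Hypothetical training data: "buys" is the output attribute.
training_rows = [
    {"income": "high", "student": "no", "buys": "no"},
    {"income": "high", "student": "yes", "buys": "yes"},
    {"income": "low", "student": "yes", "buys": "yes"},
    {"income": "low", "student": "no", "buys": "no"},
    {"income": "medium", "student": "yes", "buys": "yes"},
    {"income": "medium", "student": "no", "buys": "no"},
]
model = train_naive_bayes(training_rows, target="buys")
print(predict(model, {"income": "low", "student": "yes"}))
```

An undirected model would receive the same rows without the `buys` column and would only group them by similarity.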
Classification of Data Mining algorithms
[Diagram: DM strategies]
Unsupervised learning (input attributes but no output attributes): Clustering (k-Means, O-Cluster)
Supervised learning (input attributes and one or more output attributes): Classification (Naive Bayes, Model Seeker, Adaptive Bayes)
Other nodes shown in the diagram: Estimation, Association Discovery, Prediction, Predictive variance, Visualization
Algorithms offered in Oracle10g
classification
1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker
clustering
1. k-Means
2. O-Cluster
3. Predictive variance
association rules
1. Apriori (association rules)
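Of the clustering algorithms listed, k-Means is the easiest to sketch. The code below is a textbook version (Lloyd's iteration) in plain Python, shown only to illustrate the idea; it is not the variant implemented in Oracle10g. The sample points and the choice of the first k points as starting centroids are assumptions made to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def assign(points, centroids):
    """Place every point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda c: squared_distance(p, centroids[c]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, k, max_iterations=50):
    # Assumption: start from the first k points (real tools pick seeds differently).
    centroids = list(points[:k])
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, assign(points, centroids)

# Hypothetical 2-D data with two visible groups.
sample_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
                 (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(sample_points, k=2)
print(centroids)
print(clusters)
```

No output attribute is supplied: the groups emerge from the distances between the input attributes alone.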
Evaluation of Results
Evaluation of supervised learning models involves determining the level of predictive accuracy.
Evaluated using test data sets.
Compare confidence and support levels of models created from the same training data to determine accuracy.
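The measures named above can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the function names, the label lists, and the toy transactions are hypothetical and do not come from the project's data or from any Oracle API.

```python
def accuracy(predicted, actual):
    """Fraction of test cases whose predicted label matches the actual label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical test-set labels for a classification model.
predicted_labels = ["yes", "no", "yes", "yes"]
actual_labels = ["yes", "no", "no", "yes"]
print(accuracy(predicted_labels, actual_labels))

# Hypothetical transactions for the rule {bread} -> {milk}.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```

Computing these figures for models built from the same training data gives a common basis for comparing them.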

Progress
Literature Survey
Oracle10g installed on Athena in Hons Lab
Exploring the Oracle9i and 10g Suite including JDeveloper
Member of MetaLink (Oracle’s online support service)

Updated Timeline
Continuation from literature and tutorials: done
Investigate Clustering & Classification algorithms (theory): done
Find suitable computerised case studies of the use of above algorithms, with or without Oracle: done
Search datasets for testing (possibilities: AIDS data & faculty data): in progress
Apply algorithms to data found then critically analyse & assess results: second semester
Write up paper: September vacation and 3rd term
Final project write up: due 7/11