Data Mining
By Minh Osborne
Overview



What is data mining?
What can data mining do for you?
The technologies involved in data
mining.
What is Data Mining?


Data Mining is a tool that businesses use
to help them increase revenue by
understanding their consumers.
It helps find relationships and patterns in a
database that were previously unknown to
a business.
What can Data Mining do for
you?

Businesses are always looking for ways to
increase profits.




Cut costs
Cut labor
Increase productivity
Understand the customer
What can Data Mining do for you? cont’d

By knowing your customer, you can better
tailor goods and services to them.




Targeted sales promotions.
Selectively sway the right type of people.
Understand consumer buying habits.
What’s hot and what’s not.
What can Data Mining do for you? cont’d

By knowing your customer and your
business, you can create better business
models and steer the company toward
profitability.
What it can’t do



Like any tool, data mining is only useful
when used correctly within the right context.
It can’t produce miracle results and boost
revenues overnight.
Data mining allows you to analyze huge
databases, but you must judge for
yourself how useful the results
are.
Technology of Data Mining

Data Mining and data warehousing

Cleansing data

Creating a subset database of cleansed data
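
The two steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a warehousing tool: the records, the field names (customer_id, age, city) and the cleansing rules are all assumptions made up for the example.

```python
# Hypothetical raw customer records; field names and values are illustrative only.
raw_records = [
    {"customer_id": 1, "age": "34", "city": " Boston "},
    {"customer_id": 2, "age": "", "city": "Chicago"},     # missing age
    {"customer_id": 1, "age": "34", "city": "Boston"},    # repeat of customer 1
    {"customer_id": 3, "age": "abc", "city": "Denver"},   # unparseable age
    {"customer_id": 4, "age": "51", "city": "denver"},
]


def cleanse(records):
    """Return a cleansed subset: normalized fields, no repeats, no invalid rows."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        age_text = rec["age"].strip()
        if not age_text.isdigit():
            continue  # drop rows whose age is missing or not a number
        if rec["customer_id"] in seen_ids:
            continue  # drop repeated customer ids, keeping the first one seen
        seen_ids.add(rec["customer_id"])
        cleaned.append({
            "customer_id": rec["customer_id"],
            "age": int(age_text),
            "city": rec["city"].strip().title(),  # trim spaces, fix capitalization
        })
    return cleaned


# The cleansed subset is what the mining step would work on.
subset = cleanse(raw_records)
```

In this sketch the subset keeps customers 1 and 4 only. Which rows to drop and which to repair is a business decision; the rules here are placeholders.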
Technology of Data Mining cont’d

Data Mining vs OLAP

OLAP – on-line analytical processing
 E.g. Let’s say you work at a toy company
and you see that more non-electronic toys
are being sold. You formulate a hypothesis
that the reason for this is that the
non-electronic toys are cheaper. You query the
database and analyze the numbers.
Technology of Data Mining cont’d



This doesn’t explain why, so you formulate
another hypothesis: the cause is the age group
the toys are aimed at, because there are more younger
children than older children. You query the database
again and check your results.
OLAP is basically used to verify hypotheses by
querying the database.
Data Mining is different in that it uses the data
itself to uncover patterns.
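
The contrast can be sketched with the toy company example. The sales rows below are invented, and the "spread" score is a deliberately crude stand-in for a real mining algorithm; it is only meant to show who chooses the question, the analyst or the data.

```python
from collections import defaultdict

# Hypothetical toy sales; every field and number is made up for illustration.
sales = [
    {"electronic": False, "price_band": "low",  "age_group": "0-5",  "units": 120},
    {"electronic": False, "price_band": "high", "age_group": "0-5",  "units": 110},
    {"electronic": True,  "price_band": "low",  "age_group": "6-12", "units": 40},
    {"electronic": True,  "price_band": "high", "age_group": "6-12", "units": 35},
    {"electronic": False, "price_band": "low",  "age_group": "6-12", "units": 45},
    {"electronic": True,  "price_band": "high", "age_group": "0-5",  "units": 100},
]


def units_by(attribute):
    """Total units sold for each value of one attribute (an OLAP-style roll-up)."""
    totals = defaultdict(int)
    for row in sales:
        totals[row[attribute]] += row["units"]
    return dict(totals)


# OLAP style: the analyst picks the hypothesis (price) and checks it with a query.
price_totals = units_by("price_band")


def spread(attribute):
    """Gap between the best- and worst-selling value of an attribute."""
    totals = units_by(attribute).values()
    return max(totals) - min(totals)


# Data mining style (greatly simplified): scan every attribute and let the
# data point to the one that separates sales the most.
candidates = ["electronic", "price_band", "age_group"]
strongest = max(candidates, key=spread)
```

With these made-up numbers the price query shows little difference between bands, while the scan singles out age_group without anyone having proposed it first.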
Technology of Data Mining cont’d

Data classification methods






Statistical algorithms
Neural Networks
Genetic algorithms
Nearest neighbor method
Rule induction
Data visualization
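
As one example from the list above, the nearest neighbor method can be sketched as follows. The customers, the two features (age, yearly spend) and the segment labels are hypothetical, and the sketch uses a single neighbor with plain Euclidean distance; a practical version would usually scale the features and consider several neighbors.

```python
import math

# Hypothetical labelled customers: (age, yearly_spend) paired with a segment label.
labelled = [
    ((25, 200.0), "occasional"),
    ((30, 250.0), "occasional"),
    ((45, 900.0), "frequent"),
    ((50, 1100.0), "frequent"),
]


def nearest_neighbor(point, examples):
    """Return the label of the example closest to point (Euclidean distance)."""
    # Features are left unscaled here, so yearly_spend dominates the distance.
    closest = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return closest[1]


# Classify a new, unlabelled customer by the most similar known customer.
label = nearest_neighbor((47, 950.0), labelled)
```

The new customer gets the label of whichever known customer it most resembles, which is the whole idea of the method.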
Pitfalls of Data Mining



Data mining doesn’t solve all problems.
It can only give you what patterns and
relationships it finds.
You must analyze the outcome for yourself
and produce models to see if they
fit in the real world.