CIS 600: Master's Project
Online Trading and Data Mining-Based Marketing of IT Books
Supervisor : Dr. Haiping Xu
Student : Tsung-Ta Tu
Student ID : 999-20-1529
Outline
1. Introduction and Motivation
2. Data Mining Technology
3. System Architecture & Demo
4. Analyze and Discuss The Result
5. Conclusion
6. Future work
Introduction and Motivation
In the Internet era, each e-commerce
website contains a large database of
customer transactions, where each
transaction consists of the set of
items purchased by a customer
in one visit.
All the data in the database is
treasure, not garbage. Analyzing
the data can answer several business
questions.
Introduction and Motivation (2)
Questions:
(1) How to keep in touch with a growing number of customers?
(2) What are the characteristics, requirements,
and purchasing patterns of the customers?
(3) How to design attractive product bundles that
give customers more convenient
shopping options?
Data Mining Techniques
(1) Association Rules
(2) Classification
(3) Clustering
(4) Neural Network
(5) Generalization
Association Rules
An association rule is a rule which implies
certain association relationships among a set
of objects (such as “occur together” or “one
implies the other”) in a database.
For a rule X => Y, the intuitive meaning is that
transactions of the database which contain X
also tend to contain Y.
Association Rules (2)
The basic process of association rule analysis involves
three important concerns:
(1) Choosing the right set of items
(2) Generating rules by deciphering the counts in the co-occurrence matrix
(3) Overcoming the practical limits imposed by thousands
or tens of thousands of items appearing in combinations
large enough to be interesting
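As a minimal sketch of concern (2), the co-occurrence counts can be built by counting every unordered pair of items that appears in the same transaction. The transactions and item names below are hypothetical and are not taken from the project's database.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is the books bought in one visit.
transactions = [
    {"Java", "Oracle", "HTML"},
    {"Java", "Oracle"},
    {"Java", "UML"},
    {"HTML", "Oracle"},
]

def cooccurrence_counts(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

counts = cooccurrence_counts(transactions)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

With thousands of items the number of pairs grows quadratically, which is the practical limit named in concern (3).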
An Example
An example of an association rule is: “75% of
transactions that contain diapers also contain
beer; 37.5% of all transactions contain both of
these items”. Here 75% is called the confidence
of the rule, and 37.5% is called the support of
the rule.
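The two measures can be reproduced with a short sketch. The eight transactions below are made up so that the numbers match the example (3 of 8 contain both items, 4 of 8 contain diapers); the function names are illustrative only.

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical data chosen to match the diapers/beer figures.
transactions = [
    {"diapers", "beer"},
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"milk"},
    {"bread"},
    {"bread", "milk"},
    {"bread", "beer"},
]

rule_support = support(transactions, {"diapers", "beer"})
rule_confidence = confidence(transactions, {"diapers"}, {"beer"})
print(rule_support, rule_confidence)
```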
Jason Manager of IT Book
System Architecture and Skills
Ⅰ. System Architecture (3-Tier):
(1) Server Side
Oracle 9.0.2 Database + Windows XP
(2) Application Side
Tomcat 5.0.18 + Windows XP
(3) Client Side
IE 6.0 + Windows XP
Ⅱ. Skills :
(1) UML
(2) HTML , JavaScript
(3) Java Programming Language (J2SDK)
(5) JSP , Java Servlet
(6) JDBC , Java Bean
(8) Oracle SQL , PL/SQL ( Trigger , Procedure , Function )
(9) Oracle Database Management
Use Case Diagram (Customers)
[Figure: use case diagram for the Customers actor. Use cases: Search Books, Check Top10 Books, View Book Information, Create Customer Profile, View Customer Profile, Update Customer Profile, Place order for book, Payment, View Order History. The use cases are connected by <<extend>> and <<include>> relationships.]
Use Case Diagram (Manager)
[Figure: use case diagram for the Manager actor. Use cases: Add Book, Update Book Information, Check Books Information, Remove Book, Analyze Association Rules of Books, Add Package for on Sale, Update Package Information, Check on Sale List, Remove Package. The use cases are connected by <<extend>> relationships.]
Class Diagram
Display System
[Figure: diagram of Jason Manager of IT Book. Labels in the diagram: Connect to Jason, Select Book Information, Search Book Information, Book Information, Login, My Profile, Place Order, Shopping Car, Order Information. Labels under Manager: Select Classification, Select Book, Profit Association Rule, Promotion.]
Analyze and Discuss The Result
Association rules help us find associations among
the items in transactions, but depending on them too
heavily ignores other factors that influence
customer behavior.
For example, the classification and the sales quantity
of an item are also important factors that we need to
consider.
Analyze and Discuss The Result (2)
Is the most confident rule the best rule?
Not necessarily. A rule that predicts A can actually do worse
than simply guessing that A appears in the transaction.
Here, A occurs in 45 percent of the transactions, but the rule
gives only 33 percent confidence, so the rule does worse than
random guessing.
Improvement
Improvement tells how much better a rule is at predicting
the result than just assuming the result in the first place.
It is given by the following formula:
Improvement = [P(A^B) / P(A)] / P(B)
Improvement (2)
When improvement is greater than 1, then the
resulting rule is better at predicting the result than
random chance.
When it is less than 1, the rule is worse than
random chance.
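As a quick check of the earlier figures, improvement is the rule's confidence divided by the overall probability of the result. The sketch below applies that to the rule with 33 percent confidence whose result occurs in 45 percent of transactions; the function name is illustrative.

```python
def improvement(confidence, p_result):
    """Confidence of the rule divided by the overall probability of its result."""
    if p_result <= 0:
        raise ValueError("p_result must be positive")
    return confidence / p_result

# Figures from the earlier slide: 33% confidence, result occurs in 45% of transactions.
value = improvement(0.33, 0.45)
print(round(value, 3))  # below 1, so the rule is worse than random guessing
```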
The Profit Association Rules
Profit association rules consider not only the
basic concept of association rules but also other
influencing factors.
The three major portions of a profit association rule are
(1) Frequency
(2) Quantity
(3) Auxiliary
Each estimate is given a weight to calculate the final value.
Frequency Portion
(1) Support : P(A^B)
(2) Confidence : P(A^B) / P(A)
(3) Improvement : [P(A^B) / P(A)] / P(B)
Quantity Portion
(1) B’s sales quantity as a share of its classification’s sales quantity
= Q(B) / Q(CB)
(2) A’s sales quantity as a share of its classification’s sales quantity
= Q(A) / Q(CA)
(3) Comparative quantity
= Q(B) / Q(A)
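The three quantity estimates are plain ratios, as the following sketch shows. The sales figures are hypothetical, and the dictionary keys are names chosen here for illustration.

```python
def quantity_portion(q_a, q_b, q_class_a, q_class_b):
    """Return the three quantity ratios for a rule involving books A and B.

    q_a, q_b: units sold of A and B.
    q_class_a, q_class_b: units sold in A's and B's classifications.
    """
    return {
        "b_share": q_b / q_class_b,   # Q(B) / Q(CB)
        "a_share": q_a / q_class_a,   # Q(A) / Q(CA)
        "comparative": q_b / q_a,     # Q(B) / Q(A)
    }

# Hypothetical sales quantities.
ratios = quantity_portion(q_a=40, q_b=30, q_class_a=200, q_class_b=120)
print(ratios)
```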
Auxiliary Portion
A and B have the same author
A and B are in the same classification
Whether A is in the top 10 list
Whether B is in the top 10 list
Etc.
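One way to read "give each estimate a weight" is a weighted sum over the frequency, quantity, and auxiliary estimates. The sketch below assumes that form; the slides do not state the actual combination or weights, so the weights, the 0/1 encoding of the auxiliary conditions, and all names here are assumptions.

```python
def profit_rule_score(estimates, weights):
    """Weighted sum of estimates; an assumed form, not the project's actual formula."""
    missing = set(estimates) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in estimates.items())

# Hypothetical estimates for one rule A => B.
estimates = {
    "support": 0.375,              # frequency portion
    "confidence": 0.75,            # frequency portion
    "comparative_quantity": 0.75,  # quantity portion, Q(B) / Q(A)
    "same_author": 1.0,            # auxiliary portion, 1.0 if true else 0.0
    "b_in_top10": 0.0,             # auxiliary portion, 1.0 if true else 0.0
}

# Hypothetical weights; tuning them is listed as future work.
weights = {
    "support": 1.0,
    "confidence": 1.0,
    "comparative_quantity": 0.5,
    "same_author": 0.2,
    "b_in_top10": 0.2,
}

score = profit_rule_score(estimates, weights)
print(round(score, 3))
```

Keeping the weights in a separate mapping makes it easy to compare rules under different weightings without changing the estimates.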
Case Study (1)
Case Study (2)
Case Study (3)
Conclusion
Profit association rules provide an evaluation value
that helps the marketing manager make business decisions,
including
(1) Catalog design
(2) What to put on sale
(3) How to design coupons
(4) Cross-marketing
Future work
Optimize the weight factors of the profit association
rules.
Integrate this system into a CRM system (data
warehouse, data mining, call center).
Use AI technology to make Jason Manager
more like a human being.
Refine the domain knowledge that
delivers business intelligence (BI).
Thank you