Final Exam Review
Spring 2011
Exam format
About 75 questions
45% multiple choice and T/F
30% short fill-ins
25% short-paragraph explanations
What to study
50 questions from Exam 1 and 2
12-15 questions about presentation topics
10-13 questions will come from labs
Concepts from
Market Basket: Data Mining
SCM: RFID (hardware) + XML (concept)
Fund Trading: DBs for Optimization
Pivot Chart: DBs for Discovery & Prediction
Wagemart: DBs for Decision Support
Market Basket Analysis
Support: Probability (P) that an item is in someone’s checkout basket
A,B,E
A,B,F
A,B
A,B,F,G
A,D,F
C,D
C,D,G
E,F,G
E,F
E,G
P(A) = 5/10 = 50%
P(AB) = 4/10 = 40%
P(C) = 2/10 = 20%
P(CD) = 2/10 = 20%
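The support figures above can be reproduced with a short sketch. It uses the ten baskets listed on the slide; the names `BASKETS` and `support` are illustrative choices, not part of the course material.

```python
# The ten checkout baskets listed on the slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"},
    {"A", "D", "F"}, {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"},
    {"E", "F"}, {"E", "G"},
]

def support(items, baskets=BASKETS):
    """Fraction of baskets that contain every item in `items`."""
    wanted = set(items)
    hits = sum(1 for basket in baskets if wanted <= basket)
    return hits / len(baskets)

print(support("A"))   # 0.5
print(support("AB"))  # 0.4
print(support("C"))   # 0.2
print(support("CD"))  # 0.2
```

Passing "AB" works because `set("AB")` is the item set {A, B}; with multi-character item names a list of names would be passed instead.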
Market Basket Analysis
Confidence X → Y = P(XY)/P(X) : If item X is purchased, what is the
probability that item Y is also purchased?
Confidence A → B
= P(AB)/P(A)
= 40%/50%
= 80%
Confidence C → D
= P(CD)/P(C)
= 20%/20%
= 100%
Given: P(A) = 5/10 = 50%
P(AB) = 4/10 = 40%
P(C) = 2/10 = 20%
P(CD) = 2/10 = 20%
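The confidence arithmetic above can be checked with a short sketch. It starts from the given support values on the slide; the function name `confidence` and the use of exact fractions are illustrative assumptions.

```python
from fractions import Fraction

# Support values given on the slide, kept as exact fractions.
P = {
    "A": Fraction(5, 10),
    "AB": Fraction(4, 10),
    "C": Fraction(2, 10),
    "CD": Fraction(2, 10),
}

def confidence(p_xy, p_x):
    """Confidence that Y is bought given X is bought: P(XY) / P(X)."""
    return p_xy / p_x

conf_ab = confidence(P["AB"], P["A"])  # if A is purchased, is B also purchased?
conf_cd = confidence(P["CD"], P["C"])  # if C is purchased, is D also purchased?
print(f"{float(conf_ab):.0%}")  # 80%
print(f"{float(conf_cd):.0%}")  # 100%
```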
Market Basket Analysis
Quality X → Y = Confidence X → Y * P(XY)
High quality association rules
Quality A → B
= 80% * 40%
= 32%
Quality C → D
= 100% * 20%
= 20%
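The quality calculation can be sketched the same way. The confidence and support percentages are the ones on the slide; the function name `quality` is an illustrative assumption.

```python
from fractions import Fraction

def quality(confidence, support_xy):
    """Quality of a rule: its confidence multiplied by the support of both items together."""
    return confidence * support_xy

q_ab = quality(Fraction(80, 100), Fraction(40, 100))   # rule from A to B
q_cd = quality(Fraction(100, 100), Fraction(20, 100))  # rule from C to D
print(f"{float(q_ab):.0%}")  # 32%
print(f"{float(q_cd):.0%}")  # 20%
```

The rule from A to B scores higher even though its confidence is lower, because A and B appear together in twice as many baskets as C and D.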
Apriori Algorithm:
Calculate high quality association rules given
billions of transactions
millions of items
Complex Association Rule
ADGMS → CLPT (50% quality, 80% confidence), i.e.,
5 items (A, D, G, M, and S) imply with great
confidence that 4 items (C, L, P, and T) are
purchased.
Without the Apriori Algorithm, the calculation
would take too long (millions of years).
Apriori Algorithm
How it works:
By setting a minimum support level, the algorithm can
prune low-support pairs (2-itemsets) before computing
3-itemsets.
Then, the pruned 3-itemsets are used to compute
4-itemsets. The algorithm is guaranteed to return all
the itemsets above the minimum support level.
When you get to 5-, 6-, or 7-itemsets, the pruning
reduces the number of possible sets from trillions to
a few thousand or hundred, which can help
humans discover very complex, high quality
association rules.
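The level-by-level pruning described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not an optimized implementation: it reuses the ten baskets from the support slide, and the name `frequent_itemsets` and the 30% threshold are chosen only for the example.

```python
from itertools import combinations

# The ten baskets from the support slide.
BASKETS = [
    {"A", "B", "E"}, {"A", "B", "F"}, {"A", "B"}, {"A", "B", "F", "G"},
    {"A", "D", "F"}, {"C", "D"}, {"C", "D", "G"}, {"E", "F", "G"},
    {"E", "F"}, {"E", "G"},
]

def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for every itemset at or above min_support."""
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    items = sorted({item for b in baskets for item in b})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        k += 1
        previous = set(level)
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop a candidate if any (k-1)-subset was not frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c) >= min_support]
    return result

found = frequent_itemsets(BASKETS, 0.3)
for itemset, s in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), s)
```

With a 30% threshold, the candidate ABF is never counted: its subset BF appears in only 2 of 10 baskets, so it is pruned before any basket is scanned for it. That skipped work is what makes larger itemsets tractable.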
Importance of Apriori
Algorithm
A process
An innovation that takes terabytes of data and
reduces it to meaningful rules
Raw Data → Relevant and Timely Information
A.I.
Data Mining
Market Basket Analysis
Pivot Chart Lab
Great example of Online Analytical Processing
(OLAP)
Slice & Dice data (Temp., Mood, Day, Weather)
Drill Down (look at only incorrect predictions)
Unlike Data Mining,
the process is interactive
a person participates in the process
The process is ad hoc.
The process is not pre-determined like Apriori Algo.
Significance of Pivot Chart
Lab
Business Intelligence (like A.I.)
Use OLAP to find patterns
Encode patterns as IF statements to predict
future cases.
The spreadsheet can automate the human
decision making process on a large scale, faster
than a human.
Such a system enables timely, accurate
predictions without a human decision-maker
(Business Intelligence System)
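Encoding a pattern as an IF statement can be sketched as follows. The rule, the field names, and the sample cases are all hypothetical and invented for illustration; the pattern found in the lab may be different.

```python
# Hypothetical pattern, invented for illustration; the lab's real rule may differ.
def predict(case):
    """Return True when the assumed pattern (sunny and at least 70 degrees) applies."""
    if case["weather"] == "Sunny" and case["temp"] >= 70:
        return True
    return False

# Hypothetical future cases to classify.
cases = [
    {"weather": "Sunny", "temp": 75},
    {"weather": "Rainy", "temp": 75},
    {"weather": "Sunny", "temp": 60},
]
predictions = [predict(c) for c in cases]
print(predictions)  # [True, False, False]
```

Once the rule is written down, it can be applied to thousands of cases without a person reviewing each one.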
Excel Pivot Charts as a tool
First: Pattern is noticed
Second: Interactive analysis tools (Pivot Chart)
help to confirm and pin-point the pattern
Example: A marketer thinks that geography plays
a role in sales; a Pivot chart shows that Southern
stores do have better sales.
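The kind of summary a Pivot Chart produces for this example can be sketched as a group-and-average step. The store records below are made-up numbers for illustration only, and `average_by` is a hypothetical helper, not an Excel feature.

```python
from collections import defaultdict

# Hypothetical store records: (region, quarter, sales). Made-up numbers.
SALES = [
    ("South", "Q1", 120), ("South", "Q2", 140),
    ("North", "Q1", 90), ("North", "Q2", 100),
    ("West", "Q1", 95), ("West", "Q2", 105),
]

def average_by(records, key_index, value_index=2):
    """Average the value column, grouped by the chosen key column."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for record in records:
        totals[record[key_index]][0] += record[value_index]
        totals[record[key_index]][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

by_region = average_by(SALES, 0)   # slice by region
by_quarter = average_by(SALES, 1)  # re-slice the same data by quarter
print(by_region)
print(max(by_region, key=by_region.get))  # South
```

Changing `key_index` is the code equivalent of dragging a different field into the pivot: the same data is re-sliced without rewriting anything else.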
Database queries as tools
First: The data mining reveals numerous patterns
(association rules)
Second: Human intelligence can derive the
theory behind the pattern.
Example: The Apriori algorithm discovers a high
quality association rule (Beer → Diapers). Later,
marketers try to unravel the reason why.
The data analysis must come before the hypothesis
because the data is too big for humans to analyze.
Fund Trading Lab
Decision Support Automation: Using a Database to
compute the optimal sequence of trades.
Too many combinations for a human to analyze
Another Example of Business intelligence
1. At first we use a graph and human intuition to make
the trades
2. We do better if we use a query to calculate and
sort all possible transactions
3. We use Database tools to pick the best ones that
don’t overlap
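The three steps above can be sketched for a single fund. The daily values are made-up numbers, and the greedy selection in `pick_non_overlapping` is an assumed, simplified stand-in for the database tools used in the lab; a greedy pass is a heuristic and is not guaranteed to find the optimal sequence in every case.

```python
from itertools import combinations

# Hypothetical daily values for one fund (made-up numbers, not lab data).
PRICES = [10.0, 12.0, 9.0, 11.0, 15.0, 13.0]

def all_trades(prices):
    """Every (buy_day, sell_day, gain) with buy_day < sell_day, best gain first."""
    trades = [
        (buy, sell, prices[sell] / prices[buy] - 1)
        for buy, sell in combinations(range(len(prices)), 2)
    ]
    return sorted(trades, key=lambda t: t[2], reverse=True)

def pick_non_overlapping(trades):
    """Greedy pass over sorted trades: keep a profitable trade if it overlaps none kept so far."""
    chosen = []
    for buy, sell, gain in trades:
        if gain <= 0:
            break  # trades are sorted, so the rest are not profitable
        if all(sell <= b or buy >= s for b, s, _ in chosen):
            chosen.append((buy, sell, gain))
    return sorted(chosen)

picks = pick_non_overlapping(all_trades(PRICES))
for buy, sell, gain in picks:
    print(f"buy day {buy}, sell day {sell}, gain {gain:.1%}")
```

Even six days of one fund produce 15 candidate trades; with many funds and a full year of data the list grows far beyond what a person can scan on a graph, which is why the query-based steps do better.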
Decision Support Systems:
Wagemart vs. Fund Trading
Wagemart
start with tons of data
individual salaries, availability
reduce it to simple info
total cost, average rating
to help make a decision.
Fund Trading
start with less data
Fund value for each day
compute every possible transaction
Much more data
Queries are used to find the optimal transactions
Decision Support Systems:
Wagemart vs. Fund Trading
Both systems model scenarios to compute the
outcome of decisions
one is structured
one scenario to optimize
the other unstructured
many different scenarios to consider
Fund Trading was more structured, i.e., you can only
buy and sell; you just have to decide the optimal
day and funds to buy/sell.
Wagemart was very unstructured, many different
ways to cut costs.
Porter’s 5-forces
Do companies compete because it's fun?
Maybe some…
They compete because of the threat of going
out of business.
Profitability is the ultimate measure of success
Why?
What are the threats?
A new competitor
Will take away your sales and profits?
Because they are better?
In business what does better really mean?
The five forces/threats
New entrants
Substitute products
Rivalry
Bargaining power of consumers
Bargaining power of suppliers
Example
Target forces their supplier to use XML-formatted
shipment data and boxes tagged with RFID
chips.
Apple refuses and wins
Target has to use Apple’s system to sell Apple’s
products.
What force is this?
Example
Indirect: Brooke visits Google Shopping and
Shopzilla to compare prices on a new camera.
She’ll buy from the most inexpensive online retailer
Direct: Bradley uses LendingTree.com where
banks try to underbid each other to get his
business.
Example
Disney World implements a new ride-tracking
system that directs visitors to the rides with the
shortest wait times.
Forces Universal Studios to invest in a similar
system.
Example
Everyone at the gym is using their iPhone or
Android phone to listen to music
MP3 players are now collecting dust
Example
Netflix emerges and puts 120 Blockbuster video
stores out of business
Competitive strategies
To fight the forces
1. Do something totally new (innovation)
2. Be inexpensive (cost leadership)
3. Be big to increase power (growth)
Lock-in your customers
Lock-out your competition
4. Make mutually beneficial partnerships (alliance)
5. Be different but in a good way (differentiation)
Put up barriers to the competition
Example
Imagine if Blockbuster decided to use
Internet/Mail delivery before Netflix.
But Blockbuster was NOT ______________
By the way, Netflix created a totally new process for
renting videos.
How does an IS make this possible?
How is the IS better than the old-fashioned
process?
E-commerce
It was an innovation at one point
Now it is necessary to stay in business
Example
Walmart’s efficient supply chain cuts costs.
RFID and XML play a role
Their size allows them to negotiate low prices with
suppliers.
Large companies absolutely need information
systems for good management
Walmart’s strategy is 2-fold.
How do Information Systems really
help businesses to compete?
The labs provide many examples
RFID, XML
More accessible, timely information for improving
supply chain.
Market Basket
More relevant information for increasing sales/profits
How do Information Systems really
help businesses to compete?
The labs provide many examples
Wagemart
More accurate information for modeling decisions
Pivot Chart & Fund Trading
Flexible information; manipulated in real-time to
solve problems (prediction & optimization)
The 11 information attributes
are fair game
Flexibility and accessibility are different.
Putting something on the web makes it more
accessible
Storing data electronically can make it more flexible
Putting electronic data in a robust, standardized
format (XML) improves both.
Attribute Trade-offs
Simple vs. Complete
Secure vs. Accessible
Presentations
Don’t forget to review presentations
The websites will be linked on Tuesday
Textbook Reading
Low priority
Top Priority
Review past exams and look up correct answers
(Text and Google)
Will post them on Tuesday
Skim lab materials and instructions on Blackboard
Create cheat sheet