MC 707 - Computer Information Systems
Download
Report
Transcript MC 707 - Computer Information Systems
MD240 - MIS
Oct. 4, 2005
Databases & the Data Asset
Harrah’s & Allstate Cases
Topics Covered
• Data & Information
– Data vs. information
• Architecture basics
– Data Warehouses & Data Marts
– Transactional vs. query systems
• Leveraging Data
– Harrah’s
– Allstate
Data, Information, & Knowledge
• Data - raw facts, figures, and details.
• Information - organized, meaningful, and
useful interpretation of data. Has a context,
answers a question.
• Knowledge - an awareness and
understanding of a set of information and
how that information can be put to best use.
• Many firms are data rich and info poor:
victims of an old or poorly planned
architecture
Examples of Data, Information, &
Knowledge
Data: raw, no context
900,000
1,200,000
1,150,000
1,100,000
Information: meaningful, has context
Quarter 1 Quarter 2
Post
900,000 1,150,000
Kellogg's 1,200,000 1,100,000
Knowledge: information above & other information
creates an awareness of impact
Post lowered its prices after the first quarter.
Price change has caused Post sales to rise at the expense
of Kellogg’s
Clients, Servers, DBMS, and Databases
Server - responds to client requests
request
response
Client - makes requests
of the server
• Database
DBMS - the program.
Manages interaction with
databases.
database - the collection of data.
Created and defined to meet the
needs of the organization.
– a collection of related data. Usually organized according to topics: e.g. customer
info, products, transactions
• Database Management System (DBMS)
– a program for creating & managing databases; ex. Oracle, MS-Access, MS SQL
Server, IBM DB2, mySQL
• SQL - Structured Query Language
– Most popular relational database standard. Includes a language for creating &
manipulating data.
A Simple Database
CUSTID
2001
2002
2003
2004
2005
FIRST
John
Abby
Warren
Peter
Charles
LAST
Gallaugher
Johnson
Buffet
Lynch
Schwab
CITY
Newton
Boston
Omaha
Marblehead
San Francisco
STATE
MA
MA
NE
MA
CA
…
…
…
…
…
…
• File/Table
– Customers
• Field/Column
– 5 shown: CUSTID, FIRST, LAST, CITY, STATE
• Record/Row
– 5 shown: one for each customer
Now With More Data
CUSTID
2001
2002
2003
2004
2005
FIRST
John
Abby
Warren
Peter
Charles
LAST
Gallaugher
Johnson
Buffet
Lynch
Schwab
One
Many
CITY
Newton
Boston
Omaha
Marblehead
San Francisco
CUSTID
2001
2001
2001
2002
2002
…
BROKID
B003
B001
B003
B001
B003
…
STATE
MA
MA
NE
MA
CA
Customer Table
Transaction Table
BUY/SELLSTOCK SHARES PRICE DATE
Buy
MSFT
1000 90 1/4
12/24/2003
Buy
INTC
2400 80 1/8
7/3/2004
Sell
IBM
3000 114 3/8
7/1/2004
Sell
IBM
3000 110 1/8
6/30/2005
Sell
INTC
2000 94 7/8
8/30/2005
…
…
…
…
…
TIME
12:01 PM
10:51 AM
9:03 AM
4:53 PM
3:15 PM
…
Many
Broker Table
One
BROKID
B001
B002
B003
FIRST
Ivan
Dennis
Michael
LAST
Boesky
Levine
Milken
…
…
…
…
Customer Table
Col. Name Length Type …
CUSTID
4 Char …
FIRST
10 Char …
LAST
15 Char …
CITY
15 Char …
STATE
2 Char …
…
…
…
…
Meta-Data
m
1
m
Broker Table
Col. Name Length Type
BROKID
4 Char
FIRST
10 Char
LAST
15 Char
…
…
…
…
Transaction Table
Col. Name Length Type
CUSTID
4 Char
BROKID
4 Char
BUY/SELL
1 Bool
STOCK
4 Char
SHARES
8 Num
PRICE
6.2 Money
…
…
…
…
…
…
…
…
…
…
1
…
…
…
• Data that describes the characteristics of stored data
• Enterprise Data Model
– consistent, cross-functional, shareable meta-data model
– standardization increases flexibility & use (data to info)
– facilitates the creation of data warehouses
Warehouses & Marts
• Data Warehouse
– a database designed to support decision-making in
an organization. It is structured for fast online
queries and exploration. Data warehouses may
aggregate enormous amounts of data from many
different operational systems.
• Data Mart
– a database focused on addressing the concerns of
a specific problem or business unit (e.g. Marketing,
Engineering). Size doesn’t define data marts, but
they tend to be smaller than data warehouses.
Data Warehouses & Data Marts
3rd party data
Data Mart
(Marketing)
TPS
& other
operational
systems
= operational clients
= query, mining, etc.
Data
Warehouse
Data Mart
(Engineering)
Differing System Demands
Operational Systems
network traffic
& processor
demands
time
Managerial Systems
network traffic
& processor
demands
time
Query Tools & OLAP
• Query Tools
– user-lead discovery. Can return individual records
or summaries. Requests are formulated in advance
(e.g. “show me all delinquent accounts in the
northeast region during Q1”).
• OLAP - Online Analytical Processing
– user-lead discovery. Data is explored via “drill
down” into the data by selecting variables to
summarize on. Results are usually reported in a
cross-tab report or graph (e.g. “show me a tabular
breakdown of sales by product, customer, and
date”).
OLAP
Online
Analytical
Processing
OLAP
• Online Analytical Processing. (example
of cross-tab results presented below)
1. business unit
2. product type
3. year
Executive Dashboard – aggregated report presentation of key business
indicators
Data Mining
• automated information discovery process,
uncovers important patterns in existing
data
– can use neural networks, regression, decision
trees, or other approaches.
– Requires ‘clean’, consistent data. Historical
data must reflect the current environment.
• e.g. “What are the characteristics that
identify if a customer is likely to default on
a loan?”
Insuring Success
Over 5 years:
• Revenues up 26%
to $33.9 billion
• Profits up 180% to
$3.1 billion
• Stock up 187%
– Insurance stocks
up only 31%