Lecture 6 - Boston College Personal Web Server

Download Report

Transcript Lecture 6 - Boston College Personal Web Server

MD703
Final Class:
Database
Internal / External Integration
Outsourcing
Database Technologies
Relational Concepts
Data Warehouses & Marts
Queries, OLAP, Data Mining
Terms/Examples
Server - responds to client requests
request
response
Client - makes requests
of the DBMS server
• Database
DBMS - the program.
Manages interaction with
databases.
database - the collection of data.
Created and defined to meet the
needs of the organization.
– a collection of related data. Usually organized according to
topics: e.g. customer info, products, transactions
• Database Management System (DBMS)
– a program for creating & managing databases; ex. Oracle,
MS-Access, Sybase
A Simple Database
CUSTID
2001
2002
2003
2004
2005
FIRST
John
Abby
Warren
Peter
Charles
LAST
Gallaugher
Johnson
Buffet
Lynch
Schwab
CITY
Newton
Boston
Omaha
Marblehead
San Francisco
STATE
MA
MA
NE
MA
CA
…
…
…
…
…
…
• File/Table
– Customers
• Field/Column
– 5 shown: CUSTID, FIRST, LAST, CITY, STATE
• Record/Row
– 5 shown: one for each customer
A More Complex Example
LAST
Gallaugher
Gallaugher
Gallaugher
Johnson
Johnson
Buffet
Buffet
Buffet
Lynch
Lynch
Schwab
Schwab
CITY
Newton
Newton
Newton
Boston
Boston
Omaha
Omaha
Omaha
Marblehead
Marblehead
San Francisco
San Francisco
STATE
MA
MA
MA
MA
MA
NE
NE
NE
MA
MA
CA
CA
BUY/SELLSTOCK SHARES
Buy
MSFT
1000
Buy
INTC
2400
Sell
IBM
3000
Sell
IBM
3000
Sell
INTC
2000
Buy
INTC
1500
Buy
IBM
1700
Sell
AAPL
1900
Buy
AAPL
2000
Sell
AAPL
10000
Buy
MSFT
4500
Buy
INTC
17000
PRICE DATE
90 1/4 12/24/98
80 1/8
7/3/99
114 3/8
7/1/99
110 1/8 6/30/99
94 7/8 8/30/99
90 3/8
7/2/99
101 7/8
1/4/99
18 1/2 2/14/99
19 2/14/99
21 7/8 3/15/99
101 1/8 1/15/99
80 1/8
7/2/99
TIME
CUSTID
12:01 PM
2001
10:51 AM
2001
9:03 AM
2001
4:53 PM
2002
3:15 PM
2002
11:27 AM
2003
2:02 PM
2003
5:00 PM
2003
5:30 PM
2004
11:44 AM
2004
12:38 AM
2005
4:53 PM
2005
• Entry & Maintenance is complicated
– redundant data exists, increases chance of error,
complicates updates/changes, takes up space
Normalize Data:
Remove Redundancy
CUSTID
2001
2002
2003
2004
2005
FIRST
John
Abby
Warren
Peter
Charles
LAST
Gallaugher
Johnson
Buffet
Lynch
Schwab
CITY
Newton
Boston
Omaha
Marblehead
San Francisco
STATE
MA
MA
NE
MA
CA
Customer Table
Transaction Table
One
Many
CUSTID
2001
2001
2001
2002
2002
2003
2003
2003
2004
2004
2005
2005
BUY/SELLSTOCK SHARES
Buy
MSFT
1000
Buy
INTC
2400
Sell
IBM
3000
Sell
IBM
3000
Sell
INTC
2000
Buy
INTC
1500
Buy
IBM
1700
Sell
AAPL
1900
Buy
AAPL
2000
Sell
AAPL
10000
Buy
MSFT
4500
Buy
INTC
17000
PRICE DATE
90 1/4 12/24/98
80 1/8
7/3/99
114 3/8
7/1/99
110 1/8 6/30/99
94 7/8 8/30/99
90 3/8
7/2/99
101 7/8
1/4/99
18 1/2 2/14/99
19 2/14/99
21 7/8 3/15/99
101 1/8 1/15/99
80 1/8
7/2/99
TIME
12:01 PM
10:51 AM
9:03 AM
4:53 PM
3:15 PM
11:27 AM
2:02 PM
5:00 PM
5:30 PM
11:44 AM
12:38 AM
4:53 PM
Key Terms
• Relational DBMS
– manages databases as a collection of
files/tables in which all data relationships are
represented by common values in related tables
(referred to as keys).
– a relational system has the flexibility to take
multiple files and generate a new file from the
records that meet the matching criteria (join).
• SQL - Structured Query Language
– Most popular relational database standard.
Includes a language for creating & manipulating
data.
Now With More Data
CUSTID
2001
2002
2003
2004
2005
FIRST
John
Abby
Warren
Peter
Charles
LAST
Gallaugher
Johnson
Buffet
Lynch
Schwab
One
Many
CITY
Newton
Boston
Omaha
Marblehead
San Francisco
CUSTID
2001
2001
2001
2002
2002
…
BROKID
B003
B001
B003
B001
B003
…
STATE
MA
MA
NE
MA
CA
BUY/SELLSTOCK SHARES PRICE DATE
Buy
MSFT
1000 90 1/4 12/24/97
Buy
INTC
2400 80 1/8
7/3/99
Sell
IBM
3000 114 3/8
7/1/99
Sell
IBM
3000 110 1/8 6/30/99
Sell
INTC
2000 94 7/8 8/30/99
…
…
…
…
…
Many
One
BROKID
B001
B002
B003
FIRST
Ivan
Dennis
Michael
LAST
Boesky
Levine
Milken
…
…
…
…
TIME
12:01 PM
10:51 AM
9:03 AM
4:53 PM
3:15 PM
…
Customer Table
Col. Name Length Type …
CUSTID
4 Char …
FIRST
10 Char …
LAST
15 Char …
CITY
15 Char …
STATE
2 Char …
…
…
…
…
Meta-Data
m
1
m
Broker Table
Col. Name Length Type
BROKID
4 Char
FIRST
10 Char
LAST
15 Char
…
…
…
…
Transaction Table
Col. Name Length Type
CUSTID
4 Char
BROKID
4 Char
BUY/SELL
1 Bool
STOCK
4 Char
SHARES
8 Num
PRICE
6.2 Money
…
…
…
…
…
…
…
…
…
…
1
…
…
…
• Data that describes the characteristics of stored data
• Enterprise Data Model
– consistent, cross-functional, shareable meta-data model
– standardization increases flexibility & use (data to info)
– facilitates the creation of data warehouses
Management Levels of IS
Strategic
Planning
Management
Control
Operational
Control
DSS
MIS
TPS
Warehouses & Marts
• Data Warehouse
– a database designed to support decision-making in
an organization. It is batch-updated and structured
for fast online queries and exploration. Data
warehouses may aggregate enormous amounts of
data from many different operational systems.
• Data Mart
– a database focused on addressing the concerns of
a specific problem or business unit (e.g. Marketing,
Engineering). Size doesn’t define data marts, but
they tend to be smaller than data warehouses.
Data Warehouses & Data Marts
3rd party data
Data Mart
(Marketing)
TPS
& other
operational
systems
Data
Warehouse
= operational clients
= query, OLAP, mining, etc.
Data Mart
(Engineering)
Differing System Demands
Operational Systems
network traffic
& processor
demands
time
Managerial Systems
network traffic
& processor
demands
time
Transform Data from TPS to
Warehouse
• Consolidate data
– e.g. from multiple TPS around the country/world
• “Scrub” the data
– keep definitions consistent (e.g. translate part
numbers/product names if they differ per country)
• Calculate fields (decrease processor load)
• Summarize fields (decrease processor load)
• De-normalize data (ease of use)
Calculated Fields
Customer Service Application:
Customer support person
TPS - focuses on customer info
Total is calculated on the fly
Customer Date
Gallaugher 3/25/98
Johnson
3/26/98
Buffet
3/27/98
Stock
INTC
AAPL
MSFT
Database Query Application:
Marketing manager
Aggregate reporting of
business intelligence
Total calculated in advance
Shares Price
Total
1000
76 1/2 $
76,500
2500
23 1/4 $
58,125
3000
84 $
252,000
Query Tools & OLAP
• Query Tools
– user-lead discovery. Can return individual records
or summaries. Requests are formulated in advance
(e.g. “show me all delinquent accounts in the
northeast region during Q1”).
• OLAP - Online Analytical Processing
– user-lead discovery. Data is explored via “drill
down” into the data by selecting variables to
summarize on. Results are usually reported in a
cross-tab report or graph (e.g. “show me a tabular
breakdown of sales by business unit, product type,
and year”).
OLAP
• Online Analytical Processing. (example
of cross-tab results presented below)
1. business unit
2. product type
3. year
Data Mining
• automated information discovery process,
uncovers important patterns in existing
data
– can use neural networks or other approaches.
Requires ‘clean’, reliable, consistent data.
Historical data must reflect the current
environment.
• e.g. “What are the characteristics that
identify when we are likely to lose a
customer?”
Data Mining Uses
•
•
•
•
•
•
Market Segmentation - e.g. Target
Direct Marketing - e.g. Chase
Market basket analysis - e.g. Wal-Mart
Customer Churn - e.g. Fleet Bank
Fraud Detection - e.g. Bank of America
Cost Reduction Prospecting - e.g. Merk
Medco.
Stupid Data-Miner Tricks
• Ad-Hoc Theories
– when an oddity jumps out of the data, it’s
tempting to develop a theory for it. Sometimes
findings are just statistical flukes.
• Using Too Many Variables
– the more factors considered, the more likely a
relationship will be found - valid or not.
• Not Taking No for an Answer
– it’s OK to stop looking if you can’t find anything.
There are no silver bullets.
• Limited or incorrect interpretation
External & Internal Integration
EDI & XML
Enterprise Resource Planning (ERP)
Outsourcing
External Integration
• EDI - Electronic Data Interchange
– uses standard formats to pass data between
disparate systems
– US format - X.12, European format - UN/EDIFACT
• Cost Savings
– paper order = $50 - $70
– EDI order = $2.50 (VANs / private networks)
– I-EDI order = < $1 (Internet)
• XML - eXtensible Markup Language
– tagging language for the web
Challenges Facing IS Depts.
• Y2K & Legacy Systems
• Globalization (euro, currency issues)
• Rapid Technology Advancement
– e.g. Client/Server & Internet
• IS Staffing & Retention
• Changing Organizational Structures
– e.g. Owens Corning
• Tighter Integration with Buyers &
Suppliers
Legacy Systems
Many firms have limited to no integration across
geographic areas
functional areas (v-chain)
products, plants, &
business units
Infrastructure: general mgmt, planning, finance, IS
HRM: recruiting, hiring, training, and development
Tech. Development: R&D
Procurement
Inbound
Operations
Outbound
Marketing
logistics
logistics
& Sales
Suppliers
Service
Buyers
What is ERP?
• ERP - Enterprise Resource Planning Software
– sometimes called Enterprise
Applications/Packages/Suites/Systems
– connects all of the information which flows through
a company to a single integrated set of systems
– implemented in modules which can be integrated
(all at once or at a later date) e.g. Financials,
Logistics, HR
– may work with a wide variety of databases,
hardware, and operating systems
• Leading Vendors
– SAP, Oracle, JD Edwards, Baan, Peoplesoft
ERP in Action
Sales
ERP in Action
Inventory
Sales
ERP in Action
Production
Inventory
Sales
ERP in Action
Production
Staffing
Inventory
Sales
ERP in Action
Production
Staffing
Purchasing
Inventory
Sales
ERP in Action
Production
Staffing
Inventory
Purchasing
Order Tracking
Sales
ERP in Action
Production
Staffing
Inventory
Purchasing
Order Tracking
Source: BusinessWeek Int’l, 1997
Sales
Planning
The Benefits
• Systems integration - enterprise data model
– squeeze out waste & enable strategies
• Standard software enables – inter-organizational systems (easier if buyers &
suppliers use the same system, e.g. petrochem.
ind.)
– broad selection of add-on packages (e.g. data
warehouses, etc.)
• Package upgrading and new technology
development is handled by vendor
• Speed of deployment
The Risks
• Staff retention
• Tied to a single vendor
• Flexibility limited by options offered by
the vendor
– may inappropriately force generic
processes
– may inappropriately force structure
• Complexity - particularly regarding
mapping and standardizing processes
across the organization.
Make vs. Buy
Comp. Adv.
Will the proposed system offer
proprietary comp. adv.?
Security
Is the process or data highly
confidential?
IT Competency Is IT a core competency?
Tech. Skill
Does the firm have sufficient
expertise with tech.?
Suitability/Fit
Is a suitable partner/package
available?
Cost/Benefit
Is the package cheaper than inhouse dev.?
Time
Is there sufficient time to develop
the system
Adapted from Applegate et al., p. 61.
Make Buy
Yes No
Yes
No
Yes
Yes
No
No
No
Yes
No
Yes
Yes
No
Successful Deployment of ERP
• Business Case
– benchmark, cost justify (e.g. unplug mainframes)
• Leadership
– from the highest levels (e.g. success at Owens Corning,
failure at Westinghouse)
• Staffing
– largely from business, not IT (users know the process)
– ‘compensation handcuffs’ (e.g. end of deployment
bonuses, training payback agreements)
– experienced consultants - check refs., clients
• Execute with proven methodologies
When Outsourcing is Attractive
• IS Function
– commodity services are more easily outsourced
(networking, operations, PC maintenance,
training)
• Project Structure
– structured projects are most appropriate
• Profit / M&A Concerns
– outsourcing generates cash, liquefies assets,
and may make some firms (e.g. banks) more
attractive to suitors (no IS org. to integrate)