Ch 3 and Mod C

Download Report

Transcript Ch 3 and Mod C

Chapter 3 and Module C
DATABASES AND DATA
WAREHOUSES
Supporting the Analytics-Driven
Organization
Opening Case:
Did You Know CDs Come from
Dead Dinosaurs?
In 2010, more than half of all music was in digital form; physical music
will never again be the norm
INTRODUCTION

Business intelligence (BI)


Knowledge about your customers,
competitors, business partners, competitive
environment, and internal operations to make
effective, important, and strategic business
decisions
Analytics


Fact-based decision-making
Integrated use of IT and statistical techniques
to create BI
Data Processing

IT tools help process information to create
business intelligence according to…


OLTP
OLAP
Data Processing

Online transaction processing (OLTP)

The gathering and processing transaction information, and
updating existing information to reflect the transaction



Databases support OLTP
Operational database – databases that support OLTP
Online analytical processing (OLAP)

The manipulation of information to support decision
making



Databases can support some OLAP
Data warehouses only support OLAP, not OLTP
Data warehouses are special forms of databases that support
decision making and help build BI
OLTP, OLAP, and Business Intelligence
THE RELATIONAL DATABASE
MODEL

There are many types of databases
The relational database model is the most
popular

Relational database

Database Characteristics
1.
2.
3.
4.
Collections of information
Created with logical structures
Include logical ties within the information
Include built-in integrity constraints
1. Database – Collection of Information
2. Database – Logical Structure
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Logical Structure: Character
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Logical Structure: Field
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Logical Structure: Record
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Logical Structure: File
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Logical Structure: Database
Advisor
Class
Advisor ID ALastName AFirstName






Character
Field
Record
File (Table)
Database
Data
Warehouse
101
Leonard
Lori
102
Aurigemma
Sal
103
Bajaj
Akhilesh
104
Platner
Steve
105
McCrary
Mike
Class
Synonym
10342
10344
10359
10450
10578
10643
Student
Student ID
1011
1012
1013
1014
1015
1016
1017
SLastName
Berry
Smith
Sanders
Anderson
Whitman
Jones
Phillips
SFirstName Advisor ID
Jeff
101
Tom
103
Tally
101
Cindy
103
Amy
102
Kelsi
105
Susan
104
Class Prefix
Class No
MIS
MIS
MIS
MIS
MIS
MIS
3003
1123
4133
1123
2013
4053
Class
Section
3
2
2
1
3
1
Student-Class
Student ID
Class Synonym
1011
10342
1011
10643
1013
10578
1014
10342
1014
10359
1014
10450
1015
10578
1016
10342
1017
10344
1017
10450
Database – Physical Structure



Bits
Bytes
Words
Databases – Created with Logical
Structures



Databases have many tables
In databases, the row number is irrelevant;
not true in spreadsheet software
In databases, column names are very
important. Column names are created in
the data dictionary
Database – Created with Logical
Structures

Data dictionary – contains the logical
structure for the information in a database
Before you can enter information
into a database, you must define
the data dictionary for all the
tables and their fields. For
example, when you create the
Truck table, you must specify that
it will have three pieces of
information and that Date of
Purchase is a field in Date
format.
3. Databases – With Logical Ties
Within the Information





Logical ties must exist between the tables
or files in a database
Logical ties are created with primary and
foreign keys
Primary key (PK)
Composite primary key (CPK)
Foreign key (FK)
Database – Logical Ties within the
Information
Customer Number is the
primary key for Customer and
appears in Order as a foreign
key
Logical Ties – Keys





A PK and a FK do not have to have the
same name.
If a record can be uniquely identified with
only one PK, then the file should only have
one.
A PK is required (or CPKs) for each file.
A FK may or may not exist for each file.
All CPKs do not have to be FKs.
4. Databases – Built-In Integrity
Constraints


Integrity constraints – rules that help
ensure the quality of the information
Examples




Primary keys must be unique
Foreign keys must be present
Sales price cannot be negative
Phone number must have area code
Steps in Developing a Database


Step 1: Define Entity Classes (tables) and
Primary Keys
Step 2: Defining Relationships Among Entity
Classes




ERD (entity relationship diagram)
Normalization: (1) eliminate M:M; (2) fields
must depend on PK; (3) no derived fields
Step 3: Defining Information For Each
Relation
Step 4: Use A Data Definition Language To
Create Your Database
DATABASE MANAGEMENT
SYSTEM TOOLS
5 Components of a DBMS
1.
2.
3.
DBMS engine
Data definition subsystem
Data manipulation subsystem
Views
 Report generators
 QBE tools
 SQL

4.
5.
Application generation subsystem
Data administration subsystem
View

View – allows you to see the contents of a
database file, make changes, and query it to
find information
Report Generator

Report
generator –
helps you quickly
define formats of
reports and what
information you
want to see in a
report
Query-by-Example Tool

QBE tool – helps you graphically
design the answer to a question
Structured Query Language

SQL – standardized fourth-generation
query language found in most DBMSs
 Sentence-structure equivalent to QBE
 Mostly used by IT professionals
 Non-procedural language, which makes it
different from other programming
languages
DATA WAREHOUSES AND DATA
MINING





Data warehouses support OLAP and
decision making
Data warehouses do not support OLTP
Data warehouse
Data mart
Data-mining
Data Warehouse Example
Data Mart Example
Data-Mining Tools
Data Warehouse Considerations
 Do
you really need one, or does your
database environment support all your
functions?
 Do all employees need a big data
warehouse or a smaller data mart?
 How up-to-date must the information be?
 What data-mining tools do you need?
INFORMATION OWNERSHIP
 Information
is a resource you must
manage and organize to help the
organization meet its goals and objectives
 You need to consider
 Strategic
management support
 Sharing information with responsibility
 Information cleanliness
Strategic Management Support
•
•
Data administration – function that
plans for, oversees the development of,
and monitors the information resource
Database administration – function
responsible for the more technical and
operational aspects of managing
organizational information
Sharing Information
 Everyone
can share – while not consuming
– information
 But someone must “own” it by accepting
responsibility for its quality and accuracy
Information Cleanliness
 Related
to ownership and responsibility for
quality and accuracy
 No duplicate information
 No redundant records with slightly
different data, such as the spelling of a
customer name
 GIGO – if you have garbage information
you get garbage information for decision
making