28-311 Management Information Systems

Download Report

Transcript 28-311 Management Information Systems

710: Foundations of Information
Systems
Database Systems
Chapter 3
Database
• Organized collection of related data
– Databases: A list of phone numbers and
names. List of book titles and authors. List
of customers and sales figures.
– Not databases: List of book titles and
phone numbers. Pile of papers on a desk.
• Major Problems
– Data are usually not collected in a very
organized fashion
– Too much data – not enough information
2
© Jakob Iversen, 2002
The Data Hierarchy
3
© Jakob Iversen, 2002
Traditional File Environment
4
© Jakob Iversen, 2002
Problems with the File Approach
• data redundancy - the same piece of
information could be duplicated in
several places
• data inconsistency - the various copies
of the data no longer agree
• data isolation - difficulty in accessing
data from different applications
• data integrity - data values don’t adhere
to integrity constraints
5
© Jakob Iversen, 2002
Database : The Modern Approach
• One database – several programs
• Data changes only in one place
• Examples: MS Access, Oracle, DB2
6
© Jakob Iversen, 2002
Advantages of Database Approach
• Reduced data
redundancy
• Shared data and
information resources
• Improved data integrity
• Easier modification and
updating
• Data and program
independence
• Better access to data
and information
• Standardization of data
access
• Framework for program
development
• Better overall protection
of the data
• Improved strategic use
of corporate data
7
© Jakob Iversen, 2002
Disadvantages of Database Approach
8
© Jakob Iversen, 2002
DBMS Components
• Data model
– defines the way data are conceptually structured
• Data definition language (DDL)
– defines what types of information are in the database and how they
will be structured
– functions of the DDL
• provide a means for associating related data
• indicate the unique identifiers (or keys) of the records
• set up security access and change restrictions
• Data manipulation language (DML)
– query the contents of the database, store or update information in
the database, and develop database applications
– Structured query language (SQL) - most popular relational database
language, combining both DML and DDL features
• Data Dictionary (metadata)
– stores definitions of data elements and data characteristics
© Jakob Iversen, 2002
9
DBMS: Logical versus Physical View
• Physical view
– Actual, physical arrangement and location of
data
– Described in a schema (describes entire
database)
• Logical view
– represents data in a format that is meaningful to
a user and to the software programs that
process that data
– Can be different for different users as described
in subschemas
– Underlying structure may change but
subschema (user view) remains the same
10
© Jakob Iversen, 2002
Use of Schemas and Subschemas
11
© Jakob Iversen, 2002
Database : Centralized database
• all related files in one
location
• single mainframe
computer
• Users can work on a
database as a whole at
one location
• files only accessible via
the host computer
• disaster recovery can be
more easily
accomplished at a central
location
• vulnerable to a single
point of failure
• speed problem
12
© Jakob Iversen, 2002
Database : Distributed database
• complete copies (or
portions) of a
database, in more
than one location
• replicated database complete copies of
entire database
available at many
locations: No singlepoint-of-failure and
increased
responsiveness
• partitioned database
- a portion of the
entire database in
each location
• This is planned
redundancy (p. 106)
13
© Jakob Iversen, 2002
Logical Data Models
• A manager’s ability to use a database is
highly dependent on how the database
is structured logically and physically.
• In logically structuring a database,
businesses need to consider the
characteristics of the data and how the
data will be accessed.
• Three common data models:
hierarchical, network, and relational
14
© Jakob Iversen, 2002
Hierarchical Model
• Fast access, large installed base
• Best with one-to-many relationship btwn data
• Cumbersome, redundant data
15
© Jakob Iversen, 2002
Network Model
•
•
•
•
Related data ordered in sets
Member/owner of set
Can handle many-to-many relationships
How do we notify customers who ordered a
defective product?
© Jakob Iversen, 2002
16
Relational Model
• Data organized in tables
that are related through
stored values
• Tables
– Files
• Tuples
– Records
• Attributes
– Fields
• Structured Query
Language (SQL)
– Query language that
simplifies access to
data
– MS Access makes it
even simpler!
SQL Example:
SELECT (Customer_Name and Customer_Address)
FROM Customer_Table
WHERE Credit_Limit > 5000
17
© Jakob Iversen, 2002
The Entity Relationship Model
Relationship
Entity
Attribute
© Jakob Iversen, 2002
Primary Key
18
Relational Database Example
19
© Jakob Iversen, 2002
A Relational Database Model
• Identify for each table:
–
–
–
–
–
Records
Fields
Field values
Primary keys
Foreign keys
20
© Jakob Iversen, 2002
Queries can combine data
21
© Jakob Iversen, 2002
Problems with redundant data
22
© Jakob Iversen, 2002
Comparing Data Models
Model Advantages
Disadvantages
Hierarchi- Speed and efficiency in search
cal
Access to data is predefined by
exclusively hierarchical
relationships, predetermined by
administrator. Limited search/
query flexibility. Not all data is
naturally hierarchical.
Network
Many more relationships
between data elements can be
defined. Greater speed and
efficiency than relational
database models.
Relational Conceptual simplicity; no
predefined relationships among
data. High flexibility in ad hoc
querying. New data and records
can be added easily
The most complicated model to
design, implement, and maintain.
More flexibility than hierarchical
model, but less than relational
model.
Lower processing efficiency and
speed. Data redundancy is
common, requiring additional
maintenance.
23
© Jakob Iversen, 2002
Worldwide Dabase Market Share, 2002
NCR
3%
Others
9%
Microsoft
18%
IBM
36%
Oracle
34%
© Jakob Iversen, 2002
Total revenue: $6.6 billion
Source: Gartner Dataquest
24
Selecting a DBMS
• Database Size
• Number of
concurrent users
• Performance
• Integration
• Features
• The Vendor
• Cost
25
© Jakob Iversen, 2002
Data Warehousing
• Data extracted from production systems
• Historical data for decision making
• Concerns
–
–
–
–
–
Data extraction (when, from where, what data)
Data cleaning
Timeliness
Business mergers
Analysis: Data mining and Online Analytical
Processing (OLAP)
26
© Jakob Iversen, 2002
Data Warehouse
Elements
27
© Jakob Iversen, 2002
Data Mart
• A data warehouse for single division or
department
• Easier and cheaper to set up
• More detailed data
• But might create ’islands’ of unlinked
information
28
© Jakob Iversen, 2002
OLAP and Data Mining
29
© Jakob Iversen, 2002
Next Week
• Lecture
– Chapter 4: Telecommunication, Internet,
Extranet
• Presentations from Team 5 and 1
• Assignment 1 due for Team 3 and 4
30
© Jakob Iversen, 2002