Why Manage Data?
Data Management
Options
Dr. Merle P. Martin
MIS Department
CSU Sacramento
Acknowledgments
Dr. Russell Ching (MIS Dept)
Source Material / Graphics
Edie Schmidt (UMS) - Graphic Design
Prentice Hall Publishing (Permissions)
Martin, Analysis and Design of
Business Information Systems, 1995
Agenda
Why manage data?
Definitions
Typical problems
Data Administrator
The DBMS
Distributing data
Why Manage Data?
Delayed output
(paycheck)
Locate a resource
Where is the stock
item stored?
Where does the
employee work?
Why Manage Data?
Make resource decisions
Should we turn account
over to collection agency?
Should we send customer
letter asking why he / she
hasn’t shopped here in 6 months?
Should we give employee overtime?
Why Manage Data?
Determine resource status
Is there enough stock in
warehouse to satisfy this
customer’s order?
How much should I order?
What is the value of
this resource?
balance sheet
Definitions
File: resource inventory:
Material
People
Employees, customers
Funds
Customer balances
Accounts Payable
Definitions
Data Organization
Bit / byte
Character
Field
Record
File
DBMS
Data Hierarchy for
Stereos to Go
[Figure: data hierarchy, from largest unit to smallest]
Database
File (a set of records)
Record: 12345 Smith John A 123 Main Street Sacramento CA 95819
Field: Smith
Character (Byte): 10110011
Bit: 1
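The hierarchy above can be walked in a few lines of Python. This is only an illustrative sketch: the field names are assumptions (the slide shows values, not names), and ASCII encoding is assumed, so the bit pattern printed for "S" differs from the illustrative pattern on the slide.

```python
# Record values come from the slide; the field names are assumed for illustration.
record = {
    "account_no": "12345",
    "last_name": "Smith",
    "first_name": "John",
    "middle_initial": "A",
    "street": "123 Main Street",
    "city": "Sacramento",
    "state": "CA",
    "zip_code": "95819",
}

customer_file = [record]                   # file: a collection of records
database = {"customers": customer_file}    # database: a collection of files


def char_to_bits(ch: str) -> str:
    """Return the 8-bit pattern of one character (ASCII assumed)."""
    return format(ord(ch), "08b")


field = record["last_name"]       # field: "Smith"
first_char = field[0]             # character (byte): "S"
bits = char_to_bits(first_char)   # eight bits
first_bit = bits[0]               # a single bit

print(field, first_char, bits, first_bit)
```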
Definitions
Views:
Physical - how stored
Logical - how viewed
and used
Volatility: % of records that change
Immediacy: rapidity of change
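Volatility, as defined above, is a simple percentage. A minimal sketch follows; the function name, the reporting period, and the record counts are hypothetical.

```python
def volatility(records_changed: int, total_records: int) -> float:
    """Percentage of records in a file that changed during a period."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return 100.0 * records_changed / total_records


# Hypothetical figures: 150 of 1,000 customer records changed this month.
monthly_volatility = volatility(150, 1000)
print(f"{monthly_volatility:.1f}% of records changed")
```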
Storage Problems
Redundancy
Accuracy
Security
Lack of data sharing
Report inflexibility
Inconsistent data definitions
Too much data
information overload
Data Administrator
Clean up data definitions
Control shared data
Manage distributed data
Maintain data quality
Clean Up Definitions
Synonyms / aliases
Standard data definitions
names and formats
Date of Birth (AJIS)
mm/dd/yy (courts)
dd/mm/yy (corrections)
Data Dictionary
COBOL
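The date-of-birth example shows why standard formats matter: the same characters mean different dates in the two systems. The sketch below converts both formats to one standard representation. The system keys and sample dates are assumptions for illustration, not part of AJIS.

```python
from datetime import date, datetime

# Assumed mapping from each source system to the format shown on the slide.
SOURCE_FORMATS = {
    "courts": "%m/%d/%y",        # mm/dd/yy
    "corrections": "%d/%m/%y",   # dd/mm/yy
}


def standard_birth_date(raw: str, source: str) -> date:
    """Parse a date of birth using the format of the system it came from."""
    return datetime.strptime(raw, SOURCE_FORMATS[source]).date()


# The same person, recorded by two systems.
# Note: two-digit years are ambiguous; strptime reads 69-99 as 1969-1999.
courts_dob = standard_birth_date("03/04/75", "courts")
corrections_dob = standard_birth_date("04/03/75", "corrections")

print(courts_dob.isoformat(), corrections_dob.isoformat())
```

Read without the data dictionary, "03/04/75" would be March 4 in one system and April 3 in the other.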
Control Shared Data
Local - used by one unit
Shared - used by two
or more activities
Impact of proposed program
changes on shared data
Program-to-data element matrix
Control or clearinghouse?
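A program-to-data element matrix can be kept as a simple mapping and queried before a program change is approved. This is a sketch only; the program names and data elements are hypothetical.

```python
# Hypothetical matrix: each program and the data elements it uses.
MATRIX = {
    "payroll": {"EMP-ID", "EMP-YTD-PAY"},
    "personnel": {"EMP-ID", "EMP-LASTNAME"},
    "inventory": {"STOCK-NO", "QTY-ON-HAND"},
}


def programs_using(element: str) -> list:
    """Programs affected if the given data element changes."""
    return sorted(p for p, elements in MATRIX.items() if element in elements)


def shared_elements() -> set:
    """Data elements used by two or more programs (shared data)."""
    all_elements = set().union(*MATRIX.values())
    return {e for e in all_elements if len(programs_using(e)) >= 2}


print(programs_using("EMP-ID"))
print(shared_elements())
```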
Manage Distributed
Data
Geographically dispersed
whether shared data
or not
Different levels of detail
different management levels
[Figure: information characteristics by management level]
Characteristic: Operational Control -> Management Control -> Strategic Planning
Source: Internal -> External
Scope: Well defined -> Wide
Level of Aggregation: Detailed -> Aggregate
Time Horizon: Historical -> Future
Currency: Highly current -> Quite old
Required Accuracy: High -> Low
Frequency of Use: Very frequent -> Infrequent
Maintain Data Quality
Put owners in charge
of data
verify data accuracy
and quality
Fairbanks Court example
Who owns the data?
Issue
Should the Data
Administrator control
ALL data,
or just that data that crosses
organizational boundaries?
WHAT DO YOU THINK?
The DBMS
Data Base Management
System: software that
permits a firm to:
centralize data
manage them efficiently
provide access to applications
such as payroll, inventory
DBMS Components
Data Definition
Language (DDL)
Data Manipulation
Language (DML)
Inquiry Language (IQL)
Teleprocessing Interface (TP)
Martin, Figure 16-5
[Figure: DBMS components around the database: designers (DDL); programmers and application software (DML, update); end-users (IQL, retrieve); teleprocessing interface]
IQL Language
SELECT EMP-ID,
EMP-FIRSTNAME,
EMP-LASTNAME,
EMP-YTD-PAY
FROM EMPLOYEE
WHERE EMP-ID = 1234
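The inquiry above closely resembles SQL, so it can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: hyphens are replaced by underscores because unquoted SQL identifiers cannot contain hyphens, and the table contents are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE employee (
           emp_id        INTEGER PRIMARY KEY,
           emp_firstname TEXT,
           emp_lastname  TEXT,
           emp_ytd_pay   REAL
       )"""
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1234, "John", "Smith", 18500.0), (5678, "Jane", "Smith", 21250.0)],
)

# The slide's inquiry, with a bound parameter for the employee ID.
row = conn.execute(
    """SELECT emp_id, emp_firstname, emp_lastname, emp_ytd_pay
       FROM employee
       WHERE emp_id = ?""",
    (1234,),
).fetchone()
conn.close()

print(row)
```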
3-level Database Model
James Martin
Sprague / McNurlin,
Fig. 7-2, pg. 207
External Level (1)
User views (logical)
By application program
Each has unique view
Schema / subschema
Schema and Subschemas
[Figure: pairs of users share a subschema (individual views); the DBMS software relates the subschemas to the schema (overall view of the data) and the physical database]
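In a relational product, one common way to approximate subschemas is with views over a single schema. The sketch below uses sqlite3. The table, the view names, and the sample row are hypothetical, and views are only one possible realization of the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Schema: the overall view of the data.
    CREATE TABLE employee (
        emp_id       INTEGER PRIMARY KEY,
        emp_lastname TEXT,
        department   TEXT,
        emp_ytd_pay  REAL
    );
    INSERT INTO employee VALUES (1234, 'Smith', 'MIS', 18500.0);

    -- Subschemas: each application sees only the fields it needs.
    CREATE VIEW payroll_view AS
        SELECT emp_id, emp_ytd_pay FROM employee;
    CREATE VIEW directory_view AS
        SELECT emp_lastname, department FROM employee;
    """
)

payroll_cols = [d[0] for d in conn.execute("SELECT * FROM payroll_view").description]
directory_cols = [d[0] for d in conn.execute("SELECT * FROM directory_view").description]
conn.close()

print(payroll_cols, directory_cols)
```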
Enterprise Level (2)
Under control of Data
Administrator
DBMS
Implementation data removed
passwords
report views
Physical Level (3)
Schema
Pointers
(e.g., next record)
Flags
(e.g., record frozen)
Traditional Data
Models
Hierarchical - one parent
Network
more than one parent
student to course, major
Relational (tables)
Hierarchical Model
[Figure: Project 1 at the top; Dept. A, Dept. B, and Dept. C below it; Employees 1 through 6 below the departments]
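The one-parent rule of the hierarchical model can be sketched as a tree in which every node stores a single parent. The class below is hypothetical, and the assignment of employees to departments is assumed, since the figure's connecting lines are not available in the transcript.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A record in a hierarchy: exactly one parent, any number of children."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up the single chain of parents to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


project = Node("Project 1")
dept_a = project.add_child("Dept. A")
dept_b = project.add_child("Dept. B")
# Assumed placement of employees, for illustration only.
employee_1 = dept_a.add_child("Employee 1")
employee_2 = dept_b.add_child("Employee 2")

print(employee_1.path())
```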
Network Model
[Figure: John Smith and Jane Smith linked to accounts: Savings, Mortgage, Checking]
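In the network model a record may have more than one parent. A minimal sketch with two lookup tables follows; which accounts are jointly owned is an assumption made for illustration.

```python
from collections import defaultdict

owners_of = defaultdict(set)    # account -> its parents (owners)
accounts_of = defaultdict(set)  # owner -> linked accounts


def link(owner: str, account: str) -> None:
    """Record a link in both directions, as a network structure allows."""
    owners_of[account].add(owner)
    accounts_of[owner].add(account)


# Assumed ownership: Checking and Mortgage are joint, Savings is not.
link("John Smith", "Checking")
link("Jane Smith", "Checking")
link("John Smith", "Mortgage")
link("Jane Smith", "Mortgage")
link("Jane Smith", "Savings")

print(sorted(owners_of["Checking"]))    # an account with two parents
print(sorted(accounts_of["Jane Smith"]))
```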
Relational
Customer: Account Number, First Name, Middle Initial, Last Name, ..., Credit Limit
Orders: Order Number, Order Date, Account Number, Date Shipped
Line Items: Order Number, Line Item Number, Product Code, Quantity
Products: Product Code, Product Name, Price, Unit, Manufacturer Code
Manufac(turer): Manufacturer Code, Manufacturer Name
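The relational tables above are linked only by shared key values (Account Number, Order Number, Product Code, Manufacturer Code). The sqlite3 sketch below joins them to list the lines of one order. Column names are adapted from the slide, some customer columns are omitted for brevity, and all rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (account_number INTEGER PRIMARY KEY,
                           last_name TEXT, credit_limit REAL);
    CREATE TABLE manufacturer (manufacturer_code TEXT PRIMARY KEY,
                               manufacturer_name TEXT);
    CREATE TABLE products (product_code TEXT PRIMARY KEY, product_name TEXT,
                           price REAL, unit TEXT, manufacturer_code TEXT);
    CREATE TABLE orders (order_number INTEGER PRIMARY KEY, order_date TEXT,
                         account_number INTEGER, date_shipped TEXT);
    CREATE TABLE line_items (order_number INTEGER, line_item_number INTEGER,
                             product_code TEXT, quantity INTEGER,
                             PRIMARY KEY (order_number, line_item_number));

    -- Invented sample rows.
    INSERT INTO customer VALUES (12345, 'Smith', 2000.0);
    INSERT INTO manufacturer VALUES ('M1', 'Acme Audio');
    INSERT INTO products VALUES ('P100', 'Receiver', 250.0, 'each', 'M1');
    INSERT INTO products VALUES ('P200', 'Speaker', 100.0, 'each', 'M1');
    INSERT INTO orders VALUES (1, '1995-03-01', 12345, NULL);
    INSERT INTO line_items VALUES (1, 1, 'P100', 1);
    INSERT INTO line_items VALUES (1, 2, 'P200', 2);
    """
)

# Follow the shared keys from order to customer, product, and manufacturer.
rows = conn.execute(
    """SELECT c.last_name, p.product_name, m.manufacturer_name,
              li.quantity * p.price AS line_total
       FROM orders o
       JOIN customer c     ON c.account_number = o.account_number
       JOIN line_items li  ON li.order_number = o.order_number
       JOIN products p     ON p.product_code = li.product_code
       JOIN manufacturer m ON m.manufacturer_code = p.manufacturer_code
       WHERE o.order_number = ?
       ORDER BY li.line_item_number""",
    (1,),
).fetchall()
conn.close()

for r in rows:
    print(r)
```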
Object-oriented DBMS
An object is:
a piece of data PLUS
procedures performed
on data PLUS
attributes describing data
PLUS
relationship between object
and other objects
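The four parts of an object listed above map naturally onto a class. The sketch below is a hypothetical illustration in plain Python, not the interface of any particular object-oriented DBMS.

```python
class Account:
    """An object: data, procedures, describing attributes, relationships."""

    account_type = "checking"   # attribute describing the data

    def __init__(self, number: str, balance: float = 0.0):
        self.number = number    # the data itself
        self.balance = balance
        self.owner = None       # relationship to another object

    def deposit(self, amount: float) -> None:
        """A procedure performed on the data."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class Customer:
    def __init__(self, name: str):
        self.name = name
        self.accounts = []      # relationship: one customer, many accounts

    def open_account(self, account: Account) -> None:
        account.owner = self
        self.accounts.append(account)


customer = Customer("John Smith")
account = Account("12345", balance=100.0)
customer.open_account(account)
account.deposit(50.0)

print(account.owner.name, account.account_type, account.balance)
```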
Distributed Data
Goals:
move processing as
close to users as possible
allow several applications to run
simultaneously on same data
Distributed Types
Fragmented
distribute data without
duplication
users unaware of
where data located
Segmented
data duplicated
one site has master file
problem with data synchronization
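The two distribution types can be contrasted in a short sketch that follows the slide's terminology. Site names, record contents, and function names are all hypothetical, and real products handle routing and synchronization in far more elaborate ways.

```python
# Fragmented: each record is stored at exactly one site (no duplication).
FRAGMENTS = {
    "site_a": {101: "Smith"},
    "site_b": {202: "Jones"},
}


def find_customer(customer_id: int):
    """The caller never names a site: location is hidden from the user."""
    for records in FRAGMENTS.values():
        if customer_id in records:
            return records[customer_id]
    return None


# Segmented: data is duplicated, and one site holds the master file.
master = {101: "Smith"}
copies = {"branch_a": dict(master), "branch_b": dict(master)}


def update_master(customer_id: int, value: str) -> None:
    """Change the master file only; the copies become stale."""
    master[customer_id] = value


def stale_sites() -> list:
    """Sites whose copy no longer matches the master file."""
    return sorted(site for site, copy in copies.items() if copy != master)


def synchronize() -> None:
    """Refresh every copy from the master file."""
    for site in copies:
        copies[site] = dict(master)
```

Calling update_master and then stale_sites before synchronize shows the synchronization problem: every copy is out of date until it is refreshed.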
Why Distribute?
Save money
offload DB processes
to less expensive
machines (PCs)
Lower telecommunications costs
DB closer to users
Decrease dependence on a single
computer manufacturer
Why Distribute?
Move control closer
to owner
Increased DBMS scope
more varied types of data
link at workstations
Permit storage of multimedia data
True Distributed DB
Local autonomy
(ownership)
No reliance on central site
Continuous operations
not affected by another site
Data transparency
Independence
Independence
Fragmentation independence
Replication independence
Hardware independence
Software independence
Network independence
Database independence
Problems With
Distributed Databases
Security
Shared data
simultaneous update
Complexity
Need telecommunications
infrastructure
Issue
Is data in your organization
totally distributed?
How?
Should it be?
Why or why not?
Points to remember
Definition
Typical problems
Role of Data Administrator
The DBMS
Distributing data