chapter 3 - Central Washington University

Download Report

Transcript chapter 3 - Central Washington University

CHAPTER 3
DATA RESOURCE
MANAGEMENT
Learning Objectives
 Examine the
managerial and
organizational
requirements for
managing data as a
resource.
Foundation concepts
 A computer system organizes data in a





hierarchy that starts with
bytes and progresses to
fields,
records,
files, and
databases.
FILE ORGANIZATION
 BIT: the smallest unit of data a computer can handle.
 BYTE: Combination of BITS called a byte, represents a




CHARACTER which can be a letter, a number or another
symbol.
FIELD: A grouping of characters into a word, a group of
words, or a complete number (such as person’s name or
age).
RECORD: A group of related fields, such as the
student’s name, the course taken, the date, and the
grade comprise a RECORD.
FILE: A collection of similar RECORDS.
DATABASE: A group of related files makes up a
DATABASE.
DATABASE
ORGANIZATION’S ELECTRONIC
LIBRARY
STORES & MANAGES DATA
IN A CONVENIENT FORM
*
Example
Student Database
HIERARCHY
Course file
Database
Personal
history file
NAME
File
Record
Financial
file
John Stewart
Karen Taylor
Emily Vincent
COURSE DATE GRADE
IS 101 F02
IS 101 F02
IS 101 F02
NAME
COURSE DATE GRADE
John Stewart IS 101 F02
B+
Field
John Stewart (NAME field)
Byte
01001010 (Letter J in ASCII)
Bit
B+
A
C
0, 1
KEY FIELD
 Every record on a file should contain at
least one field that uniquely identifies
instances of that record so that record can
be retrieved, updated and sorted.
 This identifier field is called KEY FIELD.
 Social security number for a personnel
record would be a key field.
File organization
 In the database, the individual database is a collection of




related attributes about entities.
A record describes an ENTITY. An ENTITY is something
you collect data about, such as people or classes.
A record is a collection of related attributes about a
single entity.
Each record typically consists of many attributes which
are individual pieces of information.
For example, order number, order date, order account,
item number, and item quantity would each be an
attribute of the entity order.
Entity=ORDER
Attributes
Order
number
Order
date
Item
number
Quantity
4340
02/08/03
1583
2
Amount
17.40
fields
Key field
The simple data table
Entity: Student
Attribute types
ID Number Last Name First Name Major
209345
Borden
Chris
Sociology
Attribute
213009
Chan
Lisa
Finance
45789
Austin
John
Law
Record (one row)
Database Management Systems
(DBMS)
 A DBMS is a software application with which you
create, store, organize, and retrieve data from a
single database or several databases.
 In the traditional file environment each functional
area tended to develop systems in isolation from
other functional areas.
 Accounting, finance, manufacturing, human
resources all developed their own systems and
data files.
 The organization is saddled with hundreds of
programs with no one who knows what they do,
what data they use, and who is using the data.
DBMS
DBMS
 DBMS is a software that permits to centralize data,




manage them effectively, and provide access to the
stored data by application programs.
Software to create & maintain data
Enables business applications to extract data.
Rather than storing data in separate files for each
application, data are stored physically to appear to users
as being stored in only one location.
Microsoft Access is an example of a popular DBMS for
personal computers.
DBMS
 DBMS software enables end users to
create and manage their own database
applications.
 Most DBMSs have a specialized language
called a data manipulation language that is
used to manipulate data in the database.
STRUCTURED QUERY LANGUAGE
(SQL)
 SQL is a data manipulation language to
access, read and update data from
databases.
 Most common language used to
interface with the databases.
*
SQL
 The most important SQL commands:
 SELECT: lists the columns from tables that the
user would like to see in a result table.
 FROM: identifies the tables or views from which
the columns will be selected.
 WHERE: includes conditions for selecting
specific rows (records) within a single table and
conditions for joining multiple tables.
SQL
Database
PART
SUPPLIER
Part number
Part
description
Unit price
Supplier
number
137
145
150
152
Door latch
Door handle
Door seal
Compressor
22.50
26.25
6.00
70.00
4058
2038
1058
1125
Supplier
number
4058
2038
1125
Supplier
name
CBM Inc.
Ace Inc.
Bryant Corp.
Supplier address
44 Winslow, Gary, IN 44950
Rte, 101, Essex, NJ 07763
51 Elm, Rochester, NY 11349
SQL
 Suppose you wanted to see the Part_Number,
Part_Description, and Unit_Price for each part in the
PART table. You would specify:
SELECT Part_Number, Part_Description, Unit_Price
FROM PART;
 The results of using the SELECT statement to select
only the columns Part_Number, Part_Description and
Unit_Price from all rows in the PART table.
Part number
Part
description
Unit price
137
145
150
152
Door latch
Door handle
Door seal
Compressor
22.50
26.25
6.00
70.00
SQL
 Suppose, you wanted to see the same data only for
parts in the PART table with unit prices less than $25.00.
You would specify:
SELECT Part_Number, Part_Description, Unit_Price
FROM PART;
WHERE Unit_Price<25.00
The results of using a conditional selection to select only parts that meet
the condition of having unit prices less than $25.
Part number
Part
description
Unit price
137
150
Door latch
Door seal
22.50
6.00
Data manipulation language
 Many DBMS packages have a simpler
way of interfacing with the databasesusing a concept called query by example
(QBE).
 It enables us to fill out a grid, or template,
in order to construct a sample of the data
we want to see.
Database Trends
1.Data Warehouses
 Database that stores current and historical data
of potential interest to managers throughout the
company.
 This data is extracted for various operational
systems and external sources including web site
transactions.
 The data from these diverse applications are
copied into the data warehouse database as
often as needed (hourly, weekly, daily, monthly).
Data Warehouses
 The data are standardized into a common data
model and consolidated so that they can be
used across the enterprise for management
analysis and decision making.
 The data are available for anyone to access as
needed but cannot be altered.
 Much like a physical warehouse for products
and components, the data warehouse stores
and distributes data on computer-based
information systems.
Components of a data warehouse
Operational
data
INTERNAL
DATA
SOURCES
Historical
data
Operational
data
Historical
data
EXTERNAL
DATA
SOURCES
External
data
External
data
Extract and
transform
Data
Warehouse
Information
directory
Data
Access and
analysis
•Data mining
•Reports
2. Data marts
 Companies can build enterprise-wide data
warehouses where a central data
warehouse serves the entire organization,
or they can create smaller, decentralized
warehouses called data marts.
 Small data warehouse for special function,
e.g., Focused marketing based
on customer info.
Data marts
 Subset of a data warehouse in which a summarized or




highly focused portion of the organization's data is
placed in a separate database for a specific population
of users.
Each data mart contains a subset of the data for a
single aspect of a company’s business, for ex. finance,
inventory, or personnel.
Data marts have been popular among small and medium
sized businesses and among departments within larger
companies.
Data marts typically contain tens of gigabytes of data, as
opposed to the hundreds of gigabytes in data
warehouse.
The cost of data mart is typically less than $1 million,
while the cost for a data warehouse can exceed $10
million.
DATABASE ADMINISTRATION
 DEFINES & ORGANIZES DATABASE
STRUCTURE AND CONTENT
 DEVELOPS SECURITY PROCEDURES
 DEVELOPS DATABASE DOCUMENTATION
 MAINTAINS DBMS
*
Management requirements for
database systems
 The database is an organizational
discipline, a method, rather than a tool or
technology.
 It requires organizational change.
 Without management support and
understanding, database efforts fail.
 The critical elements in a database
environment are data administration, data
planning, and database management.
ELEMENTS OF DATABASE
ENVIRONMENT
DATA
DATABASE
ADMINISTRATION
MANAGEMENT
DATABASE
MANAGEMENT
SYSTEM
DATA PLANNING
USERS
Data administration
 The organizations must develop a data
administration function for managing
(collecting, storing, and dissemination of
all types of data) the organization's data
resources.
 Data administration is responsible for the
specific policies and procedures through
which data can be managed as an
organizational resource.
Data planning
 The organization requires enterprise-wide
planning for data.
 Enterprise analysis, which addresses the
information requirements of the entire
organization is needed to develop
databases.
 The purpose of it is to identify the key
entities, attributes, and relationships that
constitute the organization's data.
Database management
 Databases require new software and a new staff
specially trained in DBMS’ techniques as well as
new data management structures.
 Most corporations develop a database design
and management group within the corporation
that is responsible for defining and organizing
the structure and content of the database and
maintaining the database.
Advantages of DBMS
 DBMS can manage hundreds of tables





simultaneously.
Helps us to manage tremendous volume and
complexity of interrelated data.
Prevents unnecessary redundancies of data.
Improved data sharing.
Improved data quality.
Improved data accessibility.
Database Structures
 In all information systems, data resources
must be organized and structured in some
logical manner.
 Contemporary DBMS uses different
database models to keep track of entities,
attributes and relationships.
Database Structures
 Hierarchical
 Network
 Relational
 Object-oriented
TYPES OF RELATIONS
ONE-TO-ONE:
STUDENT
CLASS
ONE-TO-MANY:
STUDENT
A
MANY-TO-MANY:
ID
STUDENT
B
CLASS
1
STUDENT
A
STUDENT
C
CLASS
2
STUDENT
B
STUDENT
C
HIERARCHICAL DATA MODEL
 Hierarchical DBMS depict data logically as one



to-many relationships.
The hierarchical DBMS presents data to users in
a treelike structure.
To the user, each record looks like an
organization chart with one top-level segment
called the root.
An upper segment is connected logically to a
lower segment in a parent-child relationship.
Used for structured, routine types of transaction
processing.
HIERARCHICAL DATA MODEL
ROOT
FIRST
CHILD
2nd
CHILD
Employer
Compensation
Ratings
Salary
Job
Assignments
Pension
Benefits
Insurance
Health
NETWORK STRUCTURE
 Network DBMS depict data logically as many-tomany relationships.
 More flexible than the hierarchical structure in
support of databases for many types of business
operations.
 A typical many-to-many relationship for a
network DBMS is the student-course
relationship. There are many courses in the
university and many students. A student takes
many courses and a course has many students.
NETWORK DATA MODEL
MANY-TO-MANY:
CLASS
1
CLASS
2
STUDENT STUDENT STUDENT
A
B
C
RELATIONAL DBMS
 The most popular type of DBMS today for PCs as well as




for larger computers and mainframes is the relational
DBMS.
It can relate data stored in one table to data in another
as long as the two tables share a common data element.
Easier for programmers to work with and easier to
maintain than the hierarchical and network models.
Leading mainframe relational database management
systems include IBM’s DB2 and Oracle from the Oracle
Corporation.
MS Access is a PC relational database management
system.
RELATIONAL DBMS
PART
SUPPLIER
Part number
Part
description
Unit price
Supplier
number
Supplier
number
Supplier
name
Supplier address
137
145
150
152
Door latch
Door handle
Door seal
Compressor
22.50
26.25
6.00
70.00
4058
2038
1058
1125
4058
2038
1125
CBM Inc.
Ace Inc.
Bryant
44 Winslow, Gary, IN 44950
Rte, 101, Essex, NJ 07763
51 Elm, Rochester, NY 11349
Select Part_number=137 or 152
Join by Supplier_number
Part number
Supplier
number
Supplier
name
Supplier address
137
152
4058
1125
CBM Inc.
Bryant
44 Winslow, Gary, IN 44950
51 Elm, Rochester, NY 11349
Combined data from two different tables and only selected attributes are displayed.
Object-oriented DBMS
 Many applications today and in the future will require
databases that can store and retrieve not only
structured numbers and characters but also drawings,
images, photographs, voice, and full-motion video.
 Stores data & procedures as objects.
 Conventional DBMSs are not well suited to handling
graphics-based or multimedia applications.
 They are becoming more popular because they can
be used to manage the various multimedia
components or Java applets used in Web applications
which typically integrate pieces of info from a variety
of sources.
CREATING A DATABASE
 To create a database, one must go through two
design exercises:
 CONCEPTUAL DESIGN
 PHYSICAL DESIGN
*
CREATING A DATABASE
CONCEPTUAL DESIGN:
 Abstract model from a business
perspective.
 It describes how data will be grouped.
 It establishes end-user needs.
*
CREATING A DATABASE
PHYSICAL DESIGN:
 Detailed model by database specialists
 How data are to be physically stored
and accessed on the storage devices
*
CREATING A DATABASE
Entity-relationship
diagram
 Database designers document the logical
data model with an entity-relationship
diagram (ERD) which is a methodology for
documenting databases illustrating the
relationship between various entities in the
database.
ENTITY- RELATIONSHIP DIAGRAM
Entity Attributes
Entities
ORDER
ORDER: #, DATE, PART #, QUANTITY
1
Relationships
CAN
HAVE
M
PART: #, DESCRIPTION, UNIT PRICE,
SUPPLIER #
PART
M
CAN
HAVE
1
SUPPLIER
SUPPLIER: #, NAME,
ADDRESS