Chapter Eight

Download Report

Transcript Chapter Eight

Chapter Eight
Managing Data Resources
File Organization and Concepts
The Data Hierarchy
Human Resources Database
Personnel Table
Database
Payroll Table
Table
James Thompson 36 220 Larkin Drive
Anna Harvey
24 63 Whalen Street
Record
James Thompson 36 220 Larkin Drive
Field
James Thompson (name field)
Byte
1011 10010
Bit
1, 0
File Organization Terms
• Field - a grouping a characters into a word, a
group of words, or a complete number, such
as a person’s name and age. (Key field uniquely identifies)
• Record - A group of related fields.
• File - a group of records of the same type
• Entity -a person, place,thing, or event about
which information must be kept
• Attribute - A piece of information describing a
particular entity.
Illustration of a Record
Entity = ORDER
Attributes
Order
Number
4340
KEY FIELD
Order
Date
2/8/99
Item
Number
Quantity
Amount
1583
2
17.4
FIELDS
Accessing Records from Computer Files
• Sequential File Organization
– a method of storing data records in which the
records must be retrieved in the same physical
sequence in which they are stored
• Direct/Random File Organization
– a method of storing data records in a file so that
they can be accessed in any sequence without
regard to their actual physical order on the
storage media
Data Access Methods
• Indexed Sequential Access Method
– a file access method to directly access records
organized sequentially using an index of key fields
• Direct File Access Method– a method of accessing records by mathematically
transforming the key fields into specific addresses for
the records
• Transform Algorithm
– a mathematical formula used to translate a record’s key
field directly into the record’s direct physical storage
location
Indexed Sequential Access Method
Record
230
Data
Cylinder Index
Cylinder Highest Key
1
200
2
392
3
500
.
.
.
.
Track Index
Cylinder 1
Track # Highest Key
1
9
2
19
3
28
4
39
Track Index
Cylinder 1
Track #
1
2
3
4
Highest Key
208
238
260
279
Track Index
Cylinder 1
Track # Highest Key
1
399
2
419
3
440
4
468
Direct File Access Method
• Basically uses an algorithm within a disk
stack to access the record
• Example - File Size -1000 records
– take the prime number closet to the max
number of records (in this case 997), divide the
key field number by this prime number and the
remainder is the address location number.
A Disk Stack
Access
mechanism
10 Access 404 Tracks
arms
000
403
20
20 Tracks
Read/write
heads
11 disks
A DASD Address
2
0
Track
number
9
0
7
Read/write
head number
0 0
Record
number
3
Traditional File Environment
• A way of collecting and maintaining data
in an organization that leads to each
functional area or division creating its
own data files and programs. Can be
either flat or data approached.
• Problems with TFE
– data redundancy, program data dependence,
inflexibility, poor data security, inability to
share
An Example of Two Flat Files
Table A
SALES
TERR
NO
Table B
SALES NO
NAME
112
1
112
ADAMS
128
3
128
WINKLER
153
2
153
HOUSE
159
1
159
FRANCIS
162
1
162
WILLIS
166
2
166
GROVETON
Problems with TFE Explained
• Data Redundancy
– presence of duplicate data in multiple data files
• Program-Data Dependence
– the close relationship between data and programs. Any
update in the data requires a format change within the
software (ZIP/AREA CODE ex)
• Inflexibility - cannot produce custom reports
• Security - with little control of data, hard to
control access
• Data sharing - with data in many places hard
to share b/c you do not know where it is
The Database Concept
• Database definition
An integrated collection of computer data,
organized and stored in a manner that
facilitates easy retrieval.
• Data independence -- keep data
specifications separate from programs,
in tables and indexes
Database Management System(DBMS)
• The special software to create and
maintain a database and enable
individual business applications to
extract the data they need without
having to create separate files or data
definitions in their computer programs
–
–
–
–
a DBMS has three components
a data definition language
a data manipulation language
a data dictionary
Hierarchical Database
Custormers
Customer
Kathy
Cindy
Debra
Order
Dog
food
Cat
food
Bird
food
Dog
food
Bird
food
..
..
Network Database
Custormers
Customer
Kathy
Order
Dog
food
Cat
food
Cindy
Bird
food
Dog
food
Dog Food
Debra
Bird
food
..
..
Relational Database
• Table format
• Easy access (query)
Customer(CustomerID, Name, OrderID,…)
Order(OrderID, CustomerID, OrderDate,…)
ItemsOrdered(OrderID, ItemID, Quantity)
Customer
CustomerID
Name
..
Order
OrderID
..
…
ItemsOrdered
OrderID
..
..
Database Differences
Comparison of Database Alternatives
DB Type
Flexibility
Hierarchical
Processing
Efficiency
high
low
End-User
Friendliness
low
Network
medium-high
low-med
low-moderate
high
Relational
low but
improving
high
high
low
Programming
Complexity
high
Design a database
Sally’s Pet Store
Animal
Sale
Supplier
Employee
Customer
The Pet store’s Database
Animal
Name AnimalID Category
Breed Dateborn Gender Color
Cathy
Terry
Sandy
Charles
Ruby
Curtis
Lovebird
Canary
African
Parrot
Other
Grey
123
353
345
232
565
675
Bird
dog
dog
Bird
dog
cat
6/5/98
7/1/97
8/5/98
4/6/99
6/4/99
8/5/98
Male
Female
Male
Male
Male
Male
Employee
Name Soc.Sec.No Address Salary
Daniel 232423435
……… ………
…..
……
$23000
AnimalID
123
Yellow
Red
Yellow
Yellow
Yellow
Grey
Structural Query Language (SQL)
Four basic commands:
•
•
•
•
SELECT columns Which property do you want ?
FROM tables
What tables are involved?
JOIN conditions How are the table joined?
WHERE criteria
What are the constraints?
SELECT AnimalID, Category, Dateborn
FROM Animal
WHERE ((Category=“Dog”) AND (Dateborn>#6/1/98#));
Query
Resulting table:
AnimalID
Category
345
565
SELECT Name, AnimalID, Salary
FROM Employee
WHERE (AnimalID=#123#)
dog
dog
Dateborn
8/5/98
6/4/99
SQL(2)
• List all birds who are male or who are born
before 6/1/98 and have red in their color
SELECT AnimalID, Category, Gender, Dateborn, Color
FROM Animal
WHERE (((Category=“Bird”) AND (Gender=“Male”))
OR ((Dateborn<#6/1/98#) AND
(Color Like “*Red*”)))
Back to Database Design
• The main idea is to separate all the 1 to
many relationships.
– Look at page 243, or to your access quiz for
examples
• Each table holds specific information
that is tied to the other tables through
‘keys’ and relationships
• this process is referred to as
normalization
Normalization
• Why?
– To prevent anomalies - occurrences in the
database that result from inefficient database
structure. They also cause harmful events that
can change or erase data.
• There are three types of anomalies
– insertion
– deletion
– modification
Insertion anomalies
• Refers to the inability to add data to the
database because the tables are not
referenced correctly.
• This is to say if we want to add a new
element we would have to add to
numerous tables, and failure to do so
would cause inconsistent data.ex. The
inability to enter a new employee w/o
adding in multiple tables.
Deletion Anomalies
• This is the opposite of the insertion
anomaly. When items are not
referenced correctly we can lose
information.
• Ex. We have an employee database
that does not have a separate
department database, if we delete that
employee we will lose that dept. info.
Modification Anomalies
• This occurs when you can change the
value of one of the attributes (columns)
of a dept, but that data is not changed
throughout the database, unless we go
in and manually change each table.
Other Database Concepts
• Databases on the web
– technology makes this really easy to do
– the gsc example
– makes for more efficient transfer of data as well
as collection, with db security we can have EDI
which is the underlying foundation of
Electronic Commerce
– related to Online Analytical Processing (OLAP)
see book for example
Mgmt Requirements for DB Systems
• Data administration- this is the definition
of information requirements and access
– usually formulated into a information policy
• Data Planning and modeling - the actual
database formation process, what are
‘keys’, attributes, and relationships
• database administration
• users- the idea of making your database
useable by all employees, plus support
Database Administration
• Defines and organizes database
structure and content
• develops security procedures to
safeguard the database
• develops database documentation
• maintains the database mgmt. software
Other terms to look up and know
• Distributed database
• data warehouse
• datamart