Chapter 6 Study Tool

Download Report

Transcript Chapter 6 Study Tool

Systems Analysis & Design
Sixth Edition
Chapter 7
Data Design
Introduction
● You will develop a physical plan for data
organization, storage, and retrieval
● Begins with a review of data design
concepts and terminology, then discusses
file-based systems and database systems,
including Web-based databases
● Concludes with a discussion of data
storage and access, including strategic
tools such as data warehousing and data
mining, physical design issues, logical and
physical records, data storage formats, and
data control
2
Data Design Concepts
● Data Structures
– A file or table contains data about people,
places, things, or events that interact with the
system
– File-oriented system
• File processing system
– Database system
3
Data Design Concepts
● Overview of File Processing
– Uses various types of files
•
•
•
•
•
•
Master file
Table file
Transaction file
Work file – scratch file
Security file
History file
4
Example of file system
5
Data Design Concepts
● Overview of File Processing
– Potential problems
• Data redundancy
• Data integrity
• Rigid data structure
6
Data Design Concepts
● Overview of Database Systems
– A properly design database system offers a
solution to the problems of file processing
– Provides an overall framework that avoids data
redundancy and supports a real-time, dynamic
environment
– Database management system (DBMS)
– The main advantage of a DBMS is that it offers
timely, interactive, and flexible data access
7
Data Design Concepts
● Overview of Database Systems
– Advantages
•
•
•
•
•
•
•
•
•
•
Scalability(擴充性)
Better support for client/server systems
Economy of scale
Flexible data sharing
Enterprise-wide application – database administrator
(DBA)
Stronger standards
Controlled redundancy
Better security
Increased programmer productivity
Data independence
8
Data Design Concepts
● Database Tradeoffs
– Because DBMSs are powerful, they
require more expensive hardware,
software, and data networks capable of
supporting a multi-user environment
– More complex than a file processing
system
– Procedures for security, backup, and
recovery are more complicated and
critical
9
DBMS Components
● A DBMS provides an interface between a
database and users who need to access the
data
10
DBMS Components
● Interfaces for Users, Database
Administrators, and Related Systems
– Users
• Query language (p. 310 for example)
 Query by example (QBE)
 SQL (structured query language)
– Database Administrators (DBA)
• A DBA is responsible for DBMS management and
support
11
DBMS Components
● Data Manipulation Language
– A data manipulation language (DML) controls
database operations, including storing,
retrieving, updating, and deleting data
● Schema (綱目)
– The complete definition of a database, including
descriptions of all fields, tables, and
relationships, is called a schema
– You also can define one or more subschemas
12
DBMS Components
● Physical Data Repository
– The data dictionary is transformed into a
physical data repository, which also contains the
schema and subschemas
– The physical repository might be centralized, or
distributed at several locations, and different
venders of databases might be used
– Need ODBC-compliant software to resolve
potential database connectivity and access
problems
• ODBC – open database connectivity
• JDBC – Java database connectivity
13
Web-Based Database Design
● Characteristics of Web-Based Design
– In a Web-based design, the Internet serves
as the front end, or interface, for the database
management system
– Internet technology provides enormous power
and flexibility
– Web-based systems are popular because
they offer ease of access, cost-effectiveness,
and worldwide connectivity
14
Web-Based Database Design
● Connecting a Database to the Web
– Database must be connected to the Internet or
intranet
– Database and internet speak two different
“languages”
– Middleware is needed
• A softwaqre that integrates different applications and
allows them to exchange data
• P. 312 for figure 7-9
15
Web-Based Database Design
● Data Security
– Web-based data must be totally secure, yet
easily accessible to authorized users
– To achieve this goal, well-designed systems
provide security at three levels: the database
itself, the Web server, and the
telecommunication links that connect the
components of the system
16
Data Design Terminology
● Definitions
– Entity: a person, place, thing, or event which
data is collected and maintained
– Table or file: a set of related records
– Record
• AKA Tuple
– Field
• AKA attribute
• Common field: an attribute that appears in more than
one entity. Used to link entities
17
Data Design Terminology
● Key Fields (p. 315)
– Primary key: unique field
– Combination key
• Composite key
• Concatenated key
• Multi-valued key
–
–
–
–
Candidate key: could be a p.k.
Nonkey field: not a p.k.or candidate key
Foreign key: 別的table的primary key
Secondary key: not unique, zip code
18
Data Design Terminology
● Referential Integrity: Validity checks can
help avoid data input errors
– A set of rules that avoids data inconsistency and
quality problems
– A foreign key value cannot be entered in one
table unless it matches a existing primary key in
another table
– EX. Referential integrity would prevent you from
entering a customer order in an order table
unless that customer already exists in the
customer table.
– orphan
19
Example of Referential Integrity in
Access
20
Entity-Relationship Diagrams
● Provides an overall view of the system,
and a blueprint for creating the physical
data structures
● An entity is a person, place, thing, or
event for which data is collected and
maintained
21
Entity-Relationship Diagrams
● Drawing an ERD
– The first step is to list
the entities that you
identified during the
fact-finding process
and to consider the
nature of the
relationships that link
them
22
Entity-Relationship Diagrams
● Types of Relationships: p. 318~320
– One-to-one relationship (1:1)
– One-to-many relationship (1:M)
– Many-to-many relationship (M:N)
• Associative entity
● Cardinality
• Cardinality notation
• P.318, 321
23
Normalization
● Table design
● Involves four stages: unnormalized
design, first normal form, second normal
form, and third normal form
● Most business-related databases must
be designed in third normal form
24
Normalization
● Standard Notation Format
– Designing tables is easier if you use a
standard notation format to show a table’s
structure, fields, and primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
25
Normalization
● Repeating Groups and Unnormalized
Designs
– Repeating group
• Often occur in manual documents prepared by users
– Unnormalized design
26
Unnormalized example
● ORDER (ORDER-NUM, ORDER-DATE,
(PRODUCT-NUM, PRODUCT-DESC, NUMORDERED))
27
Normalization
● First Normal Form
– A table is in first normal form (1NF) if it does not
contain a repeating group
– To convert, you must expand the table’s primary
key to include the primary key of the repeating
group
● ORDER (ORDER-NUM, ORDER-DATE,
PRODUCT-NUM, PRODUCT-DESC, NUMORDERED)
– See p. 324
28
Normalization
● Problems found in First Normal Form
– Four kinds of problems are found with 1NF
designs
• Consider the work necessary to change a particular
product’s description
• 1NF tables can contain inconsistent data
• Adding a new product is a problem
• Deleting a product is a problem
29
Normalization
●
Second Normal Form
– A table is in the 2NF if it is in 1NF and if all fields that
are NOT part of the primary key are functionally
dependent on the entire primary key
– To understand second normal form (2NF), you must
understand the concept of functional dependence
– Functionally dependent:函數相依
– Field X is functionally dependent on field Y if the
value of field X depends on the value of field Y
– ORDER_DATE is FD on ORDER_NUM
– DRODUCT_DESC is FD on PRODUCT_NUM
30
Normalization
●
Second Normal Form
– A standard process exists for converting a
table from 1NF to 2NF
1. Create and name a separate table for each
field in the existing primary key
2. Create a new table for each possible
combination of the original primary key
fields
3. Study the three tables and place each field
with its appropriate primary key
Example: p. 326
31
Normalization
● Third Normal Form
– A table design is in third normal form (3NF) if it is in
2NF and if NO nonkey field is dependent on another
nonkey field
– 3NF design avoids redundancy and data integrity
problems that still can exist in 2NF designs
– To convert the table to 3NF, you must remove all
fields from the 2NF table that depend on another
nonkey field and place them in a new table that uses
the nonkey field as a primary key
– Example: p. 327, Figure 7-25
32
Normalization
● A Normalization
Example (p. 329-333)
– To show the
normalization process,
consider the familiar
situation, which depicts
several entities in a
school advising system:
ADVISOR, COURSE,
and STUDENT
33
Unnormalized form
● STUDENT (STUDENT-NUMBER,
STUDENT-NAME, TOTAL-CREDITS, GPA,
ADVISOR-NUMBER, ADVISOR-NAME,
(COURSE-NUMBER, COURSE-DESC,
NUMBER-CREDIT, GRADE))
34
1NF
● STUDENT (STUDENT-NUMBER,
STUDENT-NAME, TOTAL-CREDITS, GPA,
ADVISOR-NUMBER, ADVISOR-NAME,
COURSE-NUMBER, COURSE-DESC,
NUMBER-CREDIT, GRADE)
35
2NF
● STUDENT (STUDENT-NUMBER,
STUDENT-NAME, TOTAL-CREDITS, GPA,
ADVISOR-NUMBER, ADVISOR-NAME)
not 3NF
● COURSE (COURSE-NUMBER, COURSEDESC, NUMBER-CREDIT)
● GRADE (STUDENT-NUMBER, COURSENUMBER, GRADE)
36
3NF
● STUDENT (STUDENT-NUMBER,
STUDENT-NAME, TOTAL-CREDITS, GPA,
ADVISOR-NUMBER)
● ADVISOR (ADVISOR-NUMBER,
ADVISOR-NAME)
● COURSE (COURSE-NUMBER, COURSEDESC, NUMBER-CREDIT)
● GRADE (STUDENT-NUMBER, COURSENUMBER, GRADE)
37
New ERD
● P. 333
38
Steps in Database Design
1. Create the initial ERD
2. Assign all data elements to entities
3. Create 3NF designs for all tables, taking
care to identify all primary, secondary,
and foreign keys
4. Verify all data dictionary entries
5. After creating your final ERD and
normalized table designs, you can
transform them into a database
39
Using Codes During Data Design
● Overview of Codes
– Because codes often are used to represent
data, you encounter them constantly in your
everyday life
– They save storage space and costs, reduce
transmission time, and decrease data entry time
– Can reduce data input errors
40
Using Codes During Data Design
● Types of Codes
1. Sequence codes
2. Block sequence codes (e.g. 100 level course–
entry level course)
3. Alphabetic codes
a. Category codes (CS, EE)
b. Abbreviation codes – mnemonic codes (NY, JFK)
4.
5.
6.
7.
Significant digit codes (e.g. zipcode)
Derivation codes (p. 335, Fig. 7-37)
Cipher codes (用於code價錢)
Action codes (A for add, D for delete)
41
Database Models
● Relational Databases
– The relational model was introduced during the
1970s and became popular because it was
flexible and powerful
– Because all the tables are linked, a user can
request data that meets specific conditions
– New entities and attributes can be added at any
time without restructuring the entire database
– Example in p. 338, figure 7-41, 42
42
Data Storage and Access
● Data storage and access involve strategic
business tools
– Data warehouse - dimensions
43
Data Storage and Access
● Strategic tools for data storage and
access
– Data Mining: works best when you have clear,
measurable goals
– Walmart’s example of data mining
44
Data Control
● File and database control must include
all measures necessary to ensure that
data storage is correct, complete, and
secure
● A well-designed DBMS must provide
built-in control and security features,
including subschemas, passwords,
encryption, audit trail files, and backup
and recovery procedures to maintain
data
45
Data Control
●
●
●
●
●
●
●
●
User ID
Password
Permissions
Encryption
Backup
Recovery procedures
Audit log files
Audit fields
46