Transcript Chapter 7

Systems Analysis & Design
Sixth Edition
Chapter 7
Chapter Objectives
● Explain data design concepts and data
structures
● Describe file processing systems and
various types of files
● Understand database systems and
define the components of a database
management system (DBMS)
● Describe Web-based data design
2
Chapter Objectives
● Explain data design terminology,
including entities, fields, common fields,
records, files, tables, and key fields
● Describe data relationships, draw an
entity-relationship diagram, define
cardinality, and use cardinality notation
● Explain the concept of normalization
● Explain the importance of codes and
describe various coding schemes
3
Chapter Objectives
● Describe relational and object-oriented
database models
● Explain data warehousing and data
mining
● Differentiate between logical and
physical storage and records
● Explain data control measures
4
Introduction
● You will develop a physical plan for data
organization, storage, and retrieval
● Begins with a review of data design concepts
and terminology, then discusses file-based
systems and database systems, including Webbased databases
● Concludes with a discussion of data storage
and access, including strategic tools such as
data warehousing and data mining, physical
design issues, logical and physical records,
data storage formats, and data control
5
Data Design Concepts
● Data Structures
– A file or table contains data about people,
places, things, or events that interact with the
system
– File-oriented system
– File processing system
– Database system
6
Data Design Concepts
● Overview of File Processing
– Potential problems
• Data redundancy
• Data integrity
• Rigid data structure
7
Data Design Concepts
● Overview of File Processing
– Uses various types of files
•
•
•
•
•
•
Master file
Table file
Transaction file
Work file – scratch file
Security file
History file
8
Data Design Concepts
● Overview of Database Systems
– A properly design database system offers a
solution to the problems of file processing
– Provides an overall framework that avoids data
redundancy and supports a real-time, dynamic
environment
– Database management system (DBMS)
– The main advantage of a DBMS is that it offers
timely, interactive, and flexible data access
9
Data Design Concepts
● Overview of Database Systems
– Advantages
•
•
•
•
•
Scalability
Better support for client/server systems
Economy of scale
Flexible data sharing
Enterprise-wide application – database administrator
(DBA)
• Stronger standards
10
Data Design Concepts
● Overview of Database Systems
– Advantages
•
•
•
•
Controlled redundancy
Better security
Increased programmer productivity
Data independence
11
Data Design Concepts
● Database Tradeoffs
– Because DBMSs are powerful, they require
more expensive hardware, software, and data
networks capable of supporting a multi-user
environment
– More complex than a file processing system
– Procedures for security, backup, and recovery
are more complicated and critical
12
DBMS Components
● Interfaces for Users, Database
Administrators, and Related Systems
– Users
• Query language
• Query by example (QBE)
• SQL (structured query language)
– Database Administrators
• A DBA is responsible for DBMS management and
support
13
DBMS Components
● Interfaces for Users, Database
Administrators, and Related Systems
– Related information systems
• A DBMS can support several related information
systems that provide input to, and require specific data
from, the DBMS
14
DBMS Components
● Data Manipulation Language
– A data manipulation language (DML) controls
database operations, including storing,
retrieving, updating, and deleting data
● Schema
– The complete definition of a database, including
descriptions of all fields, tables, and
relationships, is called a schema
– You also can define one or more subschemas
15
DBMS Components
● Physical Data Repository
– The data dictionary is transformed into a
physical data repository, which also contains the
schema and subschemas
– The physical repository might be centralized, or
distributed at several locations
– ODBC – open database connectivity
– JDBC – Java database connectivity
16
Web-Based Database Design
● Characteristics of Web-Based Design
– In a Web-based design, the Internet serves
as the front end, or interface, for the database
management system
– Internet technology provides enormous power
and flexibility
– Web-based systems are popular because
they offer ease of access, cost-effectiveness,
and worldwide connectivity
17
Web-Based Database Design
● Internet Terminology
–
–
–
–
–
–
–
Web browser
Web page
HTML – Hypertext Markup Language
Tags
Web server
Web site
Intranet
18
Web-Based Database Design
● Internet Terminology
–
–
–
–
–
Extranet
Protocols
Web-centric
Clients
Servers
19
Web-Based Database Design
● Connecting a Database to the Web
– Database must be connected to the Internet or
intranet
• Middleware
• Macromedia’s ColdFusion
20
Web-Based Database Design
● Data Security
– Web-based data must be totally secure, yet
easily accessible to authorized users
– To achieve this goal, well-designed systems
provide security at three levels: the database
itself, the Web server, and the
telecommunication links that connect the
components of the system
21
Data Design Terminology
● Definitions
– Entity
– Table or file
– Field
• Attribute
• Common field
– Record
• Tuple
22
Data Design Terminology
● Key Fields
– Primary key
•
•
•
•
Combination key
Composite key
Concatenated key
Multi-valued key
– Candidate key
• Nonkey field
– Foreign key
– Secondary key
23
Data Design Terminology
● Referential Integrity
– Validity checks can help avoid data input errors
24
Entity-Relationship Diagrams
● An entity is a person, place, thing, or
event for which data is collected and
maintained
● Provides an overall view of the system,
and a blueprint for creating the physical
data structures
● Entity-relationship diagram
25
Entity-Relationship Diagrams
● Drawing an ERD
– The first step is to list
the entities that you
identified during the
fact-finding process
and to consider the
nature of the
relationships that link
them
26
Entity-Relationship Diagrams
● Types of Relationships
– One-to-one relationship (1:1)
– One-to-many relationship (1:M)
– Many-to-many relationship (M:N)
• Associative entity
● Cardinality
• Cardinality notation
• Crow’s foot notation
• Unified Modeling Language (UML)
27
Normalization
● Table design
● Involves four stages: unnormalized
design, first normal form, second normal
form, and third normal form
● Most business-related databases must
be designed in third normal form
28
Normalization
● Standard Notation Format
– Designing tables is easier if you use a
standard notation format to show a table’s
structure, fields, and primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
29
Normalization
● Repeating Groups and Unnormalized
Designs
– Repeating group
• Often occur in manual documents prepared by users
– Unnormalized design
30
Normalization
● First Normal Form
– A table is in first normal form (1NF) if it does not
contain a repeating group
– To convert, you must expand the table’s primary
key to include the primary key of the repeating
group
● Second Normal Form
– To understand second normal form (2NF), you
must understand the concept of functional
dependence
– Functionally dependent
31
Normalization
● Second Normal Form
–
A standard process exists for converting a
table from 1NF to 2NF
1. Create and name a separate table for each field in
the existing primary key
2. Create a new table for each possible combination of
the original primary key fields
3. Study the three tables and place each field with its
appropriate primary key
32
Normalization
● Second Normal Form
– Four kinds of problems are found with 1NF
designs that do not exist in 2NF
• Consider the work necessary to change a particular
product’s description
• 1NF tables can contain inconsistent data
• Adding a new product is a problem
• Deleting a product is a problem
33
Normalization
● Third Normal Form
– 3NF design avoids redundancy and data
integrity problems that still can exist in 2NF
designs
– A table design is in third normal form (3NF) if it is
in 2NF and if no nonkey field is dependent on
another nonkey field
– To convert the table to 3NF, you must remove all
fields from the 2NF table that depend on another
nonkey field and place them in a new table that
uses the nonkey field as a primary key
34
Normalization
● A Normalization
Example
– To show the
normalization process,
consider the familiar
situation, which depicts
several entities in a
school advising system:
ADVISOR, COURSE,
and STUDENT
35
Using Codes During Data Design
● Overview of Codes
– Because codes often are used to represent
data, you encounter them constantly in your
everyday life
– They save storage space and costs, reduce
transmission time, and decrease data entry time
– Can reduce data input errors
36
Using Codes During Data Design
● Types of Codes
1. Sequence codes
2. Block sequence codes
3. Alphabetic codes
a. Category codes
b. Abbreviation codes – mnemonic codes
4.
5.
6.
7.
Significant digit codes
Derivation codes
Cipher codes
Action codes
37
Using Codes During Data Design
● Developing a Code
1.
2.
3.
4.
5.
6.
7.
8.
9.
Keep codes concise
Allow for expansion
Keep codes stable
Make codes unique
Use sortable codes
Avoid confusing codes
Make codes meaningful
Use a code for a single purpose
Keep codes consistent
38
Steps in Database Design
1. Create the initial ERD
2. Assign all data elements to entities
3. Create 3NF designs for all tables, taking
care to identify all primary, secondary,
and foreign keys
4. Verify all data dictionary entries
5. After creating your final ERD and
normalized table designs, you can
transform them into a database
39
Database Models
● Relational Databases
– The relational model was introduced during the
1970s and became popular because it was
flexible and powerful
– Because all the tables are linked, a user can
request data that meets specific conditions
– New entities and attributes can be added at any
time without restructuring the entire database
40
Database Models
● Object-Oriented Databases
– Many systems developers are using objectoriented database (OODB) design as a natural
extension of the object-oriented analysis process
• Object Data standard
• Object Database Management Group (ODMG)
• Each object has a unique object identifier
41
Data Storage and Access
● Data storage and access involve strategic
business tools
● Strategic tools for data storage and access
– Data warehouse - dimensions
42
Data Storage and Access
● Strategic tools for data storage and
access
– Data Mining
•
•
•
•
•
Increase average pages viewed per session
Increase number of referred customers
Reduce clicks to close
Increase checkouts per visit
Increase average profit per checkout
43
Data Storage and Access
● Logical and Physical Storage
– Logical storage
• Characters
• Data element or data item
• Logical record
– Physical storage
• Physical record or block
• Buffer
• Blocking factor
44
Data Storage and Access
● Data Storage Formats
–
–
–
–
–
Binary digits
Bit
Byte
EBCDIC and ASCII
Unicode
45
Data Storage and Access
● Data Storage Formats
– Binary
•
•
•
•
Binary storage format
Integer format
Long integer format
Other binary formats exist for efficient storage of
exceedingly long numbers
46
Data Storage and Access
● Selecting a Data Storage Format
– In many cases, a user can select a specific
data storage format
– For example, when using Microsoft Office,
you can store documents, spreadsheets, and
databases in Unicode-compatible form by
using the font called Arial Unicode MS
– Best answer is it depends on the situation
47
Data Storage and Access
● Date Fields
– Most date formats now are based on the
model established by the International
Organization for Standardization (ISO)
– Can be sorted easily and used in
comparisons
– Absolute date
– Best method depends on how the specific
date will be printed, displayed, or used in a
calculation
48
Data Control
● File and database control must include
all measures necessary to ensure that
data storage is correct, complete, and
secure
● A well-designed DBMS must provide
built-in control and security features,
including subschemas, passwords,
encryption, audit trail files, and backup
and recovery procedures to maintain
data
49
Data Control
●
●
●
●
●
●
●
●
User ID
Password
Permissions
Encryption
Backup
Recovery procedures
Audit log files
Audit fields
50
Chapter Summary
● Files and tables contain data about
people, places, things, or events that
affect the information system
● DBMS designs are more powerful and
flexible than traditional file-oriented
systems
51
Chapter Summary
● Data design tasks include creating an
initial ERD; assigning data elements to
an entity; normalizing all table designs;
and completing the data dictionary
entries for files, records, and data
elements
● Logical storage is information seen
through a user’s eyes, regardless of
how or where that information actually
is organized or stored
52
Chapter Summary
● Physical storage is hardware-related and
involves reading and writing blocks of
binary data to physical media
● File and database control measures
include limiting access to the data, data
encryption, backup/recovery procedures,
audit-trail files, and internal audit fields
● Chapter 7 complete
53