Transcript Chapter 6

Phase 3. Systems Design
Chapter 6 – Data Design
Phase Description
• Systems Design is the third of five phases
in the systems development life cycle
(SDLC)
• Now you are ready to begin the physical
design of the system that will meet the
specifications described in the system
requirements document
• Systems design tasks include data design,
user interface design, and system
architecture
Data Design Concepts
• Before constructing an information system,
a systems analyst must understand basic
design concepts, including data structures
and the characteristics of file-oriented and
database systems, including Web-based
database design
Data Design Concepts
• Data Structures
• A framework for organizing and storing
data => consists of one/more files/tables
– A file or table contains data about people,
places or events that interact with the system
– File-oriented system
• File processing
– Database system
Figure 6-2
Figure 6-3
Data Design Concepts
• File processing
• Database system
Data Design Concepts
• Overview of File Processing
– Although less common today, file processing
can be more efficient and cost less than a
DBMS in certain situations – E.g. large
volumes of structured data such as process
records in a sequence
Figure 6-4
Data Design Concepts
• Overview of File Processing
– Potential problems
• Data redundancy
• Data integrity
• Rigid data structure
Data Design Concepts
• Overview of File Processing
– Uses various types of files
• Master file – relatively permanent data about an
entity
• Table file – reference data (e.g. tax table, rate
tables)
• Transaction file – day-to-day business records.
Input file that updates master file.
• Work file/scratch file – temporary file for a single
task
• Security file – for backup and recovery purposes
• History file – for historical or archiving purposes
Data Design Concepts
• Overview of Database Systems
– A properly design database system offers a
solution to the problems of file processing
A database might serve
5 separate systems in a
database environment
Figure 6-5
Data Design Concepts
• Overview of Database Systems
– A database management system (DBMS) is a
collection of tools, features, and interfaces
that enables users to add, update, manage,
access, and analyze the contents of a
database
– The main advantage of a DBMS is that it
offers timely, interactive, and flexible data
access
Data Design Concepts
• Overview of Database Systems
– Advantages
• Scalability – expandable/modifiable/downsizable
• Better support for client/server systems
• Economy of scale – efficiency of high-volume
processing on larger computers
• Flexible data sharing
• Enterprise-wide application – database
administrator (DBA)
Data Design Concepts
• Overview of Database Systems
– Advantages
• Stronger standards – for data names, formats,
documentation, etc
• Controlled redundancy
• Better security – define authorization procedures
• Increased programmer productivity
• Data independence
DBMS Components
Figure 6-6
• Interface
• DML – controls DB
operations (storing,
retrieving, updating,
deleting)
• Schema – Defines
the DB (fields,
tables,
relationships)
• Physical Data
Repository
Web-Based Database Design
• Characteristics of Web-Based Design
– In a Web-based design, the Internet serves
as the front end, or interface for the
database management system. Internet
technology provides enormous power and
flexibility
– Web-based systems are popular because
they offer ease of access, costeffectiveness, and worldwide connectivity
Figure 6-8
Web-Based Database Design
• Connecting a Database to the Web
– Database must be connected to the
Internet or intranet
• Middleware
• Macromedia’s ColdFusion
Figure 6-9
Data Design Terminology
• Definitions
– Entity
– Table or file
– Field
• Attribute
• Common field
– Record
• Tuple
Figure 6-11
Data Design Terminology
• Key Fields
– Primary key – unique field or combination of
more than one field (Combination key,
Composite key, Concatenated key, Multivalued key)
– Candidate key – any field(s) that could be
primary key
• Nonkey field – not primary or candidate
– Foreign key – a field in one table that matches
a primary key.
– Secondary key – field that can be used to
access/retrieve records. Not unique
Data Design Terminology
• Referential Integrity
– Validity checks can help avoid data input
errors
– In a relational database, referential integrity
means that a foreign key value cannot be
entered in one table unless it matches an
existing primary key in another table
Figure 6-13
Data Relationships
• A relationship is a logical link between
entities based on how they interact
• Entity-Relationship Diagrams
– One-to-one relationship (1:1)
– One-to-many relationship (1:M)
– Many-to-many relationship (M:N)
Data Relationships
• Entity-Relationship Diagrams
– Cardinality
• Cardinality notation
• Crow’s foot notation
• Unified Modeling Language (UML)
Figure 6-14
Figure 6-15
Figure 6-16
Figure 6-17
Normalization
• Normalization
• Table design
• Involves four stages: unnormalized design,
first normal form, second normal form, and
third normal form
• Most business-related databases must be
designed in third normal form
Normalization
• Standard Notation Format
– Designing tables is easier if you use a
standard notation format to show a table’s
structure, fields, and primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
Normalization
• Repeating Groups and Unnormalized
Design
– Repeating group
• Often occur in manual documents prepared by
users
– Unnormalized design
Normalization
• Repeating Groups
and Unnormalized
Design
Normalization
• First Normal Form
– A table is in first normal form (1NF) if it does
not contain a repeating group
– To convert, you must expand the table’s
primary key to include the primary key of the
repeating group
Normalization
• Second Normal Form
– To understand second normal form (2NF), you
must understand the concept of functional
dependence
– Field X is functionally dependent on field Y if
the value of field X depends on the value of
field Y
– Note: 1NF with a single field primary key is
automatically in 2NF
Normalization
• Second Normal Form
– A standard process exists for converting a
table from 1NF to 2NF
1. Create and name a separate table for each field
in the existing primary key
2. Create a new table for each possible
combination of the original primary key fields
3. Study the three tables and place each field with
its appropriate primary key
Figure 6-21
Normalization
•
Second Normal
Form
Figure 6-21
Normalization
• Third Normal Form
– 3NF design avoids redundancy and data
integrity problems that still can exist in 2NF
designs
– A table design is in third normal form (3NF) if
it is in 2NF and if no nonkey field is
dependent on another nonkey field
Figure 6-22
Normalization
• Third Normal Form
– To convert the table to 3NF, you must
remove all fields from the 2NF table that
depend on another nonkey field and place
them in a new table that uses the nonkey
field as a primary key
Figure 6-23
Figure 6-24
Figure 6-25
Steps in Database Design
1. Create the initial ERD
2. Assign all data elements to entities
3. Create 3NF designs for all tables, taking
care to identify all primary, secondary,
and foreign keys
4. Verify all data dictionary entries
• After creating your final ERD and
normalized table designs, you can
transform them into a database
Database Models
• Hierarchical and Network Databases
– In a hierarchical database, data is organized
like a family tree or organization chart, with
branches representing parent records and
child records
– A network database resembles a hierarchical
design, but provides somewhat more
flexibility
Database Models
• Relational Databases
– The relational model was introduced during
the 1970s and became popular because it
was flexible and powerful
– Because all the tables are linked, a user can
request data that meets specific conditions
– New entities and attributes can be added at
any time without restructuring the entire
database
Figure 6-34
Figure 6-35
Database Models
• Object-Oriented Databases
– Many systems developers are using
object-oriented database (OODB) design
as a natural extension of the objectoriented analysis process
• Object data standard
• Object Database Management Group (ODMG)
• Each object has a unique object identifier
Figure 6-36
Figure 6-37
Data Storage
• Data Warehousing
– Data warehouse - dimensions
– Without a data warehouse it would be
difficult for a user to extract data that spans
several information systems and time
frames
– Allows users to retrieve and analyze the
data easily
Figure 6-38
Data Storage
• Data Mining
– Works best when you have clear,
measurable goals
•
•
•
•
•
Increase average pages viewed per session.
Increase number of referred customers
Reduce clicks to close
Increase checkouts per visit
Increase average profit per checkout
Figure 6-39
Data Storage
• Logical and Physical Storage
– Logical storage
•
•
•
•
As seen through a user’s eyes
Characters
Date element or data item
Logical record
– Physical storage
• Hardware-related
• Physical record or block
• Blocking factor
Figure 6-40
Data Storage
• Data Storage Formats
–
–
–
–
–
Binary digits
Bit
Byte
EBCDIC and ASCII
Unicode
Figure 6-41
Data Storage
• Data Storage Formats
– Binary
•
•
•
•
Binary storage format
Integer format
Long integer format
Other binary formats exist for efficient storage of
exceedingly long numbers
Data Storage
• Selecting a Data Storage Format
– In many cases, a user can select a
specific data storage format
– For example, when using Microsoft Office,
you can store documents, spreadsheets,
and databases in Unicode-compatible
form by using the font called Arial Unicode
MS
– Best answer is it depends on the situation
Data Storage
• Date Fields
– Most date formats now are based on the
model established by the International
Organization for Standardization (ISO)
– Can be sorted easily and used in
comparisons
– Absolute date
– Best method depends on how the specific
date will be printed, displayed or used in a
calculation
Figure 6-42
Data Control
• File and database control must include
all measures necessary to ensure that
data storage is correct, complete, and
secure
• A well-designed DBMS must provide
built-in control and security features,
including subschemas, passwords,
encryption, audit trail files, and backup
Figure 6-43
and recovery procedures to maintain
data
Data Control
•
•
•
•
•
•
User ID
Password
Backup
Recovery procedures
Audit log files
Audit fields