Chapter 9 Study Tool - Universiti Teknologi MARA

Download Report

Transcript Chapter 9 Study Tool - Universiti Teknologi MARA

CHAPTER 6
DATA DESIGN
Chapter Objectives




Explain data design concepts and structures
Describe file processing systems
Explain database systems and define the
components of a database management system
(DBMS)
Describe Web-based data design
2
Chapter Objectives




Explain data design terminology, including entities,
fields, common fields, records, files, tables, and key
fields
Describe data relationships, draw an entity
relationship diagram, define cardinality, and use
cardinality notation
Explain the concept of normalization
Explain the importance of codes and describe
various coding schemes
3
Chapter Objectives




Describe relational and object-oriented database
models
Explain data warehousing and data mining
Differentiate between logical and physical storage
and records
Explain data control measures
4
Introduction



You will develop a physical plan for data
organization, storage, and retrieval
Begins with a review of data design concepts and
terminology, then discusses file-based systems and
database systems, including Web-based
databases
Concludes with a discussion of data storage and
access, including strategic tools such as data
warehousing and data mining, physical design
issues, logical and physical records, data storage
formats, and data controls
5
Data Design Concepts

Before constructing an information system, a systems
analyst must understand basic design concepts,
including data structures and the characteristics of
file processing and database systems, including
Web-based database design
6
Data Design Concepts

Data Structures
 Each
file or table contains data about people, places,
things or events that interact with the information system
 File-oriented system
 File processing system
 Database system
7
Data Design Concepts

Overview of File Processing
 Some
companies still use file processing to handle large
volumes of structured data on a regular basis
 Although much less common today, file processing can
be efficient and cost-effective in certain situations
8
Data Design Concepts

Overview of File Processing
 Potential
problems
 Data
redundancy
 Data integrity
 Rigid data structure
9
Data Design Concepts

Overview of File Processing
 Various
types of files
 Master
file
 Table file
 Transaction file
 Work file
 Security file
 History file
10
Data Design Concepts

The Evolution from File Systems to Database
Systems
A
properly designed database system offers a solution
to the problems of file processing
 Provides an overall framework that avoids data
redundancy and supports a real-time, dynamic
environment
11
Data Design Concepts

The Evolution from File Systems to Database
Systems
A
database management system (DBMS) is a collection
of tools, features, and interfaces that enables users to
add, update, manage, access, and analyze the contents
of a database
 The main advantage of a DBMS is that it offers timely,
interactive, and flexible data access
12
Data Design Concepts

The Evolution from File Systems to Database
Systems
 Advantages
 Scalability
 Better
support for client/server systems
 Economy of scale
 Flexible data sharing
 Enterprise-wide application – database administrator (DBA)
 Stronger standards
13
Data Design Concepts

The Evolution from File Systems to Database
Systems
 Advantages
 Controlled
redundancy
 Better security
 Increased programmer productivity
 Data independence
14
Data Design Concepts

The Evolution from File Systems to Database
Systems
 Although
the trend is toward enterprise-wide database
design, many companies still use a combination of
centralized DBMSs and smaller, department-level
database systems
 The compromise, in many cases, is a client/server
design, where processing is shared among several
computers
15
DBMS Components


A DBMS provides an interface between a database
and users who need to access the data
In addition to interfaces for users, database
administrators, and related systems, a DBMS also
has a data manipulation language, a schema and
subschemas, and a physical data repository
16
DBMS Components

Interfaces for Users, Database Administrators, and
Related Systems
 Users
 Query
language
 Query by example (QBE)
 SQL (structured query language)
 Database
A
Administrators
DBA is responsible for DBMS management and support
17
DBMS Components

Interfaces for Users, Database Administrators, and
Related Systems
 Related
information systems
A
DBMS can support several related information systems
that provide input to, and require specific data from, the
DBMS
 No human intervention is required for two-way
communication
18
DBMS Components

Data Manipulation Language
A
data manipulation language (DML) controls database
operations, including storing, retrieving, updating, and
deleting data
 Some database products also provide an easy-to-use
graphical environment that enables users to control
operations with menu-driven commands.
19
DBMS Components

Schema
 The
complete definition of a database, including
descriptions of all fields, tables, and relationships, is
called a schema
 You also can define one or more subschemas
 For example, specific users, systems, or locations might
be permitted to create, retrieve, update, or delete
data, depending on their needs and the company’s
security policies
20
DBMS Components

Physical Data Repository
 The
data dictionary is transformed into a physical data
repository, which also contains the schema and
subschemas
 The physical repository might be centralized, or
distributed at several locations
 ODBC – open database connectivity
 JDBC – Java database connectivity
21
Web-Based Database Design

The following sections discuss the characteristics of
Web-based design, Internet terminology, connecting
a database to the Web, and data security on the
Web
22
Web-Based Database Design

Characteristics of Web-Based Design
 In
a Web-based design, the Internet serves as the
front end, or interface for the database management
system. Internet technology provides enormous power
and flexibility
 Web-based systems are popular because they offer
ease of access, cost-effectiveness, and worldwide
connectivity
23
Web-Based Database Design

Internet Terminology
 Web
browser
 Web page
 HTML (Hypertext Markup Language)
 Tags
 Web server
 Web site
24
Web-Based Database Design

Internet Terminology
 Intranet
 Extranet
 Protocols
 Web-centric
 Clients
 Servers
25
Web-Based Database Design

Connecting a Database to the Web
 Database
must be connected to the Internet or intranet
 The database and the Internet speak two different
languages
 Middleware
 Adobe
ColdFusion
26
Web-Based Database Design

Data Security
 Web-based
data must be secure, yet easily accessible
to authorized users
 To achieve this goal, well-designed systems provide
security at three levels: the database itself, the Web
server, and the telecommunication links that connect the
components of the system
27
Data Design Terminology

Definitions
 Entity
 Table
or file
 Field
 Attribute
 Common
field
 Record
 Tuple
28
Data Design Terminology

Key Fields
 Primary
key
 Combination
key
 Composite key
 Concatenated key
 Multi-valued key
29
Data Design Terminology

Key Fields
 Candidate
 Nonkey
key
field
 Foreign
key
 Secondary key
30
Data Design Terminology

Referential Integrity
 Validity
checks can help avoid data input errors
 In a relational database, referential integrity means
that a foreign key value cannot be entered in one table
unless it matches an existing primary key in another
table
 Orphan
31
Entity-Relationship Diagrams



An entity is a person, place, thing, or event for
which data is collected and maintained
Entity-relationship diagram (ERD)
An ERD provides an overall view of the system, and
a blueprint for creating the physical data structures
32
Entity-Relationship Diagrams

Drawing an ERD
 The
first step is to list the entities that you identified
during the fact-finding process and to consider the
nature of the relationships that link them
 A popular method is to represent entities as rectangles
and relationships as diamond shapes
33
Entity-Relationship Diagrams

Types of Relationships
 Three
types of relationships can exist between entities
 One-to-one relationship (1:1)
 One-to-many relationship (1:M)
 Many-to-many relationship (M:N)
 Associative
entity
34
Entity-Relationship Diagrams

Cardinality
 Cardinality
notation
 Crow’s foot notation
 Unified Modeling Language (UML)
 Now that you understand database elements and their
relationships, you can start designing tables
35
Normalization




Normalization
Table design
Involves four stages: unnormalized design, first
normal form, second normal form, and third normal
form
Most business-related databases must be designed
in third normal form
36
Normalization

Standard Notation Format
 Designing
tables is easier if you use a standard
notation format to show a table’s structure, fields, and
primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
37
Normalization

Repeating Groups and Unnormalized Design
 Repeating
 Often
group
occur in manual documents prepared by users
 Unnormalized
 Enclose
the repeating group of fields within a second
set of parentheses
38
Normalization

First Normal Form
A
table is in first normal form (1NF) if it does not
contain a repeating group
 To convert, you must expand the table’s primary key to
include the primary key of the repeating group
39
Normalization

Second Normal Form
 To
understand second normal form (2NF), you must
understand the concept of functional dependence
 Field X is functionally dependent on field Y if the value
of field X depends on the value of field Y
40
Normalization

Second Normal Form

A standard process exists for converting a table
from 1NF to 2NF
1.
2.
3.
First, create and name a separate table for each field
in the existing primary key
Next, create a new table for each possible
combination of the original primary key fields
Finally, study the three tables and place each field
with its appropriate primary key
41
Normalization

Second Normal Form
 Four
kinds of problems are found with 1NF description
that do not exist with 2NF
 Consider
the work necessary to change a particular
product’s description
 1NF tables can contain inconsistent data
 Adding a new product is a problem
 Deleting a product is a problem
42
Normalization

Third Normal Form
 3NF
design avoids redundancy and data integrity
problems that still can exist in 2NF designs
 A table design is in third normal form (3NF) if it is in
2NF and if no nonkey field is dependent on another
nonkey field
43
Normalization

Third Normal Form
 To
convert the table to 3NF, you must remove all fields
from the 2NF table that depend on another nonkey
field and place them in a new table that uses the
nonkey field as a primary key
44
Normalization

A Normalization Example
 To
show the normalization process, consider the
familiar situation in Figure 9-27 which might depict
several entities in a school advising system: ADVISOR,
COURSE, and STUDENT
45
Using Codes During System Design

Overview of Codes
A
code is a set of letters or numbers that represents a
data item. Codes can be used to simplify output, input,
and data formats.
 Because codes often are used to represent data, you
encounter them constantly in your everyday life
 They save storage space and costs, reduce data
transmission time, and decrease data entry time
 Can reduce data input errors
46
Using Codes During System Design
Types of Codes
[1] Sequence codes


Numbers/letters assigned in a specific order
E.g. UiTM student matric number


[2] Block sequence codes

Block sequence codes use blocks of numbers for different
classifications.
E.g. course codes


1xx – 3xx : diploma courses
 4xx – 6xx : bachelor courses
 7xx – 8xx : master courses

47
Using Codes During System Design
Types of Codes
[3] Alphabetic codes






[a] Category codes – identify related items using
numbers/letters
E.g. ITS – system science courses, ITT – data comm. &
networking courses, ITC – comp. science courses, ACC
– accounting courses
[b] Abbreviation codes – mnemonic
codes/abbreviations
E.g MY = Malaysia, SG = Singapore, NZ = New
Zealand
48
Using Codes During System Design
Types of codes
[4] Significant digit codes - Distinguish items by using
a series of subgroups of digits



[5] Derivation codes - Combine data from different
item attributes/characteristics to build the code




E.g. Classroom number in UiTM Pahang: J1-01, J1-02, A101, A2-01
E.g. Magazine release num: 201109.Vol.18 (Vol. 18,
Released on September 2011)
Cipher codes
Action codes
49
Using Codes During System Design
Types of codes
[6] Cipher codes: use a keyword to encode a
number




E.g. E.g. IASETTO = 1453770
[7] Action codes - Indicate action to be executed
associated with item
 E.g.
X – exit program, F – File menu
50
Using Codes During System Design

Developing a Code
1.
2.
3.
4.
5.
Keep codes concise
Allow for expansion
Keep codes stable
Make codes unique
Use sortable codes
51
Using Codes During System Design

Developing a Code
6.
7.
8.
9.
Avoid confusing codes
Make codes meaningful
Use a code for a single purpose
Keep codes consistent
52
Steps in Database Design
1.
2.
3.
4.
Create the initial ERD
Assign all data elements to entities
Create 3NF designs for all tables
Verify all data dictionary entries

After creating your final ERD and normalized table
designs, you can transform them into a database
53
Database Models

Relational Databases
 The
relational model was introduced during the 1970s
and became popular because it was flexible and
powerful
 Because all the tables are linked, a user can request
data that meets specific conditions
 New entities and attributes can be added at any time
without restructuring the entire database
54
Database Models

Object-Oriented Databases
 Many
systems developers are using object-oriented
database (OODB) design as a natural extension of
the object-oriented analysis process
 Object
Management Group (OMG)
 Each object has a unique object identifier
55
Data Storage and Access


Data storage and access involve strategic business
tools
Strategic tools for data storage and access
 Data
warehouse – dimensions
 Data mart
56
Data Storage and Access

Strategic tools for data storage and access
 Data
Mining
 Increase
average pages viewed per session.
 Increase number of referred customers
 Reduce clicks to close
 Increase checkouts per visit
 Increase average profit per checkout
 Clickstream storage – market basket analysis
57
Data Storage and Access

Logical and Physical Storage
 Logical
storage
 Characters
 Date
element or data item
 Logical record
 Physical
storage
 Physical
record or block
 Buffer
 Blocking
factor
58
Data Storage and Access

Data Coding and Storage
 Binary
digits
 Bit
 Byte
 EBCDIC,
 EBCDIC
ASCII, and Binary
and ASCII
59
Data Storage and Access

Data Coding and Storage
 EBCDIC,
ASCII, and Binary
 Binary
storage format
 Integer format
 Long integer format
 Unicode
60
Data Storage and Access

Data Coding and Storage
 Storing
dates
 Y2K
Issue
 Most date formats now are based on the model
established by the International Organization for
Standardization (ISO)
 Absolute date
61
Data Control


File and database control must include all
measures necessary to ensure that data storage is
correct, complete, and secure
A well-designed DBMS must provide built-in
control and security features, including
subschemas, passwords, encryption, audit trail
files, and backup and recovery procedures to
maintain data
62
Data Control








User ID
Password
Permissions
Encryption
Backup
Recovery procedures
Audit log files
Audit fields
63
Chapter Summary


Files and tables contain data about people, places,
things, or events that affect the information system
DBMS designs are more powerful and flexible than
traditional file-oriented systems
64
Chapter Summary



An entity-relationship (ERD) is a graphic
representation of all system entities and the
relationships among them
A code is a set of letters or numbers used to
represent data in a system
The most common database models are relational
and object-oriented
65
Chapter Summary



Logical storage is information seen through a
user’s eyes, regardless of how or where that
information actually is organized or stored
Physical storage is hardware-related and involves
reading and writing blocks of binary data to
physical media
File and database control measures include
limiting access to the data, data encryption,
backup/recovery procedures, audit-trail files, and
internal audit fields
66