Chapter 17 Designing Effective Input

Download Report

Transcript Chapter 17 Designing Effective Input

Major Topics
Databases
Normalization
Data warehouses
Data mining
Chapter 17 Designing Databases
17-1
Objectives for
Data Storage Design
The data must be available when the user
wants to use it
The data must have integrity
 It must be accurate and consistent
Efficient storage of data as well as efficient
updating and retrieval
The information retrieval be purposeful
Chapter 17 Designing Databases
17-2
Computer Based Data Storage
Approach:
Using individual files
Each file unique to a particular application
Using database
used in many different applications by many
uses
Chapter 17 Designing Databases
17-3
Objectives of Effective
Databases
 Ensuring that data can be shared among users for
a variety of applications
 Maintaining data that are both accurate and
consistent
 Ensuring all data required for current and future
applications will be readily available
 Allowing the database to evolve and the needs of
the users to grow
 Allowing users to construct their personal view of
the data without concern for the way the data are
physically stored
Chapter 17 Designing Databases
17-4
Metadata
 Metadata is the information that describes
data in the file or database
 Used to help users understand the form and
structure of the data
Chapter 17 Designing Databases
17-5
Entity-Relationship Concepts
Entities are objects or events for which
data is collected and stored
Relationships are associations between
entities
Chapter 17 Designing Databases
17-6
Entities
A distinct collection of data for one
person, place, thing, or event
 Entities become files of database tables
C usto m er
Chapter 17 Designing Databases
17-7
Associative Entity
Associative Entity - links two entities
An associative entity can only exist
between two entities
Associative entities become database
tables
O rd er
Item
Chapter 17 Designing Databases
17-8
Associative Entity Connections
Each entity end has a “one” connection
The associative entity has a “many”
connection on each side
Chapter 17 Designing Databases
17-9
Key fields for Associative Entity
The primary key for each “one” end is a
foreign key on the associative entity
Both foreign keys concatenated together
become the primary key
Chapter 17 Designing Databases
17-10
Attributive Entity (pg. 39)
Attributive Entity - describes attributes,
especially repeating elements
Attributive entities becomes database
tables
B oo k
S ub ject
Chapter 17 Designing Databases
17-11
Example
Chapter 17 Designing Databases
17-12
Types of Keys
Primary key, unique for the record
Secondary key, a key which may not be
unique
Concatenated key, a combination of two
or more data items for the key
Foreign key, a data item in one record
that is the key of another record
Chapter 17 Designing Databases
17-13
Types of Files
 Master file
 Have large records
 Contain all pertinent information about an entity
 Transaction file
 Are short records
 Contain information used to update master files
 Table file: contains data to calculate more data
 Work file: for quick record access
 Report file: various reports generated for printing
Chapter 17 Designing Databases
17-14
Databases Organization
A database is intended to be shared by
many users
Three structures for storing database
files:
Hierarchical database structures
Network database structures
Relational database structures
Chapter 17 Designing Databases
17-15
Normalization
 Normalization is the transformation of
complex user views and data to a set of
smaller, stable, and easily maintainable data
structures
 Normalization creates data that are stored
only once on a file
 The exception is key fields
 This eliminates redundant data storage
 It provides ideal data storage for database
systems
Chapter 17 Designing Databases
17-16
Three Steps of Data
Normalization
Remove all repeating groups and identify
the primary key
Ensure that all nonkey attributes are fully
dependent on the primary key
Remove any transitive dependencies,
attributes which are dependent on other
nonkey attributes
Chapter 17 Designing Databases
17-17
Normalization
U se r V ie w
U n n o rm a liz e d
R elatio n sh ip
R em o v e re p e a tin g g ro u p s
N o rm a liz ed
R elatio n s (1 N F )
R em o v e p artia l d e p en d e n cie s
S ec o n d N o rm al F o rm
R elatio n s (2 N F )
R em o v e tran sitiv e d e p e n d e n c ie s
T h ird N o rm a l F o rm
R elatio n s (1 N F )
Chapter 17 Designing Databases
17-18
First Normal Form (1NF)
Remove any repeating groups
All repeating groups are moved into a
new table
Foreign keys are used to link the tables
When a relation contains no repeating
groups, it is in the first normal form
Keys must be included to link the
relations, tables
Chapter 17 Designing Databases
17-19
Example: To 1NF
Chapter 17 Designing Databases
17-20
Second Normal Form (2NF)
Remove any partial dependencies
A partial dependency is when the data
are only dependent on a part of a key
field
A relation is created for the data that
are only dependent on part of the key
and another for data that are
dependent on both parts
Chapter 17 Designing Databases
17-21
Example: To 2NF
Chapter 17 Designing Databases
17-22
Third Normal Form (3NF)
Remove any transitive dependencies
A transitive dependency is when a
relation contains data that are not part
of the entity
The problem with transitive
dependencies is updating the data
A single data item may be present on
many records
Chapter 17 Designing Databases
17-23
Example: To 3NF
Chapter 17 Designing Databases
17-24
Example: ERD
Chapter 17 Designing Databases
17-25
Data Warehouses
Data warehouses are used to organize
information for quick and effective
queries
Chapter 17 Designing Databases
17-26
Data Warehouses and
Database
 In the data warehouse, data are organized
around major subjects
 Data in the warehouse are stored as
summarized rather than detailed raw data
 Data in the data warehouse cover a much
longer time frame than in a traditional
transaction-oriented database
 Data warehouses are organized for fast
queries
 Data warehouses are usually optimized for
answering complex queries, known as OLAP
Chapter 17 Designing Databases
17-27
Data Warehouses and
Database
 Data warehouses allow for easy access via
data-mining software called software
 Data warehouses include multiple databases
that have been processed so that data are
uniformly defined, containing what is referred
to as “clean” data
 Data warehouses usually contain data from
outside sources
Chapter 17 Designing Databases
17-28
Data Mining
Statistical analysis
Decision trees
Neural networks
Fuzzy logic
Clustering
Chapter 17 Designing Databases
17-29
Data Mining Patterns
Data mining patterns that decision
makers try to identify include
Associations, patterns that occur together
Sequences, patterns of actions that take
place over a period of time
Clustering, patterns that develop among
groups of people
Trends, the patterns that are noticed over
a period of time
Chapter 17 Designing Databases
17-30
Web Based Databases and
XML
Web-based databases are used for
sharing data
Extensible markup language (XML) is
used to define data used primarily for
business data exchange over the Web
An XML document contains only data and
the nature of the data
Chapter 17 Designing Databases
17-31