Models - Courses
Download
Report
Transcript Models - Courses
Lecture 11: Intro to Database Design
SIMS 202:
Information Organization
and Retrieval
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm
Fall 2004
http://www.sims.berkeley.edu/academics/courses/is202/f04/
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 1
Lecture Overview
• Review
– Evaluation exercise
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 2
Lecture Overview
• Review
– Evaluation exercise
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 3
What is a Database?
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 4
Files and Databases
• File: A collection of records or documents
dealing with one organization, person,
area or subject (Rowley)
– Manual (paper) files
– Computer files
• Database: A collection of similar records
with relationships between the records
(Rowley)
– Bibliographic, statistical, business data,
images, etc.
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 5
Database
• A Database is a collection of stored
operational data used by the application
systems of some particular enterprise
(C.J. Date)
– Paper “Databases”
• Still contain a large portion of the world’s
knowledge
– File-Based Data Processing Systems
• Early batch processing of (primarily) business data
– Database Management Systems (DBMS)
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 6
Why DBMS?
• History
– 50’s and 60’s all applications were custom
built for particular needs
– File based
– Many similar/duplicative applications dealing
with collections of business data
– Early DBMS were extensions of programming
languages
– 1970 - E.F. Codd and the Relational Model
– 1979 - Ashton-Tate and first Microcomputer
DBMS
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 7
File Based Systems
Application
Delivery
List
Coal
Estimation
Just what
asked for
IS 202 – FALL 2004
File
Toys
Addresses
Naughty
Nice Toys
Callie’s Birthday 2004-10-05 - SLIDE 8
From File Systems to DBMS
• Problems with file processing systems
– Inconsistent data
– Inflexibility
– Limited data sharing
– Poor enforcement of standards
– Excessive program maintenance
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 9
DBMS Benefits
•
•
•
•
•
•
•
•
•
Minimal data redundancy
Consistency of data
Integration of data
Sharing of data
Ease of application development
Uniform security, privacy, and integrity controls
Data accessibility and responsiveness
Data independence
Reduced program maintenance
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 10
Terms and Concepts
• Data independence
– Physical representation and location of data
and the use of that data are separated
• The application doesn’t need to know how or
where the database has stored the data, but just
how to ask for it
• Moving a database from one DBMS to another
should not have a material effect on application
program
• Recoding, adding fields, etc. in the database
should not affect applications
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 11
Database Environment
CASE
Tools
Repository
IS 202 – FALL 2004
User
Interface
DBMS
Application
Programs
Database
Callie’s Birthday 2004-10-05 - SLIDE 12
Database Components
DBMS
===============
Design tools
Database
Database contains:
User’s Data
Metadata
Indexes
Application Metadata
IS 202 – FALL 2004
Table Creation
Form Creation
Query Creation
Report Creation
Procedural
language
compiler (4GL)
=============
Run time
Form processor
Query processor
Report Writer
Language Run time
Application
Programs
User
Interface
Applications
Callie’s Birthday 2004-10-05 - SLIDE 13
Types of Database Systems
•
•
•
•
•
PC databases
Centralized database
Client/server databases
Distributed databases
Database models
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 14
PC Databases
E.g.:
Access
FoxPro
Dbase
Etc.
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 15
Centralized Databases
Central
Computer
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 16
Client Server Databases
Client
Client
Network
Database
Server
Client
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 17
Distributed Databases
Location C
Location B
computer
computer
computer
Homogeneous
Databases
Location A
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 18
Distributed Databases
Client
Heterogeneous
Or Federated
Databases
Database
Server
Remote
Comp.
Local Network
Comm
Server
Client
IS 202 – FALL 2004
Remote
Comp.
Callie’s Birthday 2004-10-05 - SLIDE 19
Terms and Concepts
• A “database application” is an application
program (or set of related programs) that is used
to perform a series of database activities:
– Create
• Add new data to the database
– Read
• Read current data from the database
– Update
• Update or modify current database data
– Delete
• Remove current On behalf of database users
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 20
Terms and Concepts
• Enterprise
– Organization
• Entity
– Person, Place, Thing, Event, Concept...
• Attributes
– Data elements (facts) about some entity
– Also sometimes called fields or items or domains
• Data values
– Instances of a particular attribute for a particular
entity
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 21
Terms and Concepts
• Key
– An attribute or set of attributes used to identify
or locate records in a file
• Primary Key
– An attribute or set of attributes that uniquely
identifies each record in a file
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 22
Terms and Concepts
• Models
– (1) Levels or views of the Database
• Conceptual, logical, physical
– (2) DBMS types
• Relational, Hierarchic, Network, Object-Oriented,
Object-Relational
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 23
Models (1)
Application 1
External
Model
Application 2
Application 3
Application 4
External
Model
External
Model
External
Model
Application 1
Conceptual
requirements
Application 2
Conceptual
requirements
Application 3
Conceptual
requirements
Conceptual
Model
Logical
Model
Internal
Model
Application 4
Conceptual
requirements
More later on this…
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 24
Data Models(2): History
• Hierarchical Model (1960’s and 1970’s)
– Similar to data structures in programming
languages
Books
(id, title)
Authors
(first, last)
IS 202 – FALL 2004
Publisher
Subjects
Callie’s Birthday 2004-10-05 - SLIDE 25
Data Models(2): History
• Network Model (1970’s)
– Provides for single entries of data and
navigational “links” through chains of data.
Authors
Subjects
Books
Publishers
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 26
Data Models(2): History
• Relational Model (1980’s)
– Provides a conceptually simple model for data
as relations (typically considered “tables”) with
all data visible
pubid
Book ID
1
2
3
4
5
Title
pubid
Introductio
The history
New stuff ab
Another title
And yet more
IS 202 – FALL 2004
2
4
3
2
1
Author id
1
2
3
4
5
1
2
3
4
Book ID
pubname
Harper
Addison
Oxford
Que
Authorid
1
2
3
4
5
Author name
Smith
Wynar
Jones
Duncan
Applegate
Subid
1
2
3
4
4
2
1
3
2
3
Subid
Subject
1 cataloging
2 history
3 stuff
Callie’s Birthday 2004-10-05 - SLIDE 27
Data Models(2): History
• Object Oriented Data Model (1990’s)
– Encapsulates data and operations as
“Objects”
Books
(id, title)
Authors
(first, last)
IS 202 – FALL 2004
Publisher
Subjects
Callie’s Birthday 2004-10-05 - SLIDE 28
Data Models(2): History
• Object-Relational Model (1990’s)
– Combines the well-known properties of the
Relational Model with such OO features as:
• User-defined datatypes
• User-defined functions
• Inheritance and sub-classing
• All of the major enterprise DBMS systems
are now Object-Relational or incorporate
Object-Relational features
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 29
Lecture Overview
• Review
– MediaStreams
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 30
Database System Life Cycle
Physical
Creation
2
Conversion
3
Design
1
Growth,
Change, &
Maintenance
6
Integration
4
Operations
5
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 31
Design
• Determination of the needs of the
organization
• Development of the Conceptual Model of
the database
– Typically using Entity-Relationship
diagramming techniques
• Construction of a Data Dictionary
• Development of the Logical Model
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 32
Physical Creation
• Development of the Physical Model of the
Database
– Data formats and types
– Determination of indexes, etc.
• Load a prototype database and test
• Determine and implement security, privacy
and access controls
• Determine and implement integrity
constraints
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 33
Conversion
• Convert existing data sets and
applications to use the new database
– May need programs, conversion utilities to
convert old data to new formats
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 34
Integration
• Overlaps with Phase 3
• Integration of converted applications and
new applications into the new database
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 35
Operations
• All applications run full-scale
• Privacy, security, access control must be in
place
• Recovery and Backup procedures must be
established and used
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 36
Growth, Change, and Maintenance
• Change is a way of life
– Applications, data requirements, reports, etc.
will all change as new needs and
requirements are found
– The Database and applications and will need
to be modified to meet the needs of changes
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 37
Another View of the Life Cycle
Integration
4
Operations
5
Design
Physical
1
Creation Conversion Growth,
2
Change
3
6
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 38
Lecture Overview
• Review
– MediaStreams
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 39
Database Design Process
Application 1
External
Model
Application 2
Application 3
Application 4
External
Model
External
Model
External
Model
Application 1
Conceptual
requirements
Application 2
Conceptual
requirements
Application 3
Conceptual
requirements
Conceptual
Model
Logical
Model
Internal
Model
Application 4
Conceptual
requirements
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 40
Entity
• An Entity is an object in the real world (or
even imaginary worlds) about which we
want or need to maintain information
– Persons (e.g.: customers in a business,
employees, authors)
– Things (e.g.: purchase orders, meetings,
parts, companies)
Employee
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 41
Attributes
• Attributes are the significant properties or
characteristics of an entity that help identify it
and provide the information needed to
interact with it or use it (this is the Metadata
for the entities)
Birthdate
First
Middle
Last
IS 202 – FALL 2004
Age
Name
Employee
SSN
Projects
Callie’s Birthday 2004-10-05 - SLIDE 42
Relationships
• Relationships are the associations
between entities
• They can involve one or more entities and
belong to particular relationship types
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 43
Relationships
Student
Attends
Class
Project
Supplier
IS 202 – FALL 2004
Supplies
project
parts
Part
Callie’s Birthday 2004-10-05 - SLIDE 44
Types of Relationships
• Concerned only with cardinality of
relationship
Employee
Employee
Employee
1 Assigned
n
Assigned
1
1
m Assigned n
Truck
Project
Project
Chen ER notation
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 45
Other Notations
Employee
Assigned
Truck
Employee
Assigned
Project
Employee
Assigned
Project
“Crow’s Foot”
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 46
Other Notations
Employee
Assigned
Truck
Employee
Assigned
Project
Employee
Assigned
Project
IDEFIX Notation
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 47
More Complex Relationships
Manager
1/1/1
Employee
1/n/n Evaluation n/n/1
Project
SSN
Date
Project
Employee
4(2-10)
Assigned
1
Manages
Employee
Is Managed By
Project
1
Manages
n
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 48
Weak Entities
• Owe existence entirely to another entity
Part#
Invoice #
Order
Invoice#
Contains
Quantity
Order-line
Rep#
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 49
Supertype and Subtype Entities
Employee
Sales-rep
Is one of
Manages
Clerk
Sold
Other
Invoice
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 50
Many to Many Relationships
SSN
Proj#
Proj#
Hours
Project
Assignment
Is
Assigned
Project
Assigned
Employee
IS 202 – FALL 2004
SSN
Callie’s Birthday 2004-10-05 - SLIDE 51
Lecture Overview
• Review
– MediaStreams
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 52
Discussion
• Why use DBMS for web-based system
development?
• Why Not use IR systems?
• Can you use both?
• Other Questions?
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 53
Lecture Overview
• Review
– MediaStreams
•
•
•
•
•
•
Databases and Database Design
Database Life Cycle
ER Diagrams
Database Design
Discussion
Next Time/Readings
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 54
Next Time
• Database Design – Normalization and
SQL
• Readings
– Hoffer/McFadden “Logical database Design
and the Relational Model”
IS 202 – FALL 2004
Callie’s Birthday 2004-10-05 - SLIDE 55