Slides from Lecture 1 - Courses - University of California, Berkeley

Download Report

Transcript Slides from Lecture 1 - Courses - University of California, Berkeley

Database Management:
Introduction
Ray R. Larson
University of California, Berkeley
School of Information Management
and Systems
SIMS 257: Database Management
IS 257 – Spring 2004
2004.01.20 - SLIDE 1
Overview
• Announcements
– TA – Mayjane Co
• Course Description
• Database Concepts and Terminology
• Database Models
IS 257 – Spring 2004
2004.01.20 - SLIDE 2
Course Overview
•
•
•
•
•
•
Description of the Course
Assignments
Readings
Grading
Schedule
Web site:
http://sims.berkeley.edu/courses/is257/s04
IS 257 – Spring 2004
2004.01.20 - SLIDE 3
Course Description
• This course is concerned with the design
of the database itself -- not with the design
of database system software.
– We will discuss DBMS internals only as they
relate to the database and its design and
structure
• We will spend a fair amount time on
database application design, especially on
options for Web application database
support -- but this will not be primary
focus.
IS 257 – Spring 2004
2004.01.20 - SLIDE 4
Assignments
• Two kinds of assignments
– Using a pre-built database for search and
retrieval and database modification queries
– Designing, populating, and running queries
against your own personal database
• Types of database project
– Individual
» Work related
» Course only
» Projects from around campus that need doing…
– Group
» Course related
» SIMS Final project
IS 257 – Spring 2004
2004.01.20 - SLIDE 5
Readings
• Textbook is:
– Jeffrey A. Hoffer, Mary B. Prescott and Fred
R. McFadden. Modern Database
Management (Sixth Edition). Prentice Hall
(Pearson Educational) : Upper Saddle River,
NJ, 2002.
– ISBN 0-13-061183-2
IS 257 – Spring 2004
2004.01.20 - SLIDE 6
Grading
• Grades will be based on:
– Assignments (30%)
– Personal/Group Database project (60%)
– Class participation (10%)
– (No midterm or final)
IS 257 – Spring 2004
2004.01.20 - SLIDE 7
Schedule
• on website:
• http://sims.berkeley.edu/courses/is257/s04
/Schedule.html
IS 257 – Spring 2004
2004.01.20 - SLIDE 8
What is a Database?
IS 257 – Spring 2004
2004.01.20 - SLIDE 9
Files and Databases
• File: A collection of records or documents
dealing with one organization, person, area or
subject. (Rowley)
– Manual (paper) files
– Computer files
• Database: A collection of similar records with
relationships between the records. (Rowley)
– bibliographic, statistical, business data, images,
etc.
IS 257 – Spring 2004
2004.01.20 - SLIDE 10
Database
• A Database is a collection of stored
operational data used by the application
systems of some particular enterprise. (C.J.
Date)
– Paper “Databases”
• Still contain a large portion of the world’s knowledge
– File-Based Data Processing Systems
• Early batch processing of (primarily) business data
– Database Management Systems (DBMS)
IS 257 – Spring 2004
2004.01.20 - SLIDE 11
Terms and Concepts
• Database Management System -- DBMS
– Software system used to define, create,
maintain and provide controlled access to the
database and repository
IS 257 – Spring 2004
2004.01.20 - SLIDE 12
Terms and Concepts
• Repository
– AKA Data Dictionary
– The place where all metadata for a particular
database is stored
– may also include information on relationships
between files or tables in a particular
database
IS 257 – Spring 2004
2004.01.20 - SLIDE 13
Terms and Concepts
• Metadata
– Data about data
• In DBMS means all of the characteristics
describing the attributes of an entity, E.G.:
–
–
–
–
name of attribute
data type of attribute
size of the attribute
format or special characteristics
– Characteristics of files or relations
• name, content, notes, etc.
IS 257 – Spring 2004
2004.01.20 - SLIDE 14
Why DBMS?
• History
– 50’s and 60’s all applications were custom
built for particular needs
– File based
– Many similar/duplicative applications dealing
with collections of business data
– Early DBMS were extensions of programming
languages
– 1970 - E.F. Codd and the Relational Model
– 1979 - Ashton-Tate & first Microcomputer
DBMS
IS 257 – Spring 2004
2004.01.20 - SLIDE 15
File Based Systems
Application
Delivery
List
Coal
Estimation
Just what
asked for
IS 257 – Spring 2004
File
Toys
Addresses
Naughty
Nice Toys
2004.01.20 - SLIDE 16
From File Systems to DBMS
• Problems with File Processing
systems
– Inconsistent Data
– Inflexibility
– Limited Data Sharing
– Poor enforcement of standards
– Excessive program maintenance
IS 257 – Spring 2004
2004.01.20 - SLIDE 17
DBMS Benefits
•
•
•
•
•
•
Minimal Data Redundancy
Consistency of Data
Integration of Data
Sharing of Data
Ease of Application Development
Uniform Security, Privacy, and Integrity
Controls
• Data Accessibility and Responsiveness
• Data Independence
• Reduced Program Maintenance
IS 257 – Spring 2004
2004.01.20 - SLIDE 18
Terms and Concepts
• Data Independence
– Physical representation and location of data
and the use of that data are separated
• The application doesn’t need to know how or
where the database has stored the data, but just
how to ask for it.
• Moving a database from one DBMS to another
should not have a material effect on application
program
• Recoding, adding fields, etc. in the database
should not affect applications
IS 257 – Spring 2004
2004.01.20 - SLIDE 19
Database Environment
CASE
Tools
Repository
IS 257 – Spring 2004
User
Interface
DBMS
Application
Programs
Database
2004.01.20 - SLIDE 20
Database Components
DBMS
===============
Design tools
Database
Database contains:
User’s Data
Metadata
Indexes
Application Metadata
IS 257 – Spring 2004
Table Creation
Form Creation
Query Creation
Report Creation
Procedural
language
compiler (4GL)
=============
Run time
Form processor
Query processor
Report Writer
Language Run time
Application
Programs
User
Interface
Applications
2004.01.20 - SLIDE 21
Types of Database Systems
•
•
•
•
•
PC Databases
Centralized Database
Client/Server Databases
Distributed Databases
Database Models
IS 257 – Spring 2004
2004.01.20 - SLIDE 22
PC Databases
E.G.
Access
FoxPro
Dbase
Etc.
IS 257 – Spring 2004
2004.01.20 - SLIDE 23
Centralized Databases
Cental
Computer
IS 257 – Spring 2004
2004.01.20 - SLIDE 24
Client Server Databases
Client
Client
Network
Database
Server
Client
IS 257 – Spring 2004
2004.01.20 - SLIDE 25
Distributed Databases
Location C
Location B
computer
computer
computer
Homogeneous
Databases
Location A
IS 257 – Spring 2004
2004.01.20 - SLIDE 26
Distributed Databases
Client
Heterogeneous
Or Federated
Databases
Database
Server
Remote
Comp.
Local Network
Comm
Server
Client
IS 257 – Spring 2004
Remote
Comp.
2004.01.20 - SLIDE 27
Terms and Concepts
• Database Application
– An application program (or set of related
programs) that is used to perform a series of
database activities:
•
•
•
•
•
Create
Read
Update
Delete
On behalf of database users
IS 257 – Spring 2004
2004.01.20 - SLIDE 28
Range of Database Applications
• PC databases
– Usually for individual
• WorkGroup databases
– Small group use where everyone has access
to the database over a LAN
• Departmental databases
– Larger than a workgroup – but similar
• Enterprises databases
– For the entire organization over an intranet (or
sometimes the internet)
IS 257 – Spring 2004
2004.01.20 - SLIDE 29
Terms and Concepts
• Database activities:
– Create
• Add new data to the database
– Read
• Read current data from the database
– Update
• Update or modify current database data
– Delete
• Remove current data from the database
IS 257 – Spring 2004
2004.01.20 - SLIDE 30
Terms and Concepts
• Enterprise
– Organization
• Entity
– Person, Place, Thing, Event, Concept...
• Attributes
– Data elements (facts) about some entity
– Also sometimes called fields or items or domains
• Data values
– instances of a particular attribute for a particular entity
IS 257 – Spring 2004
2004.01.20 - SLIDE 31
Terms and Concepts
• Records
– The set of values for all attributes of a
particular entity
– AKA “tuples” or “rows” in relational DBMS
• File
– Collection of records
– AKA “Relation” or “Table” in relational DBMS
IS 257 – Spring 2004
2004.01.20 - SLIDE 32
Terms and Concepts
• Key
– an attribute or set of attributes used to identify
or locate records in a file
• Primary Key
– an attribute or set of attributes that uniquely
identifies each record in a file
IS 257 – Spring 2004
2004.01.20 - SLIDE 33
Terms and Concepts
• DA
– Data adminstrator - person responsible for the
Data Administration function in an
organization
– Sometimes may be the CIO -- Chief
Information Officer
• DBA
– Database Administrator - person responsible
for the Database Administration Function
IS 257 – Spring 2004
2004.01.20 - SLIDE 34
Terms and Concepts
• Data Administration
– Responsibility for the overall management of data
resources within an organization
• Database Administration
– Responsibility for physical database design and
technical issues in database management
• Data Steward
– Responsibility for some subset of the
organization’s data, and all of the interactions
(applications, user access, etc.) for that data
IS 257 – Spring 2004
2004.01.20 - SLIDE 35
Terms and Concepts
• Models
– (1) Levels or views of the Database
• Conceptual, logical, physical
– (2) DBMS types
• Relational, Hierarchic, Network, Object-Oriented,
Object-Relational
IS 257 – Spring 2004
2004.01.20 - SLIDE 36
Models (1)
Application 1
External
Model
Application 2
Application 3
Application 4
External
Model
External
Model
External
Model
Application 1
Conceptual
requirements
Application 2
Conceptual
requirements
Application 3
Conceptual
requirements
Conceptual
Model
Logical
Model
Internal
Model
Application 4
Conceptual
requirements
IS 257 – Spring 2004
2004.01.20 - SLIDE 37
Data Models(2): History
• Hierarchical Model (1960’s and 1970’s)
– Similar to data structures in programming
languages.
Books
(id, title)
Authors
(first, last)
IS 257 – Spring 2004
Publisher
Subjects
2004.01.20 - SLIDE 38
Data Models(2): History
• Network Model (1970’s)
– Provides for single entries of data and
navigational “links” through chains of data.
Authors
Subjects
Books
Publishers
IS 257 – Spring 2004
2004.01.20 - SLIDE 39
Data Models(2): History
• Relational Model (1980’s)
– Provides a conceptually simple model for data
as relations (typically considered “tables”) with
all data visible.
pubid
Book ID
1
2
3
4
5
Title
pubid
Introductio
The history
New stuff ab
Another title
And yet more
IS 257 – Spring 2004
2
4
3
2
1
Author id
1
2
3
4
5
1
2
3
4
Book ID
pubname
Harper
Addison
Oxford
Que
Authorid
1
2
3
4
5
Author name
Smith
Wynar
Jones
Duncan
Applegate
Subid
1
2
3
4
4
2
1
3
2
3
Subid
Subject
1 cataloging
2 history
3 stuff
2004.01.20 - SLIDE 40
Data Models(2): History
• Object Oriented Data Model (1990’s)
– Encapsulates data and operations as
“Objects”
Books
(id, title)
Authors
(first, last)
IS 257 – Spring 2004
Publisher
Subjects
2004.01.20 - SLIDE 41
Data Models(2): History
• Object-Relational Model (1990’s)
– Combines the well-known properties of the
Relational Model with such OO features as:
• User-defined datatypes
• User-defined functions
• Inheritance and sub-classing
IS 257 – Spring 2004
2004.01.20 - SLIDE 42
Database System Life Cycle
Physical
Creation
2
Conversion
3
Design
1
Growth,
Change, &
Maintenance
6
Integration
4
Operations
5
IS 257 – Spring 2004
2004.01.20 - SLIDE 43
Design
• Determination of the needs of the
organization
• Development of the Conceptual Model
of the database
– Typically using Entity-Relationship
diagramming techniques
• Construction of a Data Dictionary
• Development of the Logical Model
IS 257 – Spring 2004
2004.01.20 - SLIDE 44
Physical Creation
• Development of the Physical Model of the
Database
– data formats and types
– determination of indexes, etc.
• Load a prototype database and test
• Determine and implement security, privacy
and access controls
• Determine and implement integrity
constraints
IS 257 – Spring 2004
2004.01.20 - SLIDE 45
Conversion
• Convert existing data sets and
applications to use the new database
– May need programs, conversion utilities to
convert old data to new formats.
IS 257 – Spring 2004
2004.01.20 - SLIDE 46
Integration
• Overlaps with Phase 3
• Integration of converted applications and
new applications into the new database
IS 257 – Spring 2004
2004.01.20 - SLIDE 47
Operations
• All applications run full-scale
• Privacy, security, access control must be in
place.
• Recovery and Backup procedures must be
established and used
IS 257 – Spring 2004
2004.01.20 - SLIDE 48
Growth, Change & Maintenance
• Change is a way of life
– Applications, data requirements, reports, etc.
will all change as new needs and
requirements are found
– The Database and applications and will need
to be modified to meet the needs of changes
IS 257 – Spring 2004
2004.01.20 - SLIDE 49
Another View of the Life Cycle
Integration
4
Operations
5
Design
Physical
1
Creation Conversion Growth,
2
Change
3
6
IS 257 – Spring 2004
2004.01.20 - SLIDE 50
Next Time
IS 257 – Spring 2004
2004.01.20 - SLIDE 51