Week 3 - Sacramento State

Download Report

Transcript Week 3 - Sacramento State

1
Week 3
September 12
• Three-Level Architecture
• Database Management System (DBMS)
• Relational Data Model
• Views
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Privacy and Confidentiality
SSN: 123 45 6789
Customer: John K Smith
Address: 1234 Main Street
Dallas, TX 68213
Account: 5432 1234 4567 8901
Credit limit: 20,000
Current balance: 9,123.00
Employer: Enron Corp.
Monthly income: 100,000.00
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
2
Schema and Subschemas
3
User
User
Subschema
User
User
Subschema
Schema
User
User
Individual
Subschema
Views
Complete catalog of all data
retained in the database
DBMS Software Manages the database
Physical Database
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Database Environment
Three Level Architecture
External
Level
4
User’s view of the database
Conceptual Level
Internal Level
Conceptual Schema
Community
view
Internal Schema
Physical
representation
Physical data organization
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Physical storage
Independence
• Each user should be able to access the same data, but have
a different customized view of the data
• Users should not have to deal directly with physical
database storage details
• The DBA should be able to change the database storage
structures without affecting the users’ views
• The internal structure of the database should be unaffected
by changes to the physical aspects of storage
• The DBA should be able to change the conceptual or
global structure of the database without affecting all users
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
5
Three-Level Architecture
• External Level
Describes that part of the database that is relevant to a
particular user
• Conceptual Level
Describes what data is stored in the database and
relationships among the data
• Internal Level
Describes how the data is stored in the database
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
6
Internal Conceptual
External
Database Environment
Three Level Architecture
Subschema
External/conceptual
mapping
Conceptual/internal
mapping
Subschema
7
Subschema
Conceptual Schema
Logical data
independence
Internal Schema
Physical data
independence
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Internal Conceptual
External
Database Environment
Three Level Architecture
Subschema
External/conceptual
mapping
Conceptual/interna
l mapping
Subschema
8
Subschema
Conceptual Schema
Logical data
independence
Internal Schema
Physical data
independence
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Internal Conceptual
External
Database Environment
Three Level Architecture
Subschema
External/conceptual
mapping
Conceptual/internal
mapping
Subschema
9
Subschema
Conceptual Schema
Internal Schema
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Logical data
independence
Physical data
independence
Data Independence
• Logical data independence
Immunity of external schemas to changes in the conceptual
schema
• Physical data independence
Immunity of the conceptual schema to changes in the
internal schema
“Plug and Play!”
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
10
Database Environment
User
User
Subschema
User
11
User
Subschema
Schema
User
User
Individual
Subschema
Views
Complete catalog of all data
retained in the database
DBMS Software Manages the database
Shared and Managed
Physical Database
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
File-Based Systems
User
User
Subschema
File
File
Each user has
his/her file
User
12
User
User
Subschema
User
Subschema
Integrity Problems
File
Schema
DBMS Software
File
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Everyone has access
to all data in the file
Database Languages: DDL vs. DML
• Data definition language (DDL)
Used to describe name the entities required for the
application and the relationships that may exist between
the different entities
– Specify or modify the database schema and subschemas
• Data manipulation language (DML)
Provides a set of operations that support the basic data
manipulation operations the data
– Read and update (i.e., insert, update, delete) the
database
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
13
Models
• Represents the real thing
• Identifies the components and their interactions
• Specifies the behavior
For example...
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
14
Data Models
• An integrated collection of concepts for describing and
manipulating data, relationships between data and
constraints on the data in an organization
• Three components:
– Structural part - set of rules applied to the construction
of the database
– Manipulative part - defines the types of operations
allowed on the data
– Integrity rules - ensures the accuracy of the data
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
15
Data Models
• Object-based
– Entity-relationship
– Semantic
Knowledge-based
– Functional
– Object-oriented
Object-relational
• Record-based (transactions)
– Relational
Transaction-based
– Network
– Hierarchical
• Physical
– Unifying
How data are stored
– Frame memory
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
16
Record-Based Data Models
• Relational (Oracle, DB2, Sybase, Informix, SQL 7, Ingres,
etc.)
– Based on concepts of mathematical relations
– Tables, rows, columns
• Network (CODASYL - COnference on DAta SYstem
Languages) (Image)
– Many-to-many relationships
– Record types, data items
• Hierarchical (IMS)
– Segment types, fields
In COBOL: files, records, fields
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
17
Database
18
Management
Queries
Physical
Database
DBMS
Manages the database
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Application
Programs
Other
Software
Functions of a DBMS
•
•
•
•
•
•
•
•
•
•
Data storage, retrieval and update
User-accessible catalog
Transaction support
Concurrency control services
Recovery services
Authorization services
Support for data communications
Integrity services
Services to promote data independence
Utility services
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
19
DBMS
20
Programmers
Application
Programs
Users
DBA
Queries
Database
Schema
DML Processor
Query Processor
DDL Processor
Program Object
Code
Database
Manager
Dictionary
Manager
DBMS
Access Methods
File Manager
System Buffers
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Operating
System
Database Manager
Processes query
Integrity Checker
Checks integrity
constraints
Data Manager
21
Authorization
Control
Command
Processor
Checks user authorization
Determines optimal strategy
Query
Optimizer
Transaction
Manager
Scheduler
Buffer Manager
Recovery
Manager
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Database Manager
22
Authorization
Control
Integrity Checker
Command
Processor
Query
Optimizer
Manages concurrent operations
Performs command
operation
Transaction
Manager
Scheduler
Ensures recovery in case of failures
Transfers data between
primary and secondary
storage
Buffer Manager
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Recovery
Manager
Database Manager
23
Authorization
Control
Integrity Checker
Command
Processor
Query
Optimizer
Manages concurrent operations
Query 
TransactionTransaction
 Journal  Buffered  OS
Performs
command
Scheduler
operation
Manager
Ensures recovery in case of failures
Transfers data between
primary and secondary
storage
Buffer Manager
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Recovery
Manager
System Catalog
• A repository of information describing the data in the
database (metadata)
• Stores
– Names of users authorized to access the DBMS
– Names of all data items in the database
• Types and sizes
• Constraints
– Data items and authorization level granted to each user
• Active vs. Passive
• Integrated vs. Standalone
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
24
25
Relational Data Model
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Relational Model
26
• Objectives
– A degree of data independence
– Address data semantic, consistency and redundancy
problems
– Set-oriented data manipulation language
• Structured Query Language (SQL)
Database Criteria
Data Set
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Presentation
method
Information
Data Set
Presentation
method
27
Information
Criteria
Data Set
Presentation
method
Information
Database
Data Set
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Presentation
method
Information
Domain = all values an attribute can assume
28
Entity
Tuples (rows)
• Cardinalitiy =
number of
tuples
Relation
Attributes (columns)
• Degree of a relation = number of attributes
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Domain of an Attribute
29
• Set of allowable values for one or more attributes
Attribute 1
Domain
Attribute 2
Union
or
Intersection
Information
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Domain
Properties of Relations
• Distinct (i.e., unique) relation name
• Each cell contains exactly one atomic (single) value
– No repeating groups
• Distinct attribute name
• The values of an attribute come from the same domain
• Order of attributes has no significance
• Each tuple is distinct (i.e., unique)
– No duplicate tuples
• Order of tuples has no significance
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
30
Unique Identification of a Relation
Relation
key
Superkey
Candidate key
Primary key
Foreign key
?
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
31
Identifying a Tuple
• Superkey
An attribute or a set of attributes that uniquely identifies a
tuple within a relation
• Candidate key
A super key such that no proper subset is a superkey within
the relation
– Uniquely identifies the tuple (uniqueness)
– Contains no unique subset (irreducibility)
• Primary key
The candidate key that is selected to identify tuples
uniquely within a relation
– Should remain constant over the life of the tuple
– Most efficient way of identifying a tuple
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
32
Finding the Primary Key
Super Key
Candidate Key
Primary key
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
33
Keys
34
CDs Relation
129341 Help!
Beatles
Columbia
1-29150-8384-0
129342 Hard Day’s Night
Beatles
Columbia
1-29150-7115-0
129343 Sergeant Pepper’s
Beatles
Columbia
1-29150-2484-0
129344 Magical Mystery Tour
Beatles
Columbia
1-29150-7515-0
129345 Abbey Road
Beatles
Apple
1-15700-9510-0
Attributes
• Catalog number
• Record title
• Artist name
• Record label
• UPC
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Superkey?
Candidate key?
Primary key?
Selecting a Key
• Criteria
– An efficient way of identifying an entity
– The attribute (value) remains constant over the life of
the entity
• Never changes
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
35
Identifying a Tuple
36
• Foreign key
An attribute or set of attributes within one relation that
matches the candidate key of some (possibly the same)
relation
Relation
key
foreign key
Relation
key
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Foreign Key
37
CDs Relation
129341 Help!
Beatles
COL
1-29150-8384-0
129342 Hard Day’s Night
Beatles
COL
1-29150-7115-0
129343 Sergeant Pepper’s
Beatles
COL
1-29150-2484-0
129344 Magical Mystery Tour
Beatles
COL
129345 Abbey Road
Beatles
APP
Recording Label
Relation
(home relation)
Must match!
1-29150-7515-0
1-15700-9510-0
COL
Columbia Records
APP
Apple Records
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Relational Integrity
Constraints placed on the set of values allowed for the
attributes of a relation.
• Entity integrity
– No attribute of a primary key can be null (every tuple
must be uniquely identified)
• Referential integrity
– If a foreign key exists in a relation, either the foreign
key value must match a candidate key value of some
tuple in its home relation, or the foreign key value
must be wholly null (i.e., no key exists in the home
relation)
• Enterprise constraints (organizational)
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
38
Null Value
• Absence of any value (i.e., unknown or nonapplicable to a
tuple)
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
39
Views
• A view is a virtual relation or one that does not actually
exist, but dynamically derived
– Can be constructed by performing operations (i.e.,
select, project, join, etc.) on values of existing base
relations
• Base relation - a named relation, corresponding to
an entity in the conceptual schema, whose tuples are
physically stored in the database
• View - a dynamic result of one or more relational
operations operating on the base relations to produce
another
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
40
Schema and Subschemas
41
Internal Level
Physical Database
DBMS
Some end-user applications
can be supported by views
External
Subschema
Level
User
User
DBMS Software
Schema Conceptual Level
Subschema
User
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
User
Subschema
User
User
Views
42
Base Relation R
Key
Base Relation S
Foreign
Key
Key
Criterion
View
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
Purpose of Views
• Provides a powerful and flexible security mechanism by
hiding parts of the database from certain users
• Permits user access in a way that is customized to their
needs
• Simplify complex operations on the base relations
• Designed to support the external model
• Provides logical independence
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
43
Updating Views
• Allowed on views
– Derived from a single base relation, and
– Containing the primary key or a candidate key
• NOT allowed on views
– Derived from multiple base relations
– Involving aggregations (i.e., summations) or groups
operations
• Vendors may have other constraints on updating views
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento
44
45
R. Ching, Ph.D. • MIS Dept. • California State University, Sacramento