Chapter 1 Introduction to DBMS Concepts

Download Report

Transcript Chapter 1 Introduction to DBMS Concepts

Chapter 1 Introduction to DBMS
Concepts
IAM6133 Advanced Database
Technology
Topic 1
Introduction to DBMS Concepts
(Database Management Systems)
Application
program
DBMS
End-user
1-2
Outline of Unit 1
1.1 Information Systems
1.2 An Overview of a Database System
1.3 An Architecture for a Database
System
1.4 Database Technology Trends
1-3
1.1 Information Systems
1-4
Stages of Information System
– Stage 0: Manual Information System
• Records
• Files
• Index Cards
– Stage 1: Sequential Information Systems
• Tapes
• Files
• slow, non-interactive, redundancy,... .
– Stage 2: File Based Information Systems
• Disk (direct access)
• application program has its own file
data dependence
• data redundancy
– Stage 3: DBMS based Information Systems
• Generalized data management software
• Transaction processing
1-5
File Based Information Systems
– Conventional Data Processing techniques:
Enterprise:
Application
System A
Application
System B
Application
Program A
File System A
Application
Program B
Customer
File System B
Invoice
File System N
Inventory
…
Application
System N
Application
Program N
1-6
File Based Information Systems (cont.)
Customer
Invoice
Customer No.
Customer Name
Part No.
Quantities
Unit Price
Customer No.
Customer Name
Customer Addr.
Social Security ID
Inventory
Parts
Part No.
Part Description
Unit Price
Supplier
Quantities Remain
Quantities Ordered
Part No.
Part Description
Supplier
Quantities Ordered
Customer Name
Unit Price
1-7
User A1
User A2
User B1
User B2
User B3
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
C, Pascal
DSL (Data Sub Language)
e.g. SQL
1
3
2
External View
@ # &
External
schema
A
External/conceptual
mapping A
Conceptual
schema
<
External
schema
B
External View
B
External/conceptual
mapping B
Database
management
system dictionary
(DBMS) e.g. system
Conceptual
View
catalog
Conceptual/internal
mapping
DBA
(Build and
maintain
schemas
and
mappings)
Storage
structure
definition
(Internal
schema)
1 2 3
Stored database (Internal View)
1-8
#
...
100
& @
DBMS based Information Systems:
Basic Approach - Integration
– (1) Integration of Information
• Description of the integrated view of data is the
"Conceptual Schema"
of the database
DBMS
Application
program
End-user
1-9
DBMS based Information Systems:
Basic Approach – Simple views and High level language
– (2) Provide simple views (External Schema) and high level language (e.g.
SQL) for users to manipulate (handle) data
• High level language: e.g. SQL (Structured Query Language)
<e.g.>: SELECT SNAME
FROM S
WHERE S#= 'S4';
• Description of user's view of data is the "external schema" or
"subschema" or "view".
• High-level languages (Query Language): SQL
(1) Data Definition Language:
define format
(2) Data Manipulation Language:
retrieve, insert, delete, update
• Emphasize: EASE OF USE !!
1-10
S
S# name
DBMS based Information Systems:
Basic Approach - Storage/Access Method
– (3) Efficient Storage/Access Techniques:
• implemented once rather than duplicated in all application
programs.
User: query in SQL
Language Processor
DBMS
Access Methods Calls
Access Method
(B+ tree, Dynamic Hashing)
I/O calls
1-11
DBMS based Information Systems:
Basic Approach - Transaction Management
– (4) Provide Transaction Management:
• Concurrency Control
• Recovery
• Security
•
.
:
1-12
Example: A Simple Query Processing
Query in SQL:
SELECT CUSTOMER. NAME
FROM CUSTOMER, INVOICE
WHERE REGION = 'N.Y.' AND
AMOUNT > 10000 AND
CUTOMER.C#=INVOICE.C
DBMS
Language Processor
Internal Form :
( (S
SP)
Optimizer
Language
Processor
Operator :
SCAN C using region index, create C
SCAN I using amount index, create I
SORT C?and I?on C#
JOIN C?and I?on C#
EXTRACT name field
Operator Processor
Calls to Access Method:
OPEN SCAN on C with region index
GET next tuple
.
.
.
Access Method
e.g.B-tree; Index;
Hashing
Calls to file system:
GET10th to 25th bytes from
block #6 of file #5
File System
1-13
database
Access
Method
1.2 An Overview of a
Database System
1-14
An Example
• The Wine Cellar Database:
Cellar: Bin Wine
2
3
6
12
16
43
50
51
Producer
Year Bottle Ready
Chardonnay Buena Vista 83
Chardonnay
Louis Martini
Chardonnay
Chappellet
Jo. Riesling
Buena Vista
Jo. Riesling
Sattui
Cab. Sauvignon Robt. Mondavi
Pinot Noir
Mirassou
Pinot Noir
Ch. St. Jean
1
81
82
82
82
77
77
78
85
5
4
1
1
12
3
2
• Retrieval:
• DML (Data Manipulation Language):
SELECT Wine, Bin, Producer
FROM Cellar
WHERE Ready = 85;
• Result:
Wine
Bin
Chardonnay 2
Chardonnay 6
Pinot Noir
50
Producer
Buena Vista
Chappellet
Mirassou
1-15
84
85
83
83
87
85
86
Comments
Thanksgiving
Late Harvest
Very dry
Harvest
An Example (cont.)
• Deletion:
• DML: DELETE FROM Cellar
WHERE Ready < 86;
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir
Producer
Year Bottle Ready Comments
Robt. Mondavi 77
Ch. St. Jean
78
12
87
2
86
• Insertion:
• DML: INSERT INTO Cellar
VALUES (53, 'Pinot Noir', 'Franciscan', 79, 1, 86, 'for Joan');
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir
53 Pinot Noir
Producer
Year Bottle Ready
Robt. Mondavi 77
Ch. St. Jean
78
Franciscan
79
1-16
12
Comments
87
2
1
86
86
for Joan
An Example (cont.)
• Update
• DML: UPDATE
Cellar
SET Bottles = 4
WHERE Bin = 51;
• Result:
Bin
Wine
43 Cab. Sauvignon
51 Pinot Noir
53 Pinot Noir
Producer
Year
Bottle Ready Comments
Robt. Mondavi 77
Ch. St. Jean
78
Franciscan
79
12
1-17
87
4
1
86
86
for Joan
What is a Database System?
DBMS
Application
program
End-user
1-18
What is a Database System? (cont.)
• Major components of a database system:
•
•
•
•
Data: integrated and shared.
Hardware: disk, CPU, Main Memory, ...
Software: DBMS
Users:
1. Application programmers
2. End users
3. Database administrator (DBA)
» Defining external schema
» Defining conceptual schema
» Defining internal schema
» Liaison with users
» Defining security and integrity checks
» Defining backup and recovery procedures
» Monitoring performance and changing requirements
1-19
Why Database ?
•
•
•
•
•
•
•
Redundancy can be reduced
Inconsistency can be avoided
The data can be shared
Standards can be enforced
Security restrictions can be applied
Integrity can be maintained
Provision of data independence
objective !
1-20
Data Independence
• Application Program
 Data Structure
• Immunity of application to change in
storage structure and access strategy.
1-21
Data Dependence vs. Data Independence
S
• Data Dependent
S#
S1
S2
:
Sn
e.g. SELECT CITY
FROM S
WHERE ITEM = 'X';
• Linked list: TOP
S
Top
s1
sn
s2
if item = TOP . item then .........
• Tree:
.
if item < root.data then root := root .left ..........
• Array:
if A[I] = item then ............
• Storage structure changed  program changed
1-22
1.3 An Architecture for a
Database System
1-23
User A1
User A2
User B1
User B2
User B3
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
Host
Language
+ DSL
(Data Sub Language)
e.g. SQL, QUEL
External View
@ #A &
External
schema
A
External
schema
B
External/conceptual
mapping A
Conceptual
schema
<
External/conceptual
mapping B
Conceptual
@
View
#A
&
Database
management
system
(DBMS)
Conceptual/internal
mapping
DBA
(Build and
maintain
schemas
and
mappings)
External View
B
Storage
structure
definition
(Internal
schema)
Stored database (Internal View)
1-24
#
& @
An Example of the Three Levels
• Internal level:
STORED_EMP length = 18
PREFIX
TYPE = BYTE(6), OFFSET = 0, INDEX = EMPX
EMP#
TYPE = BYTE(6), OFFSET=0,
DEPT#
TYPE = BYTE(4), OFFSET = 12
PAY
TYPE = FULLWORD, OFFSET = 16
•
Conceptual level:
EMPLOYEE
EMPLOYEE_NUMBER
DEPARTMENT_NUMBER
SALARY
CHARACTER(6)
CHARACTER(4)
NUMERIC(5)
• External level:
– ( PL /I )
DCL 1 EMP
2
EMP# CHAR(6)
2
SAL
FIXED BIN(31)
– (COBOL)
01 EMPC
02 EMPNO PIC X(4)
02 DEPTNO PIC X(4)
1-25
Functions of the DBMS
•
•
•
•
•
•
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Security and Integrity
Data Recovery and Concurrency
Data Dictionary
Performance
1-26
1.4 Database Technology Trends
1960s to
Mid-1970s
1970s to
Mid-1980s
Late
1980s
Future
Merging data models,
knowledge representation,
and programming
languages
Network
Hierarchical
Relational
Semantic
Object-oriented
Logic
Database
Hardware
Mainframes
Mainframes
Minis
PCs
Faster PCs
Workstations
Database machines
User
Interface
None
Forms
Query
languages
- SQL, QUEL
Graphics
Menus
Query-by-forms
Natural language
Speech input
Program
Interface
Procedural
4GL
Logic programming
Integrated database
and programming
language
Presentation
and display
processing
Reports
Processing
data
Business graphics
Image output
Knowledge
processing
Generalized display
managers
Distributed knowledge
processing
Data Model
Embedded
query
language
Report
generators
Information
and transaction
processing
1-27
Parallel processing
Optical memories
Distributed Databases
Database
Distributed System
• Distributed database is a database that is
not stored in its entirety at a single physical
location,
but rather is spread across a
< e.g.>
Hsinchu
network of computer.
computers
Taipei
Kaoshiung
communication
links
Taichung
1-28
Distributed Databases (cont.)
• Advantages:
– efficiency of local processing
– data sharing
• Disadvantages:
– communication overhead
– implementation difficulties
• Reference:
S. Ceri and G. Pelagatti
"Distributed Databases: principles and systems"
1-29
Multi-Database/Heterogeneous Database
Multidatabase
IMS
Hierarchical
Model
INGRES
Relational
Model
• semantic inconsistency
• data incompleteness
• global schema
1-30
.. .
ORION
ObjectOriented
Model
DB + AI
Query
Database
DBMS
Language Processor
Query Optimizer
Knowledge
Base
Operator Processor
Access Method
File Manager
Logical
DB design
Distributed
DB design
Knowledge
Base
Knowledge
Base
1-31
AI
KBMS
Database
Logic
– A Combined Model :
Logic Programming + Relational DB
Query :
? :- ancestor (taro, Y)
? :- grandfather (?, c)
User Program
– Three layers :
Knowledge
management
program
IDB
relational
interface
Relational DB
management
program
EDB
1-32
IDB:
ancestor(X,Y):- parent(X,Y)
ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y)
parent(X,Y):-edb(father(X,Y))
parent(X,Y):-edb(mother(X,Y))
grandfather(X,Z):- father(X,Y) ^ father(Y,Z)
EDB:
fathe
r
father son
A
B
X
Y
.
. .
B
C
mother
.
Database Object-Oriented
OODB
• A typical Document : MEMO [Woelk86, SIGMOD]
MCC
To: W. Kim
From: D. Woolk
Date: September 18, 1992
Subject: Workstations
can be heard
In the computer center of National Chiao-Tung University, there are
a lot of workstations. There are HP RS serials, SUNs, Apollo, and so
on. The students in NCTU learn to use workstation since they are freshmen. The configuration of the workstations follows:
Workstation
.
speaker
voice message
associated
.
.
Workstation
}
text
}
}
graphics
Database
Server
In the course introduction to Computer Science? students do their
homework's on workstations.
image
1-33
Use of a Database Management System in Design
and Application
Database
Detailed
Design
Manufacturing
Design
Release
Control
Fabrication
Assembly
Info
Analysis
Models
DBMS
Database Management
System
Preliminarydesign
INTERFACE
Graphic Interface
Test /
Inspection
Language Interface
APPLICATION
Design
Analysis
Design
Verification
Manufacturing
Planning
1-34
Evaluation
Production
Control
Synthesis
Release
Project
Management
Fuzzy Database
Database
– Fuzzy Query
<e.g.>
SELECT
FROM
WHERE
STUDENT.NAME
STUDENT
SEX = M
AND HEIGH = TALLER
AND WEIGH = SLIMMER
STUDENT:
NAME SEX HEIGHT WEIGHT IQ
Mary
Linda
.
.
.
<e.g.>
F
F
.
.
.
158
165
.
.
.
SELECT
FROM
WHERE
55
55
.
.
.
High
Medium
.
.
.
STUDENT.NAME
STUDENT
IQ >= 130
1-35
Fuzzy Set Theory
More?
AI
DB
1993
DB
?
?
2001
VR
DB
DB
1997
1999
DB
?
1995
2003
DNA/BioInfo
.
?
DB
WWW
2004
?
2006
?
1-36
??? SARS
?
DB
Mobile/video
DB
911/ Anthrax
???
?