Introduction to Database Systems
Chapter 1
Instructor: Mirsad Hadzikadic
Database Management Systems
Ramakrishnan & Gehrke
http://www.sigmod.org/record/issues/0606/index.html
History
1960s
– C. Bachman (GE): network data model, CODASYL
Late 1960s
– IBM IMS: hierarchical data model; SABRE (American Airlines and IBM)
1970
– Edgar Codd (IBM): relational model
1980s
– SQL, IBM System R project, concurrent transaction management (J. Gray)
Late 1980s to 1990s
– DB2, Oracle, Informix, Sybase
– ERP, MRP: Baan, Oracle, PeopleSoft, SAP, Siebel
– Common tasks: inventory, HR, financial analysis
1990s
– Data warehouses (DW), Internet
– Object-oriented, object-relational DBs
Turing award and Turing test?
Why Study Databases?
Data everywhere
Shift from computation to information
– At the “low end:” scramble to Web space
– At the “high end:” scientific applications
Datasets increasing in diversity and volume
– Digital libraries, interactive video, Human
Genome project, EOS project
– ... need for DBMS exploding
DBMS encompasses most of CS
– OS, languages, theory, “A”I, multimedia, logic
What Is a DBMS?
Files vs. Database
A database is a very large, integrated collection of data.
Models real-world enterprise
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking ITCS 6160)
A Database Management System (DBMS) is a
software package designed to maintain and
utilize databases
Why Use a DBMS?
Data independence and efficient access
Data integrity and security
Uniform data administration
Concurrent access, recovery from crashes
Reduced application development time
When not to use a DB?
Data Models
A data model is a collection of concepts for
describing data
A schema is a description of a particular
collection of data, using the given data model
The relational model of data is the most widely
used model today
– Main concept: relation, basically a table with rows
and columns
– Every relation has a schema, which describes the
columns, or fields
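A minimal sketch of these terms, using Python's built-in sqlite3 module. The choice of SQLite, the three-column Students relation and the sample rows are assumptions made for illustration, not part of the slides. The CREATE TABLE statement plays the role of the schema; the inserted rows are the contents of the relation.

```python
import sqlite3

# In-memory database; SQLite is used here only for illustration.
conn = sqlite3.connect(":memory:")

# Schema: the relation's name plus the name and type of each column (field).
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")

# Rows of the relation. These sample values are made up.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Jones", 3.4), ("s2", "Smith", 3.2)],
)

# The schema can be read back from the system catalog.
# Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Students)")]
row_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

print(columns)    # column names and declared types
print(row_count)  # number of rows currently in the relation
```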
Levels of Abstraction
Many (external) views, single conceptual (logical) schema and physical schema.
– Views describe how users see the data
– Conceptual schema defines logical structure
– Physical schema describes the files and indexes used
Figure (layers, top to bottom): View 1, View 2, View 3; Conceptual Schema; Physical Schema
Views and Schemas are defined using DDL; data is modified/queried using DML
Example: University Database
Conceptual schema:
– Students(sid: string, name: string, login: string,
age: integer, gpa:real)
– Courses(cid: string, cname:string, credits:integer)
– Enrolled(sid:string, cid:string, grade:string)
Physical schema:
– Relations stored as unordered files
– Index on first column of Students
External Schema (View):
– Course_info(cid:string, enrollment:integer)
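The three levels above can be sketched in SQL DDL, run here through Python's sqlite3 module. Several details are assumptions: the index name idx_students_sid, the definition of Course_info as a count of Enrolled rows per course, and the sample enrollment rows. SQLite also chooses its own file layout, so the "unordered files" part of the physical schema cannot be expressed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations from the slide.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    -- The index name is hypothetical.
    CREATE INDEX idx_students_sid ON Students (sid);

    -- External schema: one plausible definition of Course_info (an assumption).
    -- Courses with no enrolled students do not appear in this simple version.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment
        FROM Enrolled
        GROUP BY cid;
""")

# Sample data, made up for the example.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "ITCS6160", "A"), ("s2", "ITCS6160", "B"), ("s1", "ITCS6150", "A")],
)

# Users of the view never see Enrolled directly.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)  # [('ITCS6150', 1), ('ITCS6160', 2)]
```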
Data Independence
Applications insulated from how data is
structured and stored
Logical data independence: Protection from
changes in logical structure of data
Physical data independence: Protection from
changes in physical structure of data
One of the most important benefits of using a DBMS!
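A small sketch of physical data independence, assuming SQLite via Python's sqlite3 module and made-up sample data: the application's query text stays the same while the physical structure changes underneath it (here, an index is added). The index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ann", 3.9), ("s2", "Bob", 3.1), ("s3", "Cy", 3.6)],
)

# The application only knows this query, not how the data is stored.
QUERY = "SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name"

before = conn.execute(QUERY).fetchall()

# Physical change: add an index. The query text is untouched.
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")

after = conn.execute(QUERY).fetchall()

print(before)           # [('Ann',), ('Cy',)]
print(before == after)  # True: same answer, different physical structure
```

Views play the analogous role for logical data independence: if a relation is restructured, a view with the old name and columns can keep existing queries working.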
Structure of a DBMS
A typical DBMS has a layered architecture
The figure does not show the concurrency control and recovery components
This is one of several possible architectures; each system has its own variations
Figure (layers, top to bottom): Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB
Figure note: These layers must consider concurrency control and recovery
Transaction Management: ACID properties
Atomicity: All actions in the Xact happen, or none happen
Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent
Isolation: Execution of one Xact is isolated from that of other Xacts
Durability: If a Xact commits, its effects persist
The Recovery Manager guarantees Atomicity & Durability
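Atomicity can be illustrated with a small sketch, assuming SQLite via Python's sqlite3 module. The Accounts table, the CHECK constraint and the transfer function are hypothetical and exist only for this example. A transfer is two updates; if the second one fails, the first must be undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction (hypothetical helper)."""
    try:
        # 'with conn' commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was kept.
        return False

ok = transfer(conn, "A", "B", 30)       # both updates happen
failed = transfer(conn, "A", "B", 500)  # second update fails, first is undone

balances = dict(conn.execute("SELECT name, balance FROM Accounts"))
print(ok, failed, balances)  # True False {'A': 70, 'B': 80}
```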
Motivation of concurrency control
Consistency
Isolation
Example
– Two parallel transactions T1 and T2
– Serial execution
– Execution with interleaving actions
– Example
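The slide's own example is not preserved in this transcript, so the following is a stand-in sketch rather than the original. It assumes two hypothetical transactions on a single data item A: T1 adds 100 and T2 doubles the value. Each transaction reads A into a private copy, computes, then writes back. With no concurrency control, an interleaved schedule can produce a result that matches neither serial order.

```python
# What each (hypothetical) transaction computes on its private copy of A.
COMPUTE = {
    "T1": lambda a: a + 100,  # T1: add 100
    "T2": lambda a: a * 2,    # T2: double
}

def run(schedule, initial=100):
    """Execute a schedule of (transaction, action) steps with no concurrency control.

    Actions: "R" reads A into a private copy, "C" computes on the copy,
    "W" writes the copy back to the database.
    """
    db = {"A": initial}
    local = {}
    for xact, action in schedule:
        if action == "R":
            local[xact] = db["A"]
        elif action == "C":
            local[xact] = COMPUTE[xact](local[xact])
        elif action == "W":
            db["A"] = local[xact]
    return db["A"]

T1 = [("T1", "R"), ("T1", "C"), ("T1", "W")]
T2 = [("T2", "R"), ("T2", "C"), ("T2", "W")]

serial_t1_t2 = run(T1 + T2)  # (100 + 100) * 2 = 400
serial_t2_t1 = run(T2 + T1)  # 100 * 2 + 100 = 300

# Both transactions read A before either writes: T1's update is lost.
interleaved = run([
    ("T1", "R"), ("T2", "R"),
    ("T1", "C"), ("T1", "W"),
    ("T2", "C"), ("T2", "W"),
])

print(serial_t1_t2, serial_t2_t1, interleaved)  # 400 300 200
```

Either serial result (400 or 300) is acceptable; the interleaved result (200) is not, which is the kind of outcome isolation is meant to rule out.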
Motivation of recovery management
Atomicity:
– Transactions may abort (“Rollback”)
Durability:
– What if DBMS stops running? (Causes?)
Desired Behavior after
system restarts:
– T1, T2 & T3 should be
durable
– T4 & T5 should be
aborted (effects not seen)
Figure: timeline of transactions T1, T2, T3, T4 and T5, ending at the crash
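The desired restart behavior can be sketched with a toy log replay. This is a deliberately simplified illustration, not the recovery algorithm of any real DBMS: the log format, the data items and the recover function are all hypothetical, and it assumes the log itself survives the crash. Writes of transactions with a commit record are reapplied; writes of transactions without one are ignored.

```python
# Log as it stood at the crash. T1, T2 and T3 committed; T4 and T5 did not.
# Record formats (hypothetical): (xact, "write", item, value) or (xact, "commit").
log = [
    ("T1", "write", "x", 1),
    ("T1", "commit"),
    ("T4", "write", "u", 99),
    ("T2", "write", "y", 2),
    ("T2", "commit"),
    ("T3", "write", "z", 3),
    ("T5", "write", "v", 77),
    ("T3", "commit"),
    # crash happens here
]

def recover(log):
    """Rebuild the database state from the log after a restart (toy version)."""
    # Pass 1: find the transactions that reached their commit record.
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    # Pass 2: reapply, in log order, only the writes of committed transactions.
    db = {}
    for rec in log:
        if rec[1] == "write" and rec[0] in committed:
            _, _, item, value = rec
            db[item] = value
    return db, committed

recovered_db, committed = recover(log)
print(sorted(committed))  # ['T1', 'T2', 'T3']
print(recovered_db)       # {'x': 1, 'y': 2, 'z': 3}
```

The effects of T4 and T5 (items u and v) do not appear in the recovered state, matching the "effects not seen" requirement above.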
Databases make these folks happy ...
End users and DBMS vendors
DB application programmers
– E.g. smart webmasters
Database administrator (DBA)
– Designs logical/physical schemas
– Handles security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve
Must understand how a DBMS works!
Summary
A DBMS is used to maintain and query large datasets
Benefits include recovery from system crashes,
concurrent access, quick application
development, data integrity and security
Levels of abstraction give data independence
A DBMS typically has a layered architecture
DBAs hold responsible jobs and are
well-paid!
DBMS R&D is one of the broadest,
most exciting areas in CS