Transcript Chap01

The Worlds of Database
Systems
Chapter 1
Database Management Systems
(DBMS)
• DBMS: Powerful tool for creating and
managing large amounts of data efficiently
and allowing it to persist over long periods
of time safely
• DBMS capabilities
– Persistent storage
– Programming interface
– Transaction management
DBMS vs File Systems
• Conventional file systems are inadequate
as database systems.
• They fail to support:
– Efficient search
– Efficient modifications to small pieces of data
– Complex queries
– Controlled buffering of useful data in MM
– Atomic and independent execution of
transactions
DBMS Functionalities
• Allows users to create new databases
• Allows users to querry data
• Support the storage of very large amounts
of data
• Controls access to data from many users
at once
Early Database Management
Systems
• Airline reservation systems
• Banking systems
• Corporate Records
Airline Reservation System
• Data
– Reservations (flight no, seat, meal prefer.)
– Flights (src, dest, departure/arrival times, etc)
– Prices
• Queries
– Flights on a given date, time, city, seats available,
prices
• Modifications
– Booking a flight, assigning a seat
• Requirements
– Concurrent access by many agents (make sure no
two agents assign the same seat simultaneously)
– Avoid loss in case of a sudden system failure
Relational Databases
• Based on the relational model
Student
Course
Quarter
Charles
CS 444
Fall, 1997
Dan
CS 142
…
…
Winter,
1998
…
• Separates the logical view from the
physical view of the data.
Querying a Database
• Find all the students who have taken
CS444 in Fall, 1997.
• S(tructured) Q(uery) L(anguage)
– select E.name
– from Enroll E
– where E.course=CS444 and
–
E.quarter=“Fall, 1997”
• Query processor figures out how to
answer the query efficiently.
Database Industry
• Relational databases are a great success
of theoretical ideas.
• “Big 3” DBMS companies are among the
largest software companies in the world.
• IBM (with DB2) and Microsoft (SQL
Server, Microsoft Access) are also
important players.
• $20B industry
• Challenged by object oriented DBMS.
Evolution
• Smaller and smaller systems
– DBMS’s could only be run on large computers (only
large computers could store gigabyte)
– Now, feasible to run a DBMS on a PC
• Bigger and bigger systems
– Many applications need hundreds of gigabyte
• E.g. A retail chain may store terabytes (1012)
– Newer databases store images, audio, video, etc.
• E.g., database of sattellites images store petabytes (1015)
– Modest size databases stored in Secondary Storage
(Arrays of disks)
Dealing with large data
• One could argue that what distinguishes database
systems from other software is, database systems
assume that data is too big to fit in main memory; thus
kept on disk
• Tertiary Storage
– Capacity of terabytes
– Access times seconds (ms for disks)
– CD’s DVD’s may be the storage medium
• Parallel Computing
– Satisfies the need of fast access to large data
– Speeds up through parallel access to disks/devices
– Needs efficient algorithms to intelligently break queries up
• Client-Server and Multi-Tier Architectures
– Major database components at the server and the
client used to interface with the user
– In multi-tier architectures, client is an application
server, manages connections to the database,
transactions, authorization, etc.
– Client  application server  database server
• Multimedia Data
• Information Integration
– Providing a unified view to the database user even if
data comes from different sources(legacy databases)
using different structures to represent information
– Data Warehouses copy information from many legacy
databases with appropriate translation to a central
database;
• Warehouses reconstructed each night to reflect updates on
legacy databases
Overview of a DBMS
• DDL: Data Definition Language
– Capable of altering the metadata (schema or
structure/constraints information of the database)
– Requires special authority (i.e., used by database
administrator)
• Storage and Buffer Management
– Storage manager keeps track of the location of files
on the disk and obtains the blocks containing a file on
request from the buffer manager which keeps portions
of the disk contents in MM. Information that various
components may need include
• Data, Metadata, Statistics
• İndexes: data structures that support efficient access to the
data
DBMS
Components
Transaction Processing
– Group of one or more database operations into a
transaction which must be executed atomically and in
isolation of other trans.
– A DBMS guarantees durability: the work of a
completed transaction will never be lost.
– The Transaction processor performs 3 tasks.
1) Logging: Every change in the database is logged on disk
(through buffer manager) to enable recovery in case of a crash
2) concurrency control assures
* atomicity: a transaction is performed either completely or
not at all
* isolation: transactions are executed as if there were no other
concurrently executing transactions (uses locks)
3) deadlock resolution (“roll back” or “abort” some transaction)
• Querry Processing
• Querry compiler
– Parser: builds a tree structure from the textual form of the
query
– Preprocessor: performs semantic checks on the query
and translates parse tree into algebraic operators
representing the initial query plan
– Optimizer: transforms the initial query plan into the best
available sequence of operations on actual data
• Execution engine
– Executes the steps of the chosen query plan
– Needs to interact with most of the other components of
the DBMS
DBMS
Components
DBMS Studies
• Database Design
– E-R Model
– Relational data model
– ODL
• Database programming
– SQL
– OQL (Object Query Language)
• Database system implementation
– Storage management
– Query processing
– Transaction management