Notes 01 - Donald Bren School of Information and Computer Sciences

CS222: Principles of Database Management
Fall 2010
Professor Chen Li
Department of Computer Science
University of California, Irvine
Notes 01
Course General Info
• URL: http://www.ics.uci.edu/~cs222/
• Lecture times: T/Th, 11:00 am – 12:20 pm, Bren Hall 1200
• Instructor: Chen Li
• Office Hours: Th, 2 – 3 pm, Bren Hall 2092,
chenli at ics dot uci dot edu, 949-824-9470
Prerequisites
• Undergraduate course in DBMS (CS122A or equivalent)
– DB design, relational model, SQL, OO data model
• Operating systems concepts
– virtual memory, paging, concurrent programming,
semaphores, critical sections, monitors, file and buffer
management
• Basic Computer Science Concepts:
– Depth-first search, directed/undirected graphs, “big O”
notation, computational complexity, NP completeness
…
• Programming: C/C++
Why take CS222?
• DBMS techniques are a key component of the past,
present and future computing infrastructures.
– ALL computer scientists specializing in systems should
have knowledge of DBMS.
• It prepares you for more advanced DBMS
courses/research (e.g., CS223, CS224)
Text Books and Gradiance account
• Required: Database Management Systems, Third
Edition, by Raghu Ramakrishnan, Johannes
Gehrke, available on Amazon.com.
• Recommended textbook: either one of the
following two books:
– Database System Implementation, by Hector Garcia-Molina, Jeffrey Ullman, and Jennifer Widom, Prentice
Hall.
– Database Systems: The Complete Book, by Hector
Garcia-Molina, Jeffrey D. Ullman, Jennifer D. Widom,
Prentice Hall.
Course Requirements
• Assignments: 15%
• Programming Projects: 35%
• Midterm: 25%
• Final: 30%
Assignment Policies
• Projects: done in groups of at most 2 students
• Problem sets: done individually
• Late submissions will not be accepted
• You have two weeks to resolve any grading-related issues. After that, all grades will be
finalized.
DBMS Overview
[Figure: a user issues applications/queries to the DBMS, whose query processor and storage manager access the metadata and data.]
• Data: collection of interrelated information about the world being modeled
• DBMS: general-purpose software to define, create, modify, retrieve, delete and
manipulate a database
• Vendors: IBM (+ Informix), Microsoft, Oracle, Sybase, MySQL, …
Simplified DBMS Architecture
[Figure: simplified DBMS architecture. Inputs: application, queries, schema changes. Query processor: compilers, optimizer, evaluator. Storage manager: buffer manager, transaction manager, file system. Also shown: metadata and data dictionary; database and indices.]
Example
DBMS Goals
• Efficient data management (faster than files)
• Large amount of data
• High reliability
• Information sharing (multiple users)
• DBMS Users:
– E-commerce companies, banks, airlines, transportation
companies, corporate databases, government agencies, …
– Anyone you can think of!
Classification of DBMS
• Relational DBMS:
– Modeling concept: tables and constraints on tables
– Query Language: SQL
– Applications: suited for traditional business processing
• Object-Oriented DBMS
– Modeling concepts: objects, classes, inheritance
– Query Language: object oriented OQL
– Applications: suited for CAD databases, CASE databases, office
automation
• Object-Relational DBMS:
– Incorporate OO concepts into relational model
– Functionality similar to an OO-DBMS, but implemented differently
– Query language: extended to process objects
• XML DBMS
Why not use a traditional file system?
Naïve implementation:
• Records are stored sequentially in a file, separated by
special characters:
“Tom Smith | Bill Jackson | John Wayne |…”
• Queries are answered by retrieving the data from the
file(s), then doing the necessary processing
Q1: select * from emp where sal > 50K;
Q2: select * from emp, dept
where emp.did = dept.did;
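The naive approach above can be sketched as follows. This is only an illustration under stated assumptions, not the course's implementation: the slide shows just names separated by "|", so the record layouts (emp as name, sal, did and dept as did, dname, with comma-separated fields) and the helper names read_records, q1, q2, and demo are all hypothetical.

```python
import os
import tempfile


def read_records(path, sep="|"):
    """Read the WHOLE file, then split it into records (tuples of fields)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return [
        tuple(field.strip() for field in rec.split(","))
        for rec in text.split(sep)
        if rec.strip()
    ]


def q1(emp_path, threshold=50_000):
    """Q1: select * from emp where sal > 50K, answered by a full scan."""
    # Assumed layout: (name, sal, did)
    return [e for e in read_records(emp_path) if int(e[1]) > threshold]


def q2(emp_path, dept_path):
    """Q2: join emp and dept on did by comparing every pair of records."""
    emps = read_records(emp_path)
    depts = read_records(dept_path)  # assumed layout: (did, dname)
    return [e + d for e in emps for d in depts if e[2] == d[0]]


def demo():
    """Write two tiny made-up files and run both queries on them."""
    tmp = tempfile.mkdtemp()
    emp_path = os.path.join(tmp, "emp.txt")
    dept_path = os.path.join(tmp, "dept.txt")
    with open(emp_path, "w", encoding="utf-8") as f:
        f.write("Tom Smith,60000,1 | Bill Jackson,40000,2 | John Wayne,75000,1")
    with open(dept_path, "w", encoding="utf-8") as f:
        f.write("1,Sales | 2,Research")
    return q1(emp_path), q2(emp_path, dept_path)
```

Every query reads and parses entire files, and the join compares all pairs of records, so the cost grows with the product of the two file sizes. The following slides ask what goes wrong with this design.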
Problems
• Record modifications (insert, delete, update)?
• Efficiency of query processing?
• Buffer management?
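To make the first question concrete, here is a small sketch of why updates are awkward in the delimiter-separated layout. It assumes records are variable-length strings joined by " | " and measures positions in characters; record_offsets and update_record are hypothetical helper names, not part of any DBMS.

```python
SEP = " | "


def record_offsets(data, sep=SEP):
    """Map each record to the character offset at which it starts."""
    offsets = {}
    pos = 0
    for rec in data.split(sep):
        offsets[rec] = pos
        pos += len(rec) + len(sep)
    return offsets


def update_record(data, old, new, sep=SEP):
    """Replace one record; the whole string (file) is rebuilt to do so."""
    return sep.join(new if rec == old else rec for rec in data.split(sep))


def demo():
    """Show how one longer record moves every record stored after it."""
    before = "Tom Smith | Bill Jackson | John Wayne"
    after = update_record(before, "Tom Smith", "Thomas Smith")
    return record_offsets(before), record_offsets(after)
```

In this sketch, lengthening the first record by three characters moves every later record by three positions, so an in-place update would have to rewrite the rest of the file.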
Problems (cont)
• Concurrency control with different granularities?
• Data reliability?
• Application Programming Interface (API)?
Main DB courses @ UCI
[Figure: course map. Undergrad: CS122A (Intro), CS122B (Project-oriented). Grad: CS222 (DB Principles), CS223 (Distributed DBs and transactions), CS224 (Advanced Topics).]
Key Database Technologies
• File Management
– provides a file abstraction as a collection of records stored on disk
• Index Management and Access Methods
– implements techniques for associative access to data
• Query Optimization and Processing
– given a query and data storage structures, determines an efficient strategy
to evaluate the query.
• Transaction management
– ensures consistency of the database in presence of concurrent transactions
and various types of failures
• Catalog Management
– maintains database schema information
• Authorization and Integrity Management
– tests for integrity constraints and user authorization
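As one illustration of "associative access", the sketch below builds a simple in-memory hash index that maps a key value to the byte offsets of the matching records, so a lookup can seek directly to a record instead of scanning the file. It is a minimal sketch under assumptions: records are newline-terminated with comma-separated fields, and build_index and lookup are hypothetical names. Real access methods, such as the tree and hash indexes covered in the textbook, are disk-based and considerably more involved.

```python
import io


def build_index(f, key_field=0, sep=b","):
    """Scan the file once, mapping each key to the offsets of its records."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()  # byte position where the next record starts
        line = f.readline()
        if not line:
            break
        key = line.rstrip(b"\n").split(sep)[key_field]
        index.setdefault(key, []).append(offset)
    return index


def lookup(f, index, key):
    """Fetch the records with the given key by seeking to stored offsets."""
    rows = []
    for offset in index.get(key, []):
        f.seek(offset)
        rows.append(f.readline().rstrip(b"\n").decode())
    return rows


def demo():
    """Index a small made-up dept file held in memory and look up one key."""
    f = io.BytesIO(b"1,Sales\n2,Research\n3,Support\n")
    index = build_index(f)
    return index, lookup(f, index, b"2")
```

The index costs one full scan to build and must be maintained on every insert, delete, and update, which is part of what index management has to handle.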