Introduction to DB
Download
Report
Transcript Introduction to DB
Introduction to Database
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See www.db-book.com for conditions on re-use
Index
DBMS
What, Why, How
Relational databases
Table, SQL
Advanced issues
Designing database
Indexing
Transaction
Concurrency control
Recovery System
DBC
Database System Concepts - 6th Edition
2
©Silberschatz, Korth and Sudarshan
Database Management System (DBMS)
DBMS contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and efficient to use
Database Applications:
Banking: transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Databases can be very large.
Databases touch all aspects of our lives
Database System Concepts - 6th Edition
3
©Silberschatz, Korth and Sudarshan
University Database Example
Application program examples
Add new students, instructors, and courses
Register students for courses, and generate class rosters
Assign grades to students, compute grade point averages (GPA)
and generate transcripts
In the early days, database applications were built directly on top of
file systems
Database System Concepts - 6th Edition
4
©Silberschatz, Korth and Sudarshan
Drawbacks of Using File Systems to Store Data
Data redundancy and inconsistency
Multiple file formats, duplication of information in different files
Difficulty in accessing data
Need to write a new program to carry out each new task
Data isolation — multiple files and formats
Integrity problems
Integrity constraints (e.g., account balance > 0) become “buried”
in program code rather than being stated explicitly
Hard to add new constraints or change existing ones
Database System Concepts - 6th Edition
5
©Silberschatz, Korth and Sudarshan
Drawbacks of Using File Systems to Store Data (Cont.)
Atomicity of updates
Failures may leave database in an inconsistent state with partial
updates carried out
Example: Transfer of funds from one account to another should either
complete or not happen at all
Concurrent access by multiple users
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Example: Two people reading a balance (say 100) and updating it
by withdrawing money (say 50 each) at the same time
Security problems
Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
Database System Concepts - 6th Edition
6
©Silberschatz, Korth and Sudarshan
Example of a Relation
attributes
(or columns)
tuples
(or rows)
Database System Concepts - 6th Edition
7
©Silberschatz, Korth and Sudarshan
Attribute Types
The set of allowed values for each attribute is called the domain
of the attribute
Attribute values are (normally) required to be atomic; that is,
indivisible
The special value null is a member of every domain
The null value causes complications in the definition of many
operations
Database System Concepts - 6th Edition
8
©Silberschatz, Korth and Sudarshan
Create Table Construct
An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
r is the name of the relation
each Ai is an attribute name in the schema of relation r
Di is the data type of values in the domain of attribute Ai
Example:
create table instructor (
ID
char(5),
name
varchar(20),
dept_name varchar(20),
salary
numeric(8,2))
Database System Concepts - 6th Edition
9
©Silberschatz, Korth and Sudarshan
Keys
Let K R
K is a superkey of R if values for K are sufficient to identify a unique
tuple of each possible relation r(R)
Example: {ID} and {ID,name} are both superkeys of instructor.
Superkey K is a candidate key if K is minimal
Example: {ID} is a candidate key for Instructor
One of the candidate keys is selected to be the primary key.
which one?
Foreign key constraint: Value in one relation must appear in another
Referencing relation
Referenced relation
Database System Concepts - 6th Edition
10
©Silberschatz, Korth and Sudarshan
Integrity Constraints in Create Table
not null
primary key (A1, ..., An )
foreign key (Am, ..., An ) references r
Example: Declare branch_name as the primary key for branch
create table instructor (
ID
char(5),
name
varchar(20) not null,
dept_name varchar(20),
salary
numeric(8,2),
primary key (ID),
foreign key (dept_name) references department);
primary key declaration on an attribute automatically ensures not null
Database System Concepts - 6th Edition
11
©Silberschatz, Korth and Sudarshan
Updates to tables
Insert
insert into instructor values (‘10211’, ’Smith’, ’Biology’, 66000);
Delete
delete from student
Drop Table
drop table r
Alter
alter table r add A D
where A is the name of the attribute to be added to relation
r and D is the domain of A.
All tuples in the relation are assigned null as the value for
the new attribute.
alter table r drop A
where A is the name of an attribute of relation r
Dropping of attributes not supported by many databases.
Database System Concepts - 6th Edition
12
©Silberschatz, Korth and Sudarshan
Basic Query Structure
A typical SQL query has the form:
select A1, A2, ..., An
from r1, r2, ..., rm
where P
Ai represents an attribute
Ri represents a relation
P is a predicate.
The result of an SQL query is a relation.
Database System Concepts - 6th Edition
13
©Silberschatz, Korth and Sudarshan
The where Clause
The where clause specifies conditions that the result must satisfy
Corresponds to the selection predicate of the relational algebra.
To find all instructors in Comp. Sci. dept with salary > 80000
select name
from instructor
where dept_name = ‘Comp. Sci.' and salary > 80000
Comparison results can be combined using the logical connectives and,
or, and not.
Comparisons can be applied to results of arithmetic expressions.
Database System Concepts - 6th Edition
14
©Silberschatz, Korth and Sudarshan
The from Clause
The from clause lists the relations involved in the query
Corresponds to the Cartesian product operation of the relational
algebra.
Find the Cartesian product instructor X teaches
select
from instructor, teaches
generates every possible instructor – teaches pair, with all attributes
from both relations.
Cartesian product not very useful directly, but useful combined with
where-clause condition (selection operation in relational algebra).
Database System Concepts - 6th Edition
15
©Silberschatz, Korth and Sudarshan
Cartesian Product
teaches
instructor
Database System Concepts - 6th Edition
16
©Silberschatz, Korth and Sudarshan
Joins
For all instructors who have taught courses, find their names and the
course ID of the courses they taught.
select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID
Find the course ID, semester, year and title of each course offered by the
Comp. Sci. department
select section.course_id, semester, year, title
from section, course
where section.course_id = course.course_id and
dept_name = ‘Comp. Sci.'
Database System Concepts - 6th Edition
17
©Silberschatz, Korth and Sudarshan
Ordering the Display of Tuples
List in alphabetic order the names of all instructors
select distinct name
from instructor
order by name
We may specify desc for descending order or asc for ascending
order, for each attribute; ascending order is the default.
Example: order by name desc
Can sort on multiple attributes
Example: order by dept_name, name
Database System Concepts - 6th Edition
18
©Silberschatz, Korth and Sudarshan
Null Values
It is possible for tuples to have a null value, denoted by null, for some
of their attributes
null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null
Example: 5 + null returns null
The predicate is null can be used to check for null values.
Example: Find all instructors whose salary is null.
select name
from instructor
where salary is null
Database System Concepts - 6th Edition
19
©Silberschatz, Korth and Sudarshan
Aggregate Functions
These functions operate on the multiset of values of a column of
a relation, and return a value
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Database System Concepts - 6th Edition
20
©Silberschatz, Korth and Sudarshan
Aggregate Functions (Cont.)
Find the average salary of instructors in the Computer Science
department
select avg (salary)
from instructor
where dept_name= ’Comp. Sci.’;
Find the total number of instructors who teach a course in the Spring
2010 semester
select count (distinct ID)
from teaches
where semester = ’Spring’ and year = 2010;
Find the number of tuples in the course relation
select count (*)
from course;
Database System Concepts - 6th Edition
21
©Silberschatz, Korth and Sudarshan
Aggregate Functions – Group By
Find the average salary of instructors in each department
select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name;
avg_salary
Database System Concepts - 6th Edition
22
©Silberschatz, Korth and Sudarshan
Deletion
Delete all instructors
delete from instructor
Delete all instructors from the Finance department
delete from instructor
where dept_name= ’Finance’;
Delete all tuples in the instructor relation for those instructors
associated with a department located in the Watson building.
delete from instructor
where dept name in (select dept name
from department
where building = ’Watson’);
Database System Concepts - 6th Edition
23
©Silberschatz, Korth and Sudarshan
Insertion
Add a new tuple to course
insert into course
values (’CS-437’, ’Database Systems’, ’Comp. Sci.’, 4);
or equivalently
insert into course (course_id, title, dept_name, credits)
values (’CS-437’, ’Database Systems’, ’Comp. Sci.’, 4);
Add a new tuple to student with tot_creds set to null
insert into student
values (’3003’, ’Green’, ’Finance’, null);
Database System Concepts - 6th Edition
24
©Silberschatz, Korth and Sudarshan
Example of B+-Tree
Database System Concepts - 6th Edition
25
©Silberschatz, Korth and Sudarshan
Designing Good Database
Suppose we combine instructor and department into inst_dept
(No connection to relationship set inst_dept)
Result is possible repetition of information
Database System Concepts - 6th Edition
26
©Silberschatz, Korth and Sudarshan
Example of Transactions and Concurrent Access
Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Two people P1 and P2 are using two company debit cards for business
There is $1000 in the company account
P1 is trying to retrieve $500
P2 is trying to retrieve $800
Database System Concepts - 6th Edition
27
©Silberschatz, Korth and Sudarshan
Example of Fund Transfer
Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Atomicity requirement
if the transaction fails after step 3 and before step 6, money will be “lost”
leading to an inconsistent database state
Failure could be due to software or hardware
the system should ensure that updates of a partially executed transaction
are not reflected in the database
Durability requirement — once the user has been notified that the transaction
has completed (i.e., the transfer of the $50 has taken place), the updates to the
database by the transaction must persist even if there are software or
hardware failures.
Database System Concepts - 6th Edition
28
©Silberschatz, Korth and Sudarshan
Example of Fund Transfer (Cont.)
Transaction to transfer $50 from account A to account B:
1.
2.
3.
4.
5.
6.
read(A)
A := A – 50
write(A)
read(B)
B := B + 50
write(B)
Consistency requirement in above example:
the sum of A and B is unchanged by the execution of the transaction
In general, consistency requirements include
Explicitly specified integrity constraints such as primary keys and foreign
keys
Implicit integrity constraints
– e.g. sum of balances of all accounts, minus sum of loan amounts
must equal value of cash-in-hand
A transaction must see a consistent database.
During transaction execution the database may be temporarily inconsistent.
When the transaction completes successfully the database must be
consistent
Erroneous transaction logic can lead to inconsistency
Database System Concepts - 6th Edition
29
©Silberschatz, Korth and Sudarshan
Example of Fund Transfer (Cont.)
Isolation requirement — if between steps 3 and 6, another
transaction T2 is allowed to access the partially updated database, it
will see an inconsistent database (the sum A + B will be less than it
should be).
T1
T2
1. read(A)
2. A := A – 50
3. write(A)
read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B
Isolation can be ensured trivially by running transactions serially
that is, one after the other.
However, executing multiple transactions concurrently has significant
benefits, as we will see later.
Database System Concepts - 6th Edition
30
©Silberschatz, Korth and Sudarshan
ACID Properties
A transaction is a unit of program execution that accesses and possibly
updates various data items.To preserve the integrity of data the database
system must ensure:
Atomicity. Either all operations of the transaction are properly reflected
in the database or none are.
Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
Isolation. Although multiple transactions may execute concurrently,
each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from other
concurrently executed transactions.
That is, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj, finished execution before Ti started, or Tj started execution
after Ti finished.
Durability. After a transaction completes successfully, the changes it
has made to the database persist, even if there are system failures.
Database System Concepts - 6th Edition
31
©Silberschatz, Korth and Sudarshan
Failure Classification
Transaction failure :
Logical errors: transaction cannot complete due to some internal
error condition
System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
System crash: a power failure or other hardware or software failure
causes the system to crash.
Fail-stop assumption: non-volatile storage contents are assumed
to not be corrupted by system crash
Database systems have numerous integrity checks to prevent
corruption of disk data
Disk failure: a head crash or similar disk failure destroys all or part of
disk storage
Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Database System Concepts - 6th Edition
32
©Silberschatz, Korth and Sudarshan
Log-Based Recovery
A log is kept on stable storage.
The log is a sequence of log records, and maintains a record of
update activities on the database.
When transaction Ti starts, it registers itself by writing a
<Ti start>log record
Before Ti executes write(X), a log record
<Ti, X, V1, V2>
is written, where V1 is the value of X before the write (the old value),
and V2 is the value to be written to X (the new value).
When Ti finishes it last statement, the log record <Ti commit> is written.
Two approaches using logs
Deferred database modification
Immediate database modification
At the moment, Assume serial execution of Transactions T0, T1, T3…
Database System Concepts - 6th Edition
33
©Silberschatz, Korth and Sudarshan
JDBC and ODBC
API (application-program interface) for a program to interact with a
database server
Application makes calls to
Connect with the database server
Send SQL commands to the database server
Fetch tuples of result one-by-one into program variables
ODBC (Open Database Connectivity) works with C, C++, C#, and
Visual Basic
Other API’s such as ADO.NET sit on top of ODBC
JDBC (Java Database Connectivity) works with Java
Database System Concepts - 6th Edition
34
©Silberschatz, Korth and Sudarshan
Summary
DBMS
Relational databases
Database System Concepts - 6th Edition
35
©Silberschatz, Korth and Sudarshan