Transcript UNIT01-DBMS

www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
1
DATABASE MANAGEMENT SYSTEMS
2
Contents
• Introduction to DBMS
• DBMS Users
• DBMS architecture
• Data Models
• ER Modeling
• Relational Model
• SQL
• Normalization
www.BookSpar.com | Website for Students | VTU NOTES -QUESTION PAPERS
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
3
What Is a DBMS?
• Database : A very large, integrated collection of data.
• A Database Management System (DBMS) is a
software package designed to store and manage
databases.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Database System Applications
4
• Database Applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
• Databases touch all aspects of our lives
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
5
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
6
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
7
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
8
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
9
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Purpose of Database Systems
10
• In the early days, database applications were built
directly on top of file systems
• Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
• Multiple file formats, duplication of information in
different files
• Difficulty in accessing data
• Need to write a new program to carry out each new
task
• Data isolation — multiple files and formats
• Integrity problems
• Integrity constraints (e.g. account balance > 0)
become “buried” in program code rather than being
stated explicitly
• Hard to add new constraints or change existing ones
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Purpose of Database Systems (Cont.)
11
• Drawbacks of using file systems (cont.)
• Atomicity of updates
• Failures may leave database in an inconsistent state with
partial updates carried out
• Example: Transfer of funds from one account to another
should either complete or not happen at all
• Concurrent access by multiple users
• Concurrent accessed needed for performance
• Uncontrolled concurrent accesses can lead to
inconsistencies
• Example: Two people reading a balance and updating it at the same
time
• Security problems
• Hard to provide user access to some, but not all, data
• Database systems offer solutions to all the above problems
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Why Use a DBMS?
• Data independence and efficient access.
• Reduced application development time.
• Data integrity and security.
• Uniform data administration.
• Concurrent access, recovery from crashes.
12
Database Users
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
13
Users are differentiated by the way they expect to interact with
the system
• Application programmers – interact with system through
DML calls
• Sophisticated users – form requests in a database query
language
• Naïve users – invoke one of the permanent application
programs that have been written previously
• Examples, people accessing database over the web, bank
tellers, clerical staff
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Database Administrator
14
• Coordinates all the activities of the database system
• has a good understanding of the enterprise’s
information resources and needs.
• Database administrator's duties include:
• Storage structure and access method definition
• Schema and physical organization modification
• Granting users authority to access the database
• Backing up data
• Monitoring performance and responding to changes
View of Data
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
An architecture for a database system
15
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Levels of Abstraction
16
• Physical level: describes how a record (e.g., customer) is
stored.
• Logical level: describes data stored in database, and the
relationships among the data.
type customer = record
customer_id : string;
customer_name : string;
customer_street : string;
customer_city : string;
end;
• View level: application programs hide details of data types.
Views can also hide information (such as an employee’s
salary) for security purposes.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Instances and Schemas
17
• Similar to types and variables in programming
languages
• Schema – the logical structure of the database
• Example: The database consists of information
about a set of customers and accounts and the
relationship between them)
• Analogous to type information of a variable in a
program
• Physical schema: database design at the
physical level
• Logical schema: database design at the logical
level
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Instances and Schemas
18
• Physical Data Independence – the ability to
modify the physical schema without changing
the logical schema
• Applications depend on the logical schema
• In general, the interfaces between the various
levels and components should be well defined
so that changes in some parts do not seriously
influence others.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
19
Data Independence
• Applications insulated from how data is structured
and stored.
• Logical data independence: Protection from changes
in logical structure of data.
• Physical data independence: Protection from
changes in physical structure of data.
 One of the most important benefits of using a DBMS!
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
History of Database Systems
20
• 1950s and early 1960s:
• Data processing using magnetic tapes for storage
• Tapes provide only sequential access
• Punched cards for input
• Late 1960s and 1970s:
• Hard disks allow direct access to data
• Network and hierarchical data models in widespread use
• Ted Codd defines the relational data model
• Would win the ACM Turing Award for this work
• IBM Research begins System R prototype
• UC Berkeley begins Ingres prototype
• High-performance (for the era) transaction processing
History (cont.)
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
21
• 1980s:
• Research relational prototypes evolve into commercial
systems
• SQL becomes industry standard
• Parallel and distributed database systems
• Object-oriented database systems
• 1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
• 2000s:
• XML and XQuery standards
• Automated database administration
• Increasing use of highly parallel database systems
• Web-scale distributed data storage systems
Data Models
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
22
• A data model is a collection of concepts for
describing data.
• Relational model
• Entity-Relationship data model (mainly for
database design)
• Object-based data models (Object-oriented and
Object-relational)
• Semi structured data model (XML)
• Other older models:
– Network model
– Hierarchical model
The relational model of data is the most widely
used model today.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
23
• A database can be modeled as:
• a collection of entities,
• relationship among entities.
Database Design- Conceptual design:
• An entity is an object that exists and is
(ERdistinguishable
Model)
from other objects.
• Example: specific person, company, event, plant
• Entities have attributes
• Example: people have names and addresses
• An entity set is a set of entities of the same type that
share the same properties.
• Example: set of all persons, companies, trees,
holidays
• Relationship: Association among two or more
entities. E.g., Aisha works in Pharmacy department
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
ER-diagram
name
ssn
pname
lot
Employees
Dependents
Covers
Policies
policyid
age
cost
24
Relational Model
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
• Example of tabular data in the relational model
25
Attributes
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
A Sample Relational Database
26
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
27
Logical DB Design: ER to Relational
• Entity sets to tables:
SSN
ssn
name
Employees
lot
Name
lot
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
28
INTRODUCTION TO SCHEMA REFINEMENT
Problems Caused by Redundancy
• Storing the same information redundantly, that is, in
more than one place within a database, can lead to
several problems:
• Redundant storage: Some information is stored
repeatedly.
• Update anomalies: If one copy of such repeated data is
updated, an inconsistency
is created unless all copies are similarly updated.
• Insertion anomalies: It may not be possible to store
some information unless
some other information is stored as well.
• Deletion anomalies: It may not be possible to delete
some information
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
29
Bad DB Design to good DB Design
• Informal method
• Informal design guidelines
• Formal method
- Normalization
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
30
Normal Forms
• If a relation is in a certain normal form (BCNF,
3NF etc.), it is known that certain kinds of
problems are avoided/minimized. This can be
used to help us decide whether decomposing
the relation will help.
•
FDs
are used to detect redundancy

www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
31
Example decomposition(simple)
• Bad database design
Eno
Ename Salary
Add
Bdate Dno
• Good database design
Eno
Dno
Ename Salary
Add Bdate Dno
Dname Mngname
Dname Mngname
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Data Definition Language (DDL)
32
• Specification notation for defining the database schema
Example:
create table account (
account_number char(10),
branch_name
char(10),
balance
integer)
• DDL compiler generates a set of tables stored in a data
dictionary
• Data dictionary contains metadata (i.e., data about data)
• Database schema
• Data storage and definition language
• Specifies the storage structure and access methods used
• Integrity constraints
• Domain constraints
• Referential integrity (e.g. branch_name must correspond
to a valid branch in the branch table)
• Authorization
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
33
Data Manipulation Language (DML)
• Two classes of languages
• Procedural – user specifies what data is required and
how to get those data
• Declarative (nonprocedural) – user specifies what
data is required without specifying how to get those
data
• SQL is the most widely used query language
SQL
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
34
• SQL: widely used non-procedural language
• Example: Find the name of the customer with customer-
id 192
select customer_name
from customer
where customer_id = 192
• Example: Find the balances of all accounts held by the
customer with customer-id 192
select balance
from depositor, account
where customer_id = 192 and
depositor.account_number = account.account_number
SQL
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
35
• Application programs generally access databases
through one of
• Language extensions to allow embedded SQL
• Application program interface (e.g., ODBC/JDBC)
which allow SQL queries to be sent to a database
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Query Processing
1.Parsing and translation
2.
3.
Optimization
Evaluation
36
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Query Processing (Cont.)
37
• Alternative ways of evaluating a given query
• Equivalent expressions
• Different algorithms for each operation
• Cost difference between a good and a bad way of evaluating
a query can be enormous
• Need to estimate the cost of operations
• Depends critically on statistical information about relations
which the database must maintain
• Need to estimate statistics for intermediate results to
compute cost of complex expressions
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Transaction Management
38
• A transaction is a collection of operations that
performs a single logical function in a database
application
• Transaction-management component ensures
that the database remains in a consistent
(correct) state despite system failures (e.g.,
power failures and operating system crashes)
and transaction failures.
• Concurrency-control manager controls the
interaction among the concurrent transactions,
to ensure the consistency of the database.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Transaction Concept
39
• A transaction is a unit of program execution that
accesses and possibly updates various data items.
• E.g. transaction to transfer $50 from account A to
account B:
1.
2.
3.
4.
5.
6.
read(A)
A := A – 50
write(A)
read(B)
B := B + 50
write(B)
• Two main issues to deal with:
• Failures of various kinds, such as hardware failures
and system crashes
• Concurrent execution of multiple transactions
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Example of Fund Transfer
40
• Transaction to transfer $50 from account A to account B:
1.read(A)
2.A := A – 50
3.write(A)
4.read(B)
5.B := B + 50
6.write(B)
• Atomicity requirement
• if the transaction fails after step 3 and before step 6, money will be “lost”
leading to an inconsistent database state
• Failure could be due to software or hardware
• the system should ensure that updates of a partially executed transaction are
not reflected in the database
• Durability requirement — once the user has been notified that the transaction
has completed (i.e., the transfer of the $50 has taken place), the updates to the
database by the transaction must persist even if there are software or hardware
failures.
41
• Transaction to transfer $50 from account A to account B:
1.read(A)
2.A := A – 50
3.write(A)
4.read(B)
5.B := B + 50
6.write(B)
Example of Fund Transfer (contd..)
• Consistency requirement in above example:
• the sum of A and B is unchanged by the execution of the transaction
• A transaction must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully the database must be consistent
• Erroneous transaction logic can lead to inconsistency
www.BookSpar.com | Website for
Students | VTU NOTES QUESTION PAPERS
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Example of Fund Transfer (Cont.)
42
• Isolation requirement — if between steps 3 and 6, another transaction T2 is
allowed to access the partially updated database, it will see an inconsistent
database (the sum A + B will be less than it should be).
T1
T2
1.
read(A)
2.
A := A – 50
3.
write(A)
read(A), read(B), print(A+B)
4.
read(B)
5.
B := B + 50
6.
write(B
• Isolation can be ensured trivially by running transactions serially
• that is, one after the other.
• However, executing multiple transactions concurrently has significant
benefits.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
ACID Properties
43
A transaction is a unit of program execution that accesses and possibly
updates various data items.To preserve the integrity of data the database
system must ensure:
• Atomicity. Either all operations of the transaction are properly reflected in the
database or none are.
• Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
• Isolation. Although multiple transactions may execute concurrently, each
transaction must be unaware of other concurrently executing transactions.
Intermediate transaction results must be hidden from other concurrently
executed transactions.
• That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj,
finished execution before Ti started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it has
made to the database persist, even if there are system failures.
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Transaction State
44
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Concurrent Executions
45
• Multiple transactions are allowed to run concurrently in the system.
Advantages are:
• increased processor and disk utilization, leading to better
transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
• reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve ACID
properties with concurrency.
• Concurrency control protocols are used in DBMSs
to control concurrency
www.BookSpar.com | Website for Students | VTU
NOTES -QUESTION PAPERS
Overall System Structure
46