L1 Introduction

Download Report

Transcript L1 Introduction

DBMS
DATA BASE
MANAGEMENT
SYSTEM
1
CS_502/BCA-401/INFOSYS
Faculty
• BBS: Bidyut Biman Sarkar
• Available at: Room # 305
• Contact hours: 4 classes per week
2
CS 502(Syllabus)
DATABASE MANAGEMENT SYSTEM
Introduction
•Concept & Overview of DBMS, Data
Models, Database Languages, Database
Administrator, Database Users, Three
Schema architecture of DBMS
Entity-Relationship Model
•Basic concepts, Design Issues, Mapping
Constraints, Keys, Entity-Relationship
Diagram, Weak Entity Sets, Extended E-R
features.
3
CS 502(Syllabus)
DATABASE MANAGEMENT SYSTEM
Relational Model
• Structure of relational Databases, Relational Algebra,
Relational
• Calculus, Extended Relational Algebra Operations,
Views,
• Modifications Of the Database.
SQL and Integrity Constraints
• Concept of DDL, DML, DCL. Basic Structure, Set
operations, Aggregate Functions, Null Values, Domain
constraints, Referential Integrity Constraints, assertions,
views, Nested
• Sub queries, Database security application development
using SQL, Stored procedures and triggers.
4
CS 502(Syllabus)
DATABASE MANAGEMENT SYSTEM
Relational Database Design
•Functional Dependency, Different anomalies in
designing a Database, Normalization using functional
dependencies,Decomposition, Boyce-Codd Normal Form,
3NF, Nomalization using multi-valued dependencies, 4NF,
5NF
Internals of RDBMS
Physical data structures, Query optimization: join
algorithm, statistics and cost bas optimization.
Transaction processing, Concurrency control and
Recovery Management : transaction model properties,
state serializability, lock base protocols, two phase
locking.
5
CS 502(Syllabus)
DATABASE MANAGEMENT SYSTEM
File Organization & Index Structures
• File & Record Concept, Placing file
records on Disk, Fixed and Variable sized
Records
• Types of Single-Level Index (primary,
• secondary, clustering), Multilevel Indexes,
Dynamic Multilevel Indexes using B tree
and B+ tree.
6
Reference Books
• 1. Database System Concepts (Mc Graw
Hill) By: Korth & Sudarshan
• 2.Fundamentals of Database Systems
(Pearson) BY:ELMASRI NAVATHE
• 3. An Introduction to Database Systems
(Galgotia) By :Bipin C. Deasi
7
INTRODUCTION
1.
2.
3.
4.
5.
6.
Concept & Overview of DBMS
Data Models
Database Language
Database Administrator
Database Users
Three Schema Architecture of DBMS
8
Concept & Overview of DBMS
9
Data base Properties
1.A data base is designed and built with data for a
specific purpose with an intended group of users
2.A database is a logically consistent collection of
data with some inherent meaning
3. On a database operations like Insert, Update,
Delete etc. can be carried out following a specific
procedure. i.e. for the above operations , some
procedural language should be supported
4. Data in the data base should be persistent,
logically connected ,shared , integrated and it
should maintain some structure
10
Functions of the DBMS SYSTEMS
•Data Modeling: Structure definition for data storage
•Query processing: Providing the mechanism for data
manipulation
•Concurrency Control: Ensures correctness of the
data during concurrent access by multiple users
•Information Security: Protection against theft and
unauthorized accesses.
•Crash Recovery: Safety of information in the
database.
11
Classes of Database Users
•Database Administrator (DBA)
•Database Designers
•End-Users
–Sophisticated
–Specialized
--Naïve
•Application Programmers
Design and implement application systems and stored
procedures. They interact with the system through the
DML calls embedded in a program written in a language
like C or C++. The DML pre compiler compiles and
generate object code.
12
Classes of Database Users
(DBA)
Is responsible for managing the database system.
DBA are responsible for authorizing access to the
database, coordinating and monitoring its use and
acquiring software and hardware resources
according to requirement.
Database Designers
Responsible for designing the database and
choosing appropriate structures to represent and
store data. The tasks are performed before the
database is actually built. The responsibilities of
the designer is to define the tasks of the DBA as
mentioned above
13
Classes of Database Users
End-Users
The persons use the database for querying,
updating and generating reports are referred
to as end-users. The different types of endusers are:
Sophisticated: Can accomplish tasks on their
own, Write SQL and interact with DML.
Specialized: Develop application programs
and interact with complex data systems.
Naive: Interact with the system only through
the supplied interface.
14
Data Model
Data Model
Conceptual
Physical
Object-Based
E-R ObjectModel Oriented
Model
Binary Semantic Functional
Model Data Model Data Model
Record-Based
Hierarchical
Model
Network
Model
Relational
Model
15
Data Model
• Object-Based Logical Models
• This model describes data at the
conceptual and view levels. Some of them
are::
• Entity-Relationship (E-R) model
• Object-Oriented model
16
Record-Based Models
Used to specify the overall logical structure of the
database. The data integrity constraints can’t be
specified explicitly with these models. The three
widely accepted data models are:
– Hierarchical Model
– Network Model
– Relational Model
These models are called record-based due to its
fixed-format. It can contain varieties of record
types with a set of attributes of fixed size. Separate
languages are available to store and retrieve
records in the respective data models.
17
Hierarchical Model
Hierarchical data model are record based and related
to each other in a hierarchy of many to one
relationship. Hierarchical database maintains a parent
child relation The hierarchical database consists of a
collection of hierarchical trees. The ordering of the
tree is according to the hierarchical structure diagram.
A many-to-many relationship between record types
cannot be expressed directly in the hierarchical
model. Such many to many relation-ship can be
expressed by using data replication or virtual records.
For example, the hierarchical representation of the
hospital database
18
Hierarchical Database
Hospital
Doctor
Ward
Patient
Outpatient
Doctor
Patient
Inpatient
Doctor
OT
Hospital database
19
Network Model
The network approach allows to model
many-to-many relationships more directly
than the hierarchical model. Two basic data
structures in the model: records and sets.
Data is stored in records. Each record
consists of a group of related data values.
Records are classified into record types.
Record types describe the structure of a
group of records that store the same type of
information. Each record type contains a
name and format for each data item.
20
Record type in a Relational model
EMPLOYEE
EMP-NO
NAME
DEPARTMENT
DOB
Data item Name
Format
EMP-NO
Character 10
NAME
Character 30
DEPT
Character 10
DOB
Date
CONTACT_NO
Character 10
CONTACT_NO
A record type student
21
Set Types and their Basic properties
A set type is a description of a 1: N relationship
between two record types . Each set type definition
consists of three basic elements:
1. A name of the set type
2. An owner record type
3. A member record type
In the database there will be many set occurrences
or set instances corresponding to a set type. Set
occurrences is composed of one owner record for
the owner record type and a number of related
member records . The stored representations of set
instances are either linked lists or doubly linked list
representation .
22
A stored representation of set instances
Information
Technology
Joy
Bjoy
Ajoy
A stored representation of set instances
For each set type an additional OWNER pointer is included in the member
record type. The owner pointer points directly to the owner record of the set.
23
Relational Model
The founder of the relational model was E. F. Codd of
IBM. The draft of the model was published in 1972.
There is no requirement of rigid structure on the data.
The familiar structure of files and records are viewed as
tables/relations and tuples /records respectively.
A relation has the following properties:
•The columns of a table are of the same kind (columns
are homogeneous)
•Each item has a simple value (Number/string)
•All rows of a table are distinct
•The ordering of rows within a table is immaterial
•The columns of a table are assigned distinct names
and the ordering of these column is immaterial.
24
CODD Commandments
Any truly relational database must be manageable through its
relational capabilities. DBMS can be regarded as relational
only if it obeys the following 12 rules:
Rules
1. Information rule: All information is explicitly and logically
represented in exactly one way-by data values in tables
2. Guaranteed access rule: All data items must be logically
addressable by resorting to a combination of table name,
primary key value and column name
3. Systematic treatment of null values: The null values are
supported in the representation of missing and inapplicable
information. This support for null values must be consistent
throughout the DBMS and independent of data type
25
4. The data base description rule: There must be a data
dictionary within the RDBMS that is constructed of tables
and/or views that can be examined using SQL.
5. The comprehensive sub-language rule: There must
be at least one language whose statements can be
expressed as character strings in a specific format to
support the following:
1. Data definition
2. View definition
3. Data manipulation
4. Integrity constraints
5. Authorization
6. Transaction boundaries
All the above implies that DBMS is manageable through its
own language namely SQL.
26
6. The view updating rule: All views can be updated in
theory, can be updated by the system
7. The insert and update rule: The data base should be
capable of inserting, updating and deleting data as a
relational set
8. The physical independence rule: Indicates that the
physical storage structures or devices used for storing the
data could be changed without changing the conceptual
view or any of the external views. The change would be
absorbed by the mapping between the conceptual and
internal levels
9. The logical data independence rule: Indicates that the
conceptual schema can be changed without affecting the
existing external schemas. The change would be absorbed
by the mapping between the external and conceptual levels
27
Codd Rules continuation
10. Integrity rules:
10.1 Entity Integrity: If the attribute A of
relation R is a prime attribute of R, then A cannot
accept null values.
10.2 Referential Integrity: Applies to inter
table reference. The attributes of a relation that are
those of the primary key of another relation.
Example: Given two relations R and S , suppose
R refers to the relation S via set of attributes that
forms the primary key of S and this set of
attributes forms a foreign key in R. then the value
of the foreign key in a tuple R must either be equal
to the primary key of a tuple of S or be entirely null.
28
Referential Integrity
The foreign key is used as surrogate for another
entity, the rule enforces the existence of a tuple for
the relation corresponding to the instance of the
referred entity.
During deletion of a tuple there can be two distinct
possibilities to maintain the database integrity.
1. All tuples of R being attempted to be deleted
and having relation in S, should also be deleted.
Referred to as cascading deletion or dominos
effect.
2. All tuples of R being attempted to be deleted
and having no relation in S, can only be deleted.
29
11. Distribution rule: A database must be
distribution independent. It implies that
application programs and terminal
activities remains unaffected.
1. When data distribution is first
introduced
2. When data is redistributed
12. No subversion rule: If an RDBMS
supports a lower level language that
permits for example, row-at-a-time
processing, then this language must not
be able to bypass any integrity rules or
constraints of the relational language.
30
Database Language
•DBMS provides one or more specialized programming
languages called Database Languages. A language
called Structured Query Language (SQL) has been
taken as the standard database language.
•Depending on the three different abstraction levels of
the database management system different languages
are available: The three abstraction levels are namely
external, conceptual and physical. To describe and
manipulate the data in the database the two different
languages are:
•Data Description Languages (DDL)
•Data Manipulation Languages (DML)
31
Database Administrator
The database administrator is a person o a group of person
having total control over data and programs accessing the
data. DBA is responsible for the following tasks to implement:
Schema definition: Data definition language compiler to set
tables that is stored permanently in the data dictionary
Storage structure and access method definition: Stores by
data and structure definition language compiler
Schema and physical-organization modification:
Modification of data dictionary
Granting of authorization for data access: Extensive use of
the Grant and Revoke commands
Designing security and integrity constraint specification:
With the help of the Security and Integrity commands set of
rules can be framed and stored in the data dictionary to protect
the database.
32
Three Schema architecture of DBMS
User 1
User 2
User 3
Mapping supplied by DBMS
CONCEPTUAL LEVEL
(Define by DBA)
DBA
INTERNAL LEVEL
(DBA defined for
optimization)
The three levels of the architecture
33
Three Levels of the Architecture
Let us introduce the architecture for a
database system. The aim is to present the
framework to build our concepts on DBMS.
• The internal level is the one closest to
physical storage—i.e., it is the one concerned
with the way the data is physically stored.
• The conceptual level is a "level of
indirection" between the other two.
• The external level is the one closest to the
users i.e., it is the one concerned with the way
the data is viewed by individual users.
34
Data at three levels
Data at three levels
User 2
User 1
Employee No
Employee Name
Employee Name
Employee Address
Employee Address
Department
Designation
Logical record 1
Logical record 2
Employee Name
:
string
Employee Address : string
Employee No
:
Integer
Department
:
string
Designation
:
string
Conceptual Record
DBA
Internal View
Name: String length 30
Address: String length 60
Employee no: 9 dec.
Department: String length 30
Designation: String length 3035
Data independence
There are two levels of data independence:
Physical data independence is the ability to modify the
physical schema without causing application programs to be
rewritten. Modifications at the physical level are occasionally
necessary to improve performance.
Logical data independence is the ability to modify the
logical schema without causing application programs to be
rewritten. Modifications at the logical level are necessary
whenever the logical structure of the database is altered (for
example, when money-market accounts are added to a
banking system).
Logical data independence is more difficult to achieve than is
physical data independence, since application programs are
heavily dependent on the logical structure of the data that
they access.
36
Data abstraction
View level
View 1
View 2
View 3
Logical
level
Physical
level
The three levels of data abstraction
Many database-systems users are not computers trained, so
developers hide the complexity from users through several
levels of abstraction, to simplify users interactions with the
system data abstraction concept is introduced.
37
Data abstraction
Physical level: - The lowest level of abstraction describes how
the data are actually stored. At the physical level, complex lowlevel data structures are described in detail.
Logical level: - The next-higher level of abstraction describes
what data are stored in the database, and what relationship
exists among those data. The entire database is thus described
in terms of a small number of relatively simple structures.
Although implementation of the simple structures at the logical
level may involve complex physical level structures, the user of
the logical level does not need to be aware of this complexity.
The logical level of abstraction is used by database
administrators, who must decide what information is to be kept
in the database.
View level: - The highest level of abstraction describes only part
of the entire database. Despite the use of simpler structures at
the logical level, some complexity remains, because of the large
size of the database. Many users of the database system will not
be concerned with all this information. Instead, such users need
to access only a part of the database. So that their interaction
with the system is simplified, the view level of abstraction is 38
1.
State True/False
1. A data base is a logically consistent collection of data
2. Insert and delete are data base operations
3. Nave users interact with the system only through the supplied interface
4. E_R is a data modeling tool
5. Object Model supports only complex objects
6. Network Model observe parent child relations
7. Sets are used in Hierarchical model as Record types
8. DML is used for all the security and control procedures
9. The external level is the closest to the user
10. DCL is used for updating database records
11. Entities can either exist independently or be related to each other
12. By building a model entities in a business process are transformed
into tables using a DBMS
13. Strong entities will transform into master tables
39
2. Match the blanks:
i)Structure definition for data storage is called ----------ii) --------provides the mechanism for data manipulation
iii) Loss of information in the data base can be protected by using ---------iv) Protection against theft and unauthorized accesses is known as ------------------v)---------------- ensures correctness of the data during concurrent access by multiple users.
vi) -------------- If the attribute A of relation R is a prime attribute of R, then A cannot accept null
values.
vii) A database must be distribution ----------------viii) DBA is responsible for the following tasks to implement -------------------ix) A language called------------------ has been taken as the standard database language
x) With the help of the ----------- and ------------ commands set of rules can be framed and stored in
the -----------------to protect the database.
OPTIONS
:i) Data Modeling ,ii) Entity Integrity iii) Query processing , iv) Independent , v) Crash Recovery ,
vi) Information Security, vii) Schema definition, viii) Concurrency Control
40
ix) Structured Query Language (SQL) , x) Data Dictionary , xi ) Security xii) Integrity
Model Answer
Q1
1 2 3 4 5 6 7 8 9 10 11 12 13
T T T T F F F F T F T T T
Q2
i
Data Modeling
vi
Entity Integrity
ii
iii
Query
Crash
processing
Recovery
vii
independent
viii
Schema
definition
iv
v
Information
Security
Concurrency
ix
Structured
Query
Language
(SQL)
x
xi ) Security
xii) Integrity
x) Data Dictionary ,
Control
41