Database Management Systems - The Institute of Finance

Download Report

Transcript Database Management Systems - The Institute of Finance

Database Management
Systems (DBMS)
Module 1: Introduction
Contents






Historical Perspective DBMS
Purpose of database systems
Data abstraction
Data models
Instances and schemes
Data independence
Contents






Data Definition Language (DDL)
Data Manipulation language (DML)
Database manager
Database administrator
Database users
Overall system structure
What is DBMS

A database management system
(DBMS), or simply a database
system (DBS), consists of


A collection of interrelated and persistent
data (usually referred to as the database
(DB)).
A set of application programs used to
access, update and manage that data
(which form the data management system
(MS)).
What is DBMS

The goal of a DBMS is to provide
an environment that is both
convenient and efficient to use
in:
Retrieving information from the
database.
 Storing information into the
database.

What is DBMS

Databases are usually designed to
manage large bodies of
information. This involves

Definition of structures for
information storage (data modeling).
What is DBMS



Provision of mechanisms for the
manipulation of information (file and
systems structure, query processing).
Providing for the safety of information in
the database (crash recovery and security).
Concurrency control if the system is shared
by users.
1. Historical Perspective



From the earliest days of computers, storing
and manipulating data have been a major
application focus.
The first general-purpose DBMS was designed
by Charles Bachman in the early 1960’s and
was called the Integrated Data Store.
It formed the basis for the network data
model which was standardised by the
Conference on Data System Language
(CODASYL)
1. Historical Perspective


In the late 1960s, IBM developed the
Information Management System (IMS)
DBMS, which formed the basis for an
alternative data representation framework
called the hierarchical data model.
The SABRE system for making airline
reservations was jointly developed by IBM
and America Airlines which allowed several
people to access the same data through
computer network.
1. Historical Perspective



In 1970 Edgar Codd at IBM came up
with new data representation
framework called relation data model.
The SQL query language for relational
database developed in 1980s became
the part of relational databases.
In the late 1980s and 1990s advances
have been made in many areas of DB
systems and allow complex analysis of
data from all parts of firms.
1. Historical Perspective


Several vendors (e.g. IBM, Oracle,
DB2,Microsoft, and Informix UDS) have
extended their system to enable storage
of images, and video data types.
Specialised systems have been
developed by numerous vendors for
creating data warehouses consolidating
data from several databases and for
undertaking analysis.
1. Historical Perspective


An interesting phenomenon is the
emergence of Enterprise Resource
Planning (ERP) which add a layer of
application-oriented features on top of
DBMS.
Now interestingly DBMS are part of
Internet Techs where it is used to store
data which can be accessed via Web.
1. Historical Perspective



Nowadays through DBMS more data is
brought online.
Now we have multimedia databases,
interactive video, streamlining data,
digital libraries, google earth which
store and process spatial data, etc.
Also DBMS is the part of E-Commerce
apps such as Amazon.com and EBay.com.
Summary: History of Data
Management







Manual Record (4000 BC – AD 1900).
Punched Card Record (1900 - 55).
Programmed Record (1955 - 1970).
On – Line Network Data Management
(1965 - 80).
Relational Data Management.
Multimedia Databases.
Web-Based System.
2. Purpose of database
systems

The purpose of database management
system can be derived from the
drawbacks of file processing system
which was supported by operating
system.
2. Purpose of database
systems

Using that system there are problems
which are:
Data redundancy and inconsistency


Same information may be duplicated in several
places.
All copies may not be updated properly.
2. Purpose of database
systems

Difficulty in accessing data



May have to write a new application program to
satisfy an unusual request.
E.g. find all customers with the same postal
code.
Data isolation



Data in different files.
Data in different formats.
Difficult to write new application programs.
2. Purpose of database
systems

Multiple users



Want concurrency for faster response time.
Need protection for concurrent updates.
E.g. two customers withdrawing funds
from the same account at the same time
account has $500 in it, and they withdraw
$100 and $50. The result could be $350,
$400 or $450 if no protection.
2. Purpose of database
systems

Security problems



Every user of the system should be able to
access only the data they are permitted to
see.
E.g. payroll people only handle employee
records, and cannot see customer
accounts; tellers only access account data
and cannot see payroll data.
Difficult to enforce this with application
programs.
2. Purpose of database
systems

Integrity problems
Data may be required to satisfy
constraints.
 E.g. no account balance below
$25.00.
 Again, difficult to enforce or to
change constraints with the leprocessing approach.

2. Purpose of database
systems

Therefore the these problems of earlier
file processing approach and others led
to the development of database
management systems.
2. Purpose of database
systems: Summary

Therefore the purpose of database
management system is to:
Keep information updated
 Keep information accurate
 Keep information organised

3. Data abstraction



This describes level of abstractions in
DBMS. The system hides certain details
of how data is stored and maintained.
In this complexity data should be
hidden from database users.
The data in DBMS is described I three
levels of abstraction
3. Data abstraction

The database description consists of a
schema at each of these three levels of
abstraction:



Conceptual Schema
Physical Schema
External Schema
Physical Level

This describe:



How the data are stored. E.g. index, Btree, hashing.
It is the lowest level of abstraction.
Also it is the complex low-level
structures described in detail.
Conceptual Schema/Level


This is the next highest level of
abstraction.
It describes:



what data are stored.
relationships among data.
It is the database administrator level.
Conceptual Schema/Level
External schemas

The three levels of data abstraction
View Level



It is the highest level of abstraction.
It describes part of the database for a
particular group of users.
Can be many different views of a
database. E.g. tellers in a bank get a
view of customer accounts, but not of
payroll data.
4. Data Models


According to (Ramakrshnan and
Gehrke, 2008) a data model is
collection of high-level data description
constructs that hide many low-level
storage details.
A DBMS allows a user to define the data
to be stored in terms of data model.
3. Data Models


Simply a data model is a collection of
conceptual tools used for describing
data, data relationships, data semantics
and data constraints.
Most DBM today are based on the
relational data model.
Types of Data Model

There are number of data models
existing, such as hierarchical, network,
and Semantic, however these can be
grouped into three major data models:


Object-based Logical Models.
Record-based Logical Models.
Physical Data Models.
Object-based Logical Models

Object-based logical models:




Describe data at the conceptual and view
levels.
Provide fairly flexible structuring
capabilities.
Allow one to specify data constraints
explicitly.
Over 30 such models, including
Object-based Logical Models

These can be sectioned into:







Entity-relationship model.
Object-oriented model.
Binary model.
Semantic data model.
Infological model.
Functional data model.
Our focus will be on the first two
models
Object Based Logical Model:
E-R Model


The entity-relationship model is based
on a perception of the world as
consisting of a collection of basic
objects (entities) and relationships
among these objects.
The overall logical structure of a
database can be expressed graphically
by an E-R diagram
Object Based Logical Model:
E-R Model




An entity is a distinguishable object that
exists.
Each entity has associated with it a set
of attributes describing it.
E.g. number and balance for an
account entity.
A relationship is an association among
several entities.
Object Based Logical Model:
E-R Model



e.g. A customer acct relationship associates a
customer with each account he or she has.
The set of all entities or relationships of the
same type is called the entity set or
relationship set.
Another essential element of the E-R diagram
is the mapping cardinalities, which express
the number of entities to which another entity
can be associated via a relationship set.
Object Based Logical Model:
E-R Model

The overall logical structure of a
database can be expressed graphically
by an E-R diagram:




rectangles: represent entity sets.
ellipses: represent attributes.
diamonds: represent relationships among
entity sets.
lines: link attributes to entity sets and
entity sets to relationships.
Object Based Logical Model:
E-R Model

A sample E-R diagram
Object Based Logical Model:
Object-Oriented Model

The object-oriented model is based on
a collection of objects, like the E-R
model.


An object contains values stored in
instance variables within the object.
Unlike the record-oriented models, these
values are themselves objects.
Object Based Logical Model:
Object-Oriented Model





An object also contains bodies of code that
operate on the the object.
These bodies of code are called methods.
Objects that contain the same types of values and
the same methods are grouped into classes.
A class may be viewed as a type denition for
objects.
Analogy: the programming language concept of an
abstract data type.
Object Based Logical Model:
Object-Oriented Model




The only way in which one object can
access the data of another object is by
invoking the method of that other object.
This is called sending a message to the
object.
Internal parts of the object, the instance
variables and method code, are not visible
externally.
Result is two levels of data abstraction.
Object Based Logical Model:
Object-Oriented Model

For example, consider an object representing
a bank account.




The object contains instance variables number and
balance.
The object contains a method pay-interest which
adds interest to the balance.
Under most data models, changing the interest
rate entails changing code in application
programs.
In the object-oriented model, this only entails a
change within the pay-interest method.
Object Based Logical Model:
Object-Oriented Model

Unlike entities in the E-R model, each
object has its own unique identity,
independent of the values it contains:


Two objects containing the same values
are distinct.
Distinction is maintained in physical level
by assigning distinct object identiers.
Record-based Logical Models


Also describe data at the conceptual
and view levels.
Unlike object-oriented models, are used
to


Specify overall logical structure of the
database, and
Provide a higher-level description of the
implementation.
Record-based Logical Models




Named so because the database is structured
in xed-format records of several types.
Each record type defines a fixed number of
fields, or attributes.
Each field is usually of a fixed length (this
simplifies the implementation).
Record-based models do not include a
mechanism for direct representation of code
in the database.
Record-based Logical Models




Separate languages associated with the
model are used to express database queries
and updates.
The three most widely-accepted models are
the relational, network, and hierarchical.
This course will concentrate on the relational
model.
The network and hierarchical models are
covered in appendices in the text.
Record-based Logical Models:
Relational Model



The central data description construct in this
model is a relation, which can be thought of
as a set of records.
A description of data is called a schema.
In the relational model the schema for a
relation specifies its name, the name for each
field (or attribute or column), and the type
for each field.
Record-based Logical Models:
Relational Model




A relation can be thought as a table.
Data and relationships are represented
by a collection of tables.
Each table has a number of columns
with unique names, e.g. customer,
account.
Consider the figure below:
Record-based Logical Models:
Relational Model

A sample relational database
Record-based Logical Models:
Relational Model

E.g. Student information in a university
database may be stored in a relation
with the following schema:
Students (sid: string, login: string, age: integer,
gpa: real)
Record-based Logical Models:
The Network Model



Data are represented by collections of
records.
Relationships among data are
represented by links.
Organization is that of an arbitrary
graph
Record-based Logical Models:
The Network Model

Figure shows a sample network database that is the
equivalent of the relational database
Record-based Logical Models:
The Hierarchical Model


Similar to the network model.
Organization of the records is as a
collection of trees, rather than arbitrary
graphs.
Record-based Logical Models:
The Hierarchical Model

A sample hierarchical database
Physical Data Models


Are used to describe data at the lowest
level.
Very few models, e.g.



Unifying model.
Frame memory.
We will not cover physical models.
Instances and Schemes



Databases change over time.
The information in a database at a
particular point in time is called an
instance of the database.
The overall design of the database is
called the database scheme.
Instances and Schemes

Analogy with programming languages:



Data type definition - scheme
Value of a variable – instance
There are several schemes,
corresponding to levels of abstraction:



Physical scheme
Conceptual scheme
Subscheme (can be many)
Data Independence

The ability to modify a scheme
definition in one level without affecting
a scheme definition in a higher level is
called data independence.
Data Independence

There are two kinds:

Physical data independence



The ability to modify the physical scheme without
causing application programs to be rewritten
Modifications at this level are usually to improve
performance
Logical data independence


The ability to modify the conceptual scheme without
causing application programs to be rewritten
Usually done when logical structure of database is
altered