204351 Database Management System

Download Report

Transcript 204351 Database Management System

Database Management System
Chapter 1 – An Overview
THE TRADITIONAL FILE ENVIRONMENT
A data file is a collection of logically related records. In a file management environment, each
application has a specific data file related to it, containing all the data records needed by the
application.
Registrar’s office
Class programs
Class file
Employee data
• Employee data contains information of the instructor
who is an employee.
• Student data is repeated.
Accounting dept.
Accounts programs
Student data
Course data
Registration data
Accounts file
Employee data
Student data
Tuition data
Financial aid
Problems with the File processing
Program – Data Dependence
Program – Data Dependence is the tight relationship between data
stored in various files and the specific programming language used to
access or execute them. Every traditional computer program has to
describe the input file format with which it works. Program – Data
Dependence leads to the potential for Data redundancy.
Data Redundancy is the presence of duplicate data in multiple data
files. Data redundancy occurs when different divisions, functional
areas, and groups in an organization independently collect the same
piece of data and build separately their own systems with different
languages, platforms, and data file format . Data Redundancy leads to
the potential for data inconsistency.
Data Isolation : File processing can leads to difficulty in accessing data
from different applications. (Data can not be shared). With applications
uniquely designed and implemented, data are scattered in various files, and
files may be in different formats.
Integrity problems : The data values stored in the database must satisfy
certain types of consistency constraints. Developers enforce these
constraints in the system by adding appropriate code in the various
application programs. However, when new constraints are added, it is
difficult to change the programs to enforce them.
Security problems: Enforcing security constraints in the application is
much more difficult than that of in the database environment. Not every
user of the database system should be able to access all the data.
Applications should not have to be developed with regard to how the
data are stored. That is, the applications and data in computer systems should
be independent - a characteristic called application/data independence.
Databases : The modern approach
Registrar’s
office
Class
programs
Academic info
Student data
Accounting
dept.
Athletics
dept.
Accounts
programs
Database
management
system
Course data
Registration data
Employee data
Team data
Sports
programs
Tuition data
Financial aid
A database, which is a logical group of related files, can eliminate many of the problems
associated with a traditional file environment. With the database approach, all the data are
typically contained in the same storage location, rather than residing in many different files
across the organization.
Unlike the traditional approach, in which different programs access the different
data files, the database is arranged so that one set of software programs (DBMS) provides
access to all the data.
DBMS: Database Management System
DBMS consists of a collection of interrelated data and a set of programs
to access those data. The primary goal of DBMS is to provide an
environment that is both convenient and efficient to use in retrieving
and storing database information. DBMS acts as the interface software
between users and data stored in the database by hiding the detail of
internal
The enterprise system collects data from various key business processes
and stores the data in a single comprehensive data repository where they
can be used by other parts of the business
กรรมวิธีนำเข้ำ
ฐานข้ อมูลกลาง
กำรให้บริ กำร
มาตรฐานข้ อมูล
คลังข้ อมูล
Locating Data in Databases
A centralized database has all the related files in one physical location.
Central
Location
Locating Data in Databases
A distributed database has complete copies of a database, or portions of a
database, in more than one location.
Location
B
Location
A
Central
Location
Location
C
Location
D
There are two types of distributed databases:
• replicated
• partitioned
A replicated database has complete copies of the entire database
in many locations, primarily to alleviate the single-point-of-failure problems
of a centralized database as well as to increase user access responsiveness.
A partitioned database is subdivided, so that each location has a portion
of the entire database (usually the portion that meets users’ local needs).
This type of database provides the response speed of localized files
without the need to replicate all changes in multiple locations.
Databases : The modern approach
Registrar’s
office
Class
programs
Academic info
Student data
Accounting
dept.
Athletics
dept.
Accounts
programs
Database
management
system
Course data
Registration data
Employee data
Team data
Sports
programs
Tuition data
Financial aid
A database, which is a logical group of related files, can eliminate many of the problems
associated with a traditional file environment. With the database approach, all the data are
typically contained in the same storage location, rather than residing in many different files
across the organization.
Unlike the traditional approach, in which different programs access the different
data files, the database is arranged so that one set of software programs (DBMS) provides
access to all the data.
Abstract View of Data
DBMS provides the ability for many different users to share data and process
resources. How can a single, unified database meet the differing requirements of
so many users? That is, the system hides certain details of how the data are stored
and maintained.
A DBMS minimizes these problems by providing 3 views of the database data:
(Since many database-systems users are not computer trained, developers hide
the complexity from users through several levels of abstraction :
A physical view : The lowest level of abstraction describes how the data are
actually stored. It deals with the actual, physical arrangement and location of
data in the direct access storage devices (DASDs).
A logical view The next higher level of abstraction describes what data are
stored in the database, and what relationships exist among those data. The logical
describes the entire database in terms of a small number of relatively simple
structures.(Users may not want to know all the technical details of physical storage.
An external view : The higher level of abstraction describes only part of the
entire database. Many users of the database do not need all this information;
instead, they need to access only a part of the database.
Data Abstraction
External
View 1
External
View 2
Conceptual (Logical)
Level
Physical Level
External
View 3
Billing
Application
Load Research
Application
Billing & Research
Data Model
Maintenance
Application
Maintenance
Data Model
Information Services Data Model
Database Management System
Service
Databases
Logical User Views
Data elements and relationships
needed for Billing, Load Research,
or Maintenance processing
Data elements and relationships
(the schema) needed for the
support of all information services.
Software Interface
The DBMS provides access to the
information services databases
Physical Data Views
Organization and location of data on
the storage media.
One strength of a DBMS is that while there is only one physical
view of the data, there can be an endless number of different external
views - one specifically tailored to each individual user, if necessary.
Data Model
Data Model defines how data are conceptually structured.
A collection of conceptual tools for describing data, data relationships,
data semantics, and consistency constraints.
Examples of data model : hierarchical, network, relational, objectoriented, entity-relation model, object-relational, hypermedia, and
multidimensional models.
Instance และ Schema
Instance ของฐานข้ อมูล : เหล่าข้ อมูลทีถ่ ูกจัดเก็บใน
ฐานข้ อมูลในขณะใดขณะหนึ่ง ดังนั้น Instance จึงเปรียบเสมือน
กับค่ าของตัวแปรในขณะใดขณะหนึ่งใน Programming Language
การเปลีย่ นแปลงข้ อมูลในระบบฐานข้ อมูล มีสาเหตุมาจากการทีข่ ้ อมูลมี
การ Insert, Delete หรื อ Update เกิดขึน้ ตลอดเวลา
Schema : เค้าร่ าง (โครงสร้ าง) ทีไ่ ด้ จากการออกแบบ
ฐานข้ อมูลโดยรวม เช่ นการกาหนดว่ า ระบบฐานข้ อมูลควร
ประกอบด้ วยไฟล์อะไรบ้ าง และไฟล์เหล่านั้นมีความสั มพันธ์ กนั
อย่ างไร แต่ ละไฟล์ควรประกอบด้ วยฟิ ลด์ (fields) อะไรบ้ าง
Instance and Schema
Instance of Database : The collection of information stored in
the database at a particular moment. Instance of Database
change over time as information is inserted and deleted.
The concept of instances can be understood by analogy to a
program written in a programming language. Each variable
has a particular value at any given instant of time.
Schema : The overall design of the database which must be
specified as following :
What files are included in a database?
What fields are included in each file?
How those files are related?
A database schema corresponds to the variable declarations
in a program.
Database Language
A database system provides a Data Definition Language (DDL) to
specify the database schema and a Data Manipulation Language
(DML) to express database queries and updates.
Data Definition Language (DDL)
A database schema can be specified by a set of definitions
expressed by a special language called a Data-Definition Language.
DDL defines what types of information are in the database and how
they will be structured. The DDL defines each data element as it
appears in the database before that data element is translated into the
forms required by the applications.
Database Languages
Examples of Data Definition Language (DDL)
CREATE TABLE
ALTER TABLE
DROP TABLE
CREATE VIEW
DROP VIEW
CREATE TABLE S
( S#
Char(5) Not Null,
Sname
Char(20) Not Null,
Status
SmallInt Not Null,
City
Char(15) Not Null,
PRIMARY KEY (S#) );
CREATE INDEX
DROP INDEX
After DDL statement has been interpreted, DBMS generates a special
set of tables called the data dictionary or data directory.
A data dictionary contains metadata – that is, data about data.
The schema of a table is an example of metadata. A database system
consults the data dictionary before reading or modifying actual data.
ผลจากการคอมไพล์ประโยคที่เขียนด้วย DDL (DDL Interpreter) จะทาให้เกิดตารางที่จะจัดเก็บ
ในไฟล์ชนิดหนึ่ งที่เรียกว่า พจนานุกรมข้อมูล (Data Dictionary) ซึ่งเก็บข้อมูลที่เกี่ยวกับ
โครงสร้างที่ได้จากการออกแบบฐานข้อมูลนัน้ ๆ และถ้ามีการเปลี่ยนแปลงแก้ไข หรือ เรียกใช้ข้อมูล
DBMS ต้องอาศัยข้อมูลของโครงสร้างจากไฟล์นี้เสมอ
The data values stored in the database must satisfy certain
consistency constraints. The database systems check these
constraints every time the database is updated.
Data Manipulation Language (DML)
การใช้ ข้อมูลในระบบฐานข้ อมูล อาจจาแนกได้ เป็ น
- การเรียกดูข้อมูล (RetrieveData)
- การเพิม่ ข้ อมูลใหม่ (Insert Records)
- การลบข้ อมูลเดิม (Delete Records)
- การเปลีย่ นแปลงแก้ ไขข้ อมูลเดิม (Update Fields)
การใช้ ข้อมูลในระดับของผู้ใช้ น้ัน สามารถกระทาได้ โดยผ่ านทาง DBMS
ซึ่งได้ เตรียมภาษาสาหรับการใช้ ข้อมูล (DML is a language that enables
users to access or manipulate data as ornagized by the appropriate
data model.)
Database Access from Application Programs
Application Program are written in a Host Language. To access the
database, DML statements need to be executed from the hose
language by
• Providing an application program interface (API) that can be
used to send DML and DDL statements to the database, and
retrieve the result. Ex. ODBC, JDBC.
• Extending the host language syntax to embed DML calls within
Host Language program.. However, DML syntax is different
from syntax of Host Language, Query Processor of DBMS
consists of :
DDL Interpreter
DML Compiler
Query Evaluation Engine
DDL Interpreter :
แปล (interpret) ประโยคคาสั่ งทีเ่ ขียนด้ วย DDL และบันทึกผลการแปลคาสั่ งลงใน data
dictionary
DML Compiler :
แปล (translate) ประโยคคาสั่ ง DML ซึ่งเป็ นภาษาทีใ่ ช้ ในการสื บค้ น (structure query
language) ไปเป็ น low-level instructions ที่ Query Evaluation Engine สามารถเข้ าใจได้
(คาสั่ ง query 1 คาสั่ ง อาจถูกแปลออกมาได้ มากกว่ า 1 query evaluation plans ทีใ่ ห้ ผลลัพธ์ ที่
เหมือนกัน ดังนั้น เป็ นหน้ าทีข่ อง DML compiler ทีจ่ ะต้ องทา query optimization เพื่อเลือก query
evaluation plans ทีป่ ระมวลผลได้ เร็วทีส่ ุ ด (The DML compler also performs query
optimization; that is, it picks the lowest cost evaluation plan from among the
alternatives.)
Query Evaluation Engine :
เมื่อ DML Compiler สร้ าง Object Code ทีเ่ ป็ น Low-level Instructions ขึน้ มาแล้ ว ก็
จะส่ งต่ อให้ กบั Query Evaluation Engine ทาการประมวลผลต่ อไป
Data Consumers
Visual Basic Application
Visual C++ Application
ASP page
Other Applications
ADO Connection
Service Components
Cursor Engine
Data Providers
ODBC Provider
ODBC
Database
Server
DML Compiler
DDL Interpreter
Query Evaluation Engine
Query Processor
Simple Data Provider
Mainframe
Data
Non
Relation
Data
How Database Represent Information
A database management system (DBMS) is a combination of software
and data:
• The physical database :
a collection of files that contain the data content.
•The schema:
a specification of the information content of the physical database
•The database engine:
software that supports access to and modification of the contents of the database
•The data definition and manipulation language:
programming languages that support schema definition and database access.
Overall System Structure
Query processor components
• DML compiler
• DDL interpreter
• Query evaluation engine
Storage manager components : provides the interface between the low level data stored
in the database and the application programs and queries submitted to the system
• Authorization and integrity manager : tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
• Transaction manager : ensures that the database remains in a consistent state
despite system failures and that concurrent transaction executions proceed without
conflicting.
• File manager : manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
• Buffer manager : responsible for fetching data from disk storage into main memory.
• Data files : store the database itself.
• Data dictionary : stores metadata about the structure of the database.
The data dictionary is used heavily. Therefore, great emphasis
should be placed on developing a good design and efficient
implementation of the dictionary.
• Indices : provide fast access to data items that hold particular values.
Sophisticated
users
Application
programmers
Naïve users
use
use
write
Application interfaces
users
use
Application programs
Query tools
Compiler and linker
DML compiler
Application programs
object code
Database
administrator
Administration tools
DDL
interpreter
DML compiler and organizer
query
procesor
Query evalution engine
Transaction
manager
File
manager
Data files
Buffer
manager
Indices
Authorization
and integrity
manager
Data dictionary
Storage
manager
Disk storage
Database Processing Advantages
 Data redundancy (Repetition of data) can be reduced.
 Data inconsistency can be avoided.
 Data can be shared.
 Standard of data can be controlled.
 Concurrent-access to database can be provided.
 Data Integrity and data security can be maintained.
Business Functions
Vendors
Vendors
Vendors
Organizational Boundaries
Manufacturing
Accounting
Business
processes
Business
processes
Manufacturing
Systems
Accounting
Systems
Finance
Business
processes
Finance
Systems
Marketing
and Sales
Business
processes
Marketing
and Sales
Systems
Human
Resources
Business
processes
Human
Resources
Systems
Information Systems
Traditional view of systems. In most organizations, separate systems built over a long period of time
support discrete business processes and discrete business functions. The organization’s system rarely
included vendors and customers.
Customer
Customer
Customer
PRODUCTION
MAINTENANCE
FINANCE AND ACCOUNTING
SALES AND MARKETING
How to integrate functions and business processes
1. Implement a separate “middleware” bridge to each of these
specialized systems to link them all together
PRODUCTION
MAINTENANCE
FINANCE AND ACCOUNTING
SALES AND MARKETING
How to integrate functions and business processes
1. Implement a separate “middleware” bridge to each of these
specialized systems to link them all together
2. Implement or buy entirely new enterprise applications that can
coordinate activities, decision, and knowledge across many
different functions, levels, and business units in a firm.
Enterprise systems
• Supply chain management systems SCM)
• Customer relationship management systems (CRM)
Each of these enterprise applications integrates a related set of functions and
business process to enhance the performance of the organization as a whole.
Enterprise systems
also known as enterprise resource planning (ERP)
The enterprise systems,
designed to support
organization- wide, coordinates and integrates a related set
of major internal functions and business processes to enhance the
performance of the organization as a whole.
They solve the problem of organizational inefficiencies created
by isolated islands of information, business processes, and technology.
Accounting
Business process
Business process
Business process
Enterprise-wide business Process
Human
Resources
Marketing
and Sales
Organizational Boundaries
Vendors
Organizational Boundaries
Manufacturing
Customers
Finance
Enterprise systems. Enterprise systems can integrate the key business
processes of an entire firm into a single software system that allows information
to flow seamlessly throughout the organization. These systems focus primarily
on internal processes but may include transactions with customers and vendors.