Database Systems: Design, Implementation, and Management
Chapter 10
Distributed Database
Management Systems
Database Systems: Design, Implementation, and
Management, Fifth Edition, Rob and Coronel
In this chapter, you will learn:
• What a distributed database management system
(DDBMS) is and what its components are
• How database implementation is affected by
different levels of data and process distribution
• How transactions are managed in a distributed
database environment
• How database design is affected by the
distributed database environment
Evolution of DDBMS
• Distributed database management system (DDBMS)
– Interconnected computer systems
– Data/processing functions reside on multiple sites
• 1970s: Centralized DBMS
• 1980s: Social and technical changes
– Ad hoc capability required
– Decentralized management structure common
• 1990s: New forces
– Internet and the World Wide Web used for data access and
distribution
– Data analysis through data mining and data warehousing
DDBMS Advantages
• Data located near site with greatest demand
• Faster data access
• Faster data processing
• Growth facilitation
• Improved communications
• Reduced operating costs
• User-friendly interface
• Less danger of single-point failure
• Processor independence
DDBMS Disadvantages
• Complexity of management and control
• Security
• Lack of standards
• Increased storage requirements
• Greater difficulty in managing data environment
• Increased training costs
Distributed Processing
Shares database’s logical processing among two or more
physically independent sites connected through a network
Figure 10.1
Distributed Database
Stores a logically related database over two or more
physically independent sites
Figure 10.2
Distributed Database
vs. Distributed Processing
• Distributed processing
– Does not require distributed database
– May be based on a single database on single
computer
– To manage distributed data, copies or parts of the database
processing functions must be distributed to all data storage sites
• Distributed database
– Requires distributed processing
• Both
– Require a network to connect components
Functions of DDBMS
• Application/end user interface
• Validation to analyze data requests
• Transformation to determine request components
• Query optimization to find the best access strategy
• Mapping to determine the data location
• I/O interface to read or write data
• Formatting to prepare the data for presentation
• Security to provide data privacy
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management
Centralized Database
Figure 10.3
Fully Distributed Database
Management System
Figure 10.4
DDBMS Components
• Computer workstations
• Network hardware and software components
• Communications media
• Transaction processor (TP)
– Also called application processor (AP) or
transaction manager (TM)
• Data processor (DP)
– Also called data manager (DM)
Distributed Database Components
Figure 10.5
DDBMS Protocols
• Interface with network to transport data and
commands between DPs and TPs
• Synchronize data received from DPs and route to
appropriate TPs
• Ensure common database functions
– Security
– Concurrency control
– Backup and recovery
Levels of Data and Process
Distribution
Database systems can be classified based on
process distribution and data distribution
Table 10.1
Single-Site Processing, Single-Site
Data (SPSD)
• All processing on single CPU or host computer
• All data are stored on host computer disk
• DBMS located on the host computer
• DBMS accessed by dumb terminals
• Typical of mainframe and minicomputer DBMSs
• Typical of first generation of single-user
microcomputer databases
Single-Site Processing, Single-Site
Data (cont’d.)
Figure 10.6
Multiple-Site Processing, Single-Site
Data (MPSD)
• Requires network file server
• Applications accessed through LAN
• Variation known as client/server architecture
Figure 10.7
Multiple-Site Processing,
Multiple-Site Data (MPMD)
• Fully distributed DDBMS with support for multiple
DPs and TPs at multiple sites
– Homogeneous
• Integrate one type of centralized DBMS over the
network
– Heterogeneous
• Integrate different types of centralized DBMSs over a
network
Heterogeneous Distributed Database
Scenario
Figure 10.8
Distributed DB Transparency
• Allows each end user to feel like the database’s only user
• Hides complexities of distributed database
• Transparency features
– Distribution
– Transaction
– Failure
– Performance
– Heterogeneity
Distribution Transparency
• Allows management of a physically dispersed
database as though it were centralized
• Three Levels
– Fragmentation transparency
– Location transparency
– Local mapping transparency
Table 10.2
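The three levels differ in how much the person writing a query must know. A minimal Python sketch of that difference, assuming a hypothetical EMPLOYEE table split into fragments E1, E2, and E3 stored at hypothetical sites NY, ATL, and MIA (all names and rows are illustrative and are not taken from Table 10.2):

```python
# Hypothetical layout: EMPLOYEE is split into three fragments, one per site.
TABLE_FRAGMENTS = {"EMPLOYEE": ["E1", "E2", "E3"]}
FRAGMENT_SITE = {"E1": "NY", "E2": "ATL", "E3": "MIA"}
SITE_DATA = {
    "NY": {"E1": [{"name": "Ann", "born": 1938}]},
    "ATL": {"E2": [{"name": "Bob", "born": 1965}]},
    "MIA": {"E3": [{"name": "Cy", "born": 1935}]},
}


def read_local_mapping(fragment, site):
    """Local mapping transparency: caller names the fragment AND its site."""
    return SITE_DATA[site][fragment]


def read_location(fragment):
    """Location transparency: caller names the fragment; the site is looked up."""
    return read_local_mapping(fragment, FRAGMENT_SITE[fragment])


def read_fragmentation(table):
    """Fragmentation transparency: caller names only the table."""
    rows = []
    for fragment in TABLE_FRAGMENTS[table]:
        rows.extend(read_location(fragment))
    return rows


# With fragmentation transparency the query looks like a centralized one.
born_before_1940 = [
    row["name"] for row in read_fragmentation("EMPLOYEE") if row["born"] < 1940
]
```

Each function hides one more piece of knowledge than the one before it, which is the sense in which the levels are ordered.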
Transaction Transparency
• Ensures transactions maintain integrity and
consistency
• Completed only if all involved database sites
complete their part of the transaction
• Management mechanisms
– Remote request
– Remote transaction
– Distributed transaction
– Distributed request
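The four mechanisms can be told apart by how many SQL statements a unit of work contains and how many sites each statement touches. The sketch below encodes the usual textbook reading of the terms, since the slides give only the names; the function `classify` and its input shape are assumptions made for illustration:

```python
def classify(requests):
    """Classify a unit of work by the sites its SQL statements reference.

    `requests` holds one entry per SQL statement; each entry is the set of
    DP sites that statement references (a hypothetical representation).
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("each request must reference at least one site")
    all_sites = set().union(*requests)
    if any(len(sites) > 1 for sites in requests):
        # A single statement spans several sites.
        return "distributed request"
    if len(all_sites) > 1:
        # Every statement stays on one site, but the transaction spans several.
        return "distributed transaction"
    if len(requests) > 1:
        # Several statements, all sent to the same single site.
        return "remote transaction"
    return "remote request"


examples = {
    "one statement, site B": classify([{"B"}]),
    "two statements, site B": classify([{"B"}, {"B"}]),
    "two statements, sites B and C": classify([{"B"}, {"C"}]),
    "one statement joining B and C": classify([{"B", "C"}]),
}
```

The checks run from the most demanding case to the least, so a unit of work is labeled by the strongest capability it needs.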
Remote Request
Figure 10.10
Remote Transaction
Figure 10.11
Distributed Transaction
Figure 10.12
Distributed Requests
Figure 10.13
Distributed Requests (cont’d.)
Figure 10.14
Distributed Concurrency Control
• Multisite, multiple-process operations more likely
to create data inconsistencies and deadlocked
transactions
• Problem
– Transaction is committed by one or more local DPs
– Another DP cannot commit its part of the transaction’s result
– Database is left inconsistent
Two-Phase Commit Protocol
• Requires DO-UNDO-REDO protocol and write-ahead protocol
• Two kinds of nodes
– Coordinator
– Subordinates
• Phases
– Preparation
• Coordinator sends message to all subordinates
• Confirms all are ready to commit or abort
– Final Commit
• Ensures all subordinates have committed or aborted
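The two phases can be sketched as a small in-memory simulation. This is an illustration only: the class name `Subordinate`, the log strings, and the absence of timeouts, node failures, and recovery are all simplifying assumptions, not part of the protocol as the chapter defines it.

```python
class Subordinate:
    """A hypothetical subordinate node with a simple write-ahead log."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # entries are appended before the node acts on them
        self.state = "active"

    def prepare(self):
        # Phase 1: record the vote in the log, then answer the coordinator.
        self.log.append("PREPARED" if self.can_commit else "NOT PREPARED")
        return self.can_commit

    def finish(self, decision):
        # Phase 2: record the coordinator's decision, then apply it.
        self.log.append(decision)
        self.state = "committed" if decision == "COMMIT" else "aborted"


def two_phase_commit(subordinates):
    """Act as coordinator: collect votes, then broadcast one decision."""
    # Phase 1 (preparation): every subordinate is asked to vote.
    votes = [node.prepare() for node in subordinates]
    decision = "COMMIT" if all(votes) else "ABORT"
    # Phase 2 (final commit): all subordinates receive the same decision.
    for node in subordinates:
        node.finish(decision)
    return decision


all_ready = [Subordinate("A"), Subordinate("B")]
one_refuses = [Subordinate("A"), Subordinate("B", can_commit=False)]
outcomes = (two_phase_commit(all_ready), two_phase_commit(one_refuses))
```

A single refusal in phase 1 forces every node to abort, which is how the protocol avoids the partially committed state described under distributed concurrency control.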
Performance Transparency
and Query Optimization
• Objective: Minimize total cost associated with
execution of request
• Main costs
– Access time
– Communication
– CPU time
• Basis for query optimization algorithms
– Optimum execution order
– Sites accessed to minimize communication costs
• Dynamic or static optimization
• Statistically based vs. rule-based query
optimization algorithms
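As a rough illustration of the objective, the sketch below treats total cost as the sum of the three main costs and picks the cheapest of several candidate execution plans. The plan names and all cost figures are invented for the example, and a simple sum is an assumption; a real optimizer would estimate and weight these values from statistics or rules.

```python
def total_cost(access, communication, cpu):
    """Total cost of one plan, assumed here to be a simple sum."""
    return access + communication + cpu


def cheapest_plan(plans):
    """Return the name of the plan with the lowest total cost."""
    return min(plans, key=lambda name: total_cost(**plans[name]))


# Hypothetical plans for joining a small table at site A with a large one at B.
plans = {
    "ship_small_table_to_B": {"access": 2, "communication": 1, "cpu": 1},
    "ship_large_table_to_A": {"access": 2, "communication": 8, "cpu": 1},
}
best = cheapest_plan(plans)
```

In this made-up case only the communication cost differs, so the choice of site decides the outcome.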
Distributed Database Design
• Partition database into fragments
– Horizontal
– Vertical
– Mixed
• Decide which fragments to replicate
– Storage of data copies at multiple sites
– Fully replicated, partially replicated, or unreplicated databases
• Data allocation
– Where to locate data
– Centralized, partitioned, replicated
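The three fragmentation strategies can be illustrated on a small in-memory table. The sketch below assumes a hypothetical CUSTOMER table; the column names, rows, and helper functions are made up for the example.

```python
# Hypothetical CUSTOMER table; "id" is the key.
CUSTOMER = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 100},
    {"id": 2, "name": "Bob", "state": "FL", "balance": 250},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 0},
]


def horizontal(rows, predicate):
    """Horizontal fragment: a subset of rows, all columns kept."""
    return [dict(row) for row in rows if predicate(row)]


def vertical(rows, key, columns):
    """Vertical fragment: a subset of columns; the key is always repeated."""
    return [{col: row[col] for col in (key, *columns)} for row in rows]


# Horizontal: customers in one state, e.g. to store at that state's site.
tn_customers = horizontal(CUSTOMER, lambda row: row["state"] == "TN")

# Vertical: contact columns only, for every customer.
contact_info = vertical(CUSTOMER, "id", ["name", "state"])

# Mixed: split by rows first, then by columns.
tn_balances = vertical(tn_customers, "id", ["balance"])
```

Repeating the key in every vertical fragment is what allows the original rows to be rebuilt by joining the fragments.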
Client/Server Advantages Over DDBMS
• Client/server less expensive
• Client/server solutions allow use of
microcomputer’s GUI
• More people with PC skills than mainframe skills
• PC is well established in workplace
• Numerous data analysis and query tools exist
• Considerable cost advantages to off-loading
application development
Client/Server Disadvantages
• Creates more complex environment with different
platforms
• Increased number of users and sites creates
security problems
• Training issues become more complex and
expensive
Date’s 12 Commandments for
Distributed Databases
1. Local Site Independence
2. Central Site Independence
3. Failure Independence
4. Location Transparency
5. Fragmentation Transparency
6. Replication Transparency
7. Distributed Query Processing
8. Distributed Transaction Processing
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence