R*: An overview of the Architecture

By R. Williams et al.
Presented by D. Kontos
Instructor: Dr. Megalooikonomou
Overview
Distributed Database Systems.
R*: an experimental DDBMS developed at IBM
Almaden Research Center in 1981.
Overview of the architecture:

Transaction management

Object naming, catalogue management

Authorization, communication etc.
Conclusions on the issues arising in a DDBMS
Distributed DBMS (DDBMS)
Need for sharing resources, data.
Preserve transparency of network
communication and data organization.
Maximum independence – “site autonomy”.
R*: a DDBMS consisting of a confederation of
voluntarily co-operating sites, each supporting
the relational data model and communicating via IBM’s
CICS.
Architecture aspects
Environment and Data Definitions.
Object Naming.
Distributed Catalogs.
Transaction management, commit protocols.
Query preparation.
Query execution.
SQL additions and changes.
Environment and Data Definitions
Several database sites communicating over a
network via CICS.
Data stored in relations, which may be:

dispersed

replicated

partitioned
The end user is not aware of the data distribution,
which is organized by the DDBMS.
Object Naming
Site autonomy: no global naming system.
Network details are transparent to the user; programming is
kept as simple as possible.
Mapping: end-user names (“print names”) →
internal System Wide Names (SWN)
USER @ USER_SITE.OBJECT_NAME @ BIRTH_SITE
e.g. BRUCE at SAN_JOSE accesses table T
BRUCE @ SAN_JOSE.T @ SAN_JOSE
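The name mapping above can be sketched in a few lines of Python. This is an illustration only, not R*'s implementation: the class `SystemWideName` and the helpers `to_swn` and `parse_swn` are hypothetical names, and the defaulting rule (a bare print name is completed with the current user and the current site, both as user site and birth site) is an assumption inferred from the BRUCE example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemWideName:
    """Hypothetical model of USER @ USER_SITE.OBJECT_NAME @ BIRTH_SITE."""
    user: str
    user_site: str
    object_name: str
    birth_site: str

    def __str__(self):
        return (f"{self.user} @ {self.user_site}."
                f"{self.object_name} @ {self.birth_site}")


def to_swn(print_name, current_user, current_site):
    """Complete a bare print name with defaults (assumed rule, see lead-in)."""
    return SystemWideName(current_user, current_site, print_name, current_site)


def parse_swn(text):
    """Parse the textual form; assumes site names contain no '.' or '@'."""
    creator, obj = text.split(".", 1)
    user, user_site = (part.strip() for part in creator.split("@"))
    object_name, birth_site = (part.strip() for part in obj.split("@"))
    return SystemWideName(user, user_site, object_name, birth_site)


swn = to_swn("T", "BRUCE", "SAN_JOSE")
print(swn)
```

Because the birth site is part of the name, no global naming authority is needed: any site can tell from the SWN alone which catalog to ask about the object.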
Distributed Catalogs
Distributed Catalog Architecture
Each site maintains catalogs for the objects, replicas and
fragments stored at that particular site.
The “birth” site of each object keeps information about
where it is currently stored.
Object located through its SWN; catalogs store access
paths.
Search path:
local catalog → birth site catalog → indicated current site
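The search path can be illustrated with a small Python sketch. Everything here is a simplification under stated assumptions: `SiteCatalog` and `locate` are hypothetical names, catalogs are plain dictionaries, and remote catalog access is modelled as a direct lookup rather than a message exchange.

```python
class SiteCatalog:
    """Hypothetical per-site catalog."""

    def __init__(self, site):
        self.site = site
        self.local_objects = {}  # SWN -> access path info for objects stored here
        self.moved_to = {}       # SWN -> current site (kept by the birth site)


def locate(swn, birth_site, local_site, catalogs):
    """Follow local catalog -> birth site catalog -> indicated current site.

    Returns (site_holding_object_or_None, list_of_sites_consulted).
    """
    visited = [local_site]
    # Step 1: the local catalog.
    if swn in catalogs[local_site].local_objects:
        return local_site, visited
    # Step 2: the birth site, which is encoded in the SWN itself.
    if birth_site != local_site:
        visited.append(birth_site)
    birth = catalogs[birth_site]
    if swn in birth.local_objects:
        return birth_site, visited
    # Step 3: the site the birth site says currently stores the object.
    current = birth.moved_to.get(swn)
    if current is not None and swn in catalogs[current].local_objects:
        if current not in visited:
            visited.append(current)
        return current, visited
    return None, visited


catalogs = {s: SiteCatalog(s) for s in ("LA", "SAN_JOSE", "NY")}
swn = "BRUCE @ SAN_JOSE.T @ SAN_JOSE"
catalogs["NY"].local_objects[swn] = "index on T"  # T has migrated to NY
catalogs["SAN_JOSE"].moved_to[swn] = "NY"         # birth site records the move
print(locate(swn, "SAN_JOSE", "LA", catalogs))
```

In this model a lookup consults at most three catalogs, and only the birth site has to be updated when an object moves.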
Transaction management, commit protocols
Each transaction is given a unique transaction number.
Execution starts at the site where the transaction was entered.

Synchronous & asynchronous execution.
Commit → UNIFORM (all sites abort OR all sites commit)
Two phase commit protocol

Coordinator makes the final decision
Other sites, once prepared to commit, await the coordinator’s decision
Lost commit messages detected by time-out.
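A minimal sketch of the two phase commit idea, in Python. It is a generic textbook-style illustration, not the R* protocol itself: `Participant`, `Vote` and `two_phase_commit` are hypothetical names, message passing is replaced by method calls, and a lost message or time-out is modelled as a `None` vote.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical site taking part in a transaction."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds  # False simulates a time-out
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether this site can commit."""
        if not self.responds:
            return None  # coordinator treats silence as a time-out
        if self.can_commit:
            self.state = "prepared"  # must now wait for the decision
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def finish(self, decision):
        """Phase 2: apply the coordinator's decision."""
        self.state = decision


def two_phase_commit(participants):
    """Coordinator: commit only if every site voted YES, else abort all."""
    votes = [p.prepare() for p in participants]
    decision = "committed" if all(v is Vote.YES for v in votes) else "aborted"
    for p in participants:
        # In a real system an unreachable site would learn this on recovery.
        p.finish(decision)
    return decision


sites = [Participant("SAN_JOSE"), Participant("NY"), Participant("LA")]
print(two_phase_commit(sites))
```

The outcome is uniform by construction: a single NO vote or a single time-out in phase 1 leads the coordinator to abort at every site.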
Query preparation
Name resolution
Authorization: each site checks authorization on its own
local data → trusts the remote sites.
The master site produces the global compilation plan, including access strategies.
Plan distribution, local compilation of parts.
Final code generated at the master: two phase
compilation.
Optimization of access paths included → minimization of
query execution time.
Query execution
Code loaded locally, parallel execution → messages for
communication.
Concurrency control

Distributed deadlock detection by periodically checking at each
site wait-for information gathered locally or from other sites.

Deadlock cycle breaker → abort a transaction.
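The deadlock check can be illustrated by merging wait-for edges from several sites and searching the combined graph for a cycle. This Python sketch is a deliberately simplified, centralized version and not the distributed algorithm R* actually uses; `find_cycle` and `choose_victim` are hypothetical names, and aborting the transaction with the highest number is an assumed victim policy.

```python
def find_cycle(edges):
    """Return one cycle in the wait-for graph as a list, or None.

    edges: iterable of (waiter, holder) transaction-number pairs.
    """
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
        graph.setdefault(holder, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in sorted(graph[node]):
            if color[nxt] == GREY:  # back edge: nxt is on the current path
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def choose_victim(cycle):
    """Assumed policy: abort the transaction with the highest number."""
    return max(cycle)


# Wait-for edges gathered at three sites; no single site sees the cycle.
site_a = [(1, 2)]
site_b = [(2, 3)]
site_c = [(3, 1)]
cycle = find_cycle(site_a + site_b + site_c)
print(cycle, choose_victim(cycle))
```

The example shows why wait-for information has to be exchanged between sites: each local graph is acyclic, and the deadlock only appears once the edges are combined.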
Logging and recovery:

Resources held only if a transaction fails after entering the
second phase of the commit protocol.
SQL additions and changes
SQL extended to include the distributed capabilities.
Conclusions
November 1981: R* is an experimental prototype system.
Key ingredient → autonomy of the sites.
Distributed data authorization, compilation, commit etc.
Based on a master – apprentices approach, two phase
protocols.
Transparent network topology, data definition and
management.
A promising step towards a REAL DISTRIBUTED DBMS.
THANK YOU!!
Questions??