fragments - Seneca - School of Information & Communications
Database Design – Lecture 16
Distributed Databases
Lecture Objectives
Distributed Processing and Distributed Databases
Distributed Database Management System (DDBMS)
Distributed Database Design
2
Distributed Processing
Shares the database’s logical processing among two or more physically independent sites that are connected through a network.
Note: data resides at only one site and is shared by other sites (“centralized”)
3
Distributed Databases
Stores a logically related database over two or more physically independent sites. The sites are connected by a computer network.
Note: the database is composed of several parts known as database fragments. These fragments are located at several different sites.
4
Distributed Processing and Distributed Databases
In a distributed database environment, users do not need to know the name or location of each database fragment in order to access the database; the fragmentation is transparent to the user.
Distributed processing does not require a distributed database, but a distributed database requires distributed processing.
Both distributed processing and distributed databases require a network to connect all components.
5
DDBMS Advantages
Data are located near or at the “greatest demand” site – improved performance
Improved reliability – data replication
Growth facilitation
Reduced operating costs
7
DDBMS Disadvantages
Complexity
Cost
Database design more complex
8
Distributed Database Management System (DDBMS)
Governs the storage and processing of a single logically related database over interconnected computer systems in which both data and processing functions are distributed among several sites.
9
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following functions to be classified as distributed:
Application Interface
Validation
Transformation
Query Optimization
Mapping
I/O Interface
Formatting
Security
Backup & Recovery
DB Administration
Concurrency Control
Transaction Management
A DDBMS must also have at least the following components:
Computer Workstations (sites or nodes)
Network Hardware & Software
Communications Media
10
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following functions to be classified as distributed:
Application Interface: allows interaction with the end user or application programs and with other DBMSs within the distributed database
Validation: analyzes data requests
Transformation: determines which data request components are distributed and which ones are local
11
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following functions to be classified as distributed:
Query Optimization: finds the best access strategy
Mapping: determines the data location of local and remote fragments
I/O Interface: reads or writes data from or to permanent local storage
12
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following functions to be classified as distributed:
Formatting: prepares the data for presentation to the end user or an application program
Security: provides data privacy at both local and remote databases
Backup and Recovery: ensures the availability and recoverability of the database in case of a failure
13
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following functions to be classified as distributed:
DB Administration: allows the Database Administrator to maintain the databases
Concurrency Control: manages simultaneous data access and ensures data consistency across database fragments in the DDBMS
Transaction Management: ensures that the data move from one consistent state to another by synchronizing transactions
14
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following components:
Computer Workstations (sites or nodes): form the network system
Network Hardware and Software: components that reside in each workstation and allow all sites to interact and exchange data
Communications Media: carry data from one workstation to another
15
Distributed Database Management System (DDBMS)
A DDBMS must have at least the following components:
Transaction Processor (TP): software component found in each computer that requests data. It receives and processes the application’s data requests (remote and local).
Data Processor (DP): software component residing on each computer that stores and retrieves data located at the site
16
Distributed Database Environment
17
Distributed Database Design
Designing for a relational database structure does not change: start with a top-down approach.
However, the following must also be considered:
How to partition the database into fragments
Which fragments to replicate
Where to locate those fragments and replicas
More frequently used fragments should be stored locally
Fragments used by all users should be stored centrally
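The placement guidance above can be illustrated with a minimal Python sketch. The access counts, the site names, the `dominance` threshold, and the `place_fragment` helper are all hypothetical; the rule (store a fragment at the site that uses it most, otherwise keep it at a central site) is a simplification of the guidance, not a complete allocation algorithm.

```python
def place_fragment(access_counts, central_site="head_office", dominance=0.5):
    """Pick a storage site for one fragment from per-site access counts.

    access_counts: dict mapping site name -> number of accesses (hypothetical data).
    If one site accounts for more than `dominance` of all accesses, the fragment
    is stored there (locally); otherwise it is stored at the central site.
    """
    total = sum(access_counts.values())
    if total == 0:
        return central_site
    busiest = max(access_counts, key=access_counts.get)
    if access_counts[busiest] / total > dominance:
        return busiest
    return central_site


# Hypothetical workloads: one fragment used mostly by a single site,
# one fragment used roughly equally by all sites.
customers_tn = {"nashville": 900, "miami": 40, "atlanta": 60}
price_list = {"nashville": 300, "miami": 350, "atlanta": 350}

print(place_fragment(customers_tn))  # nashville
print(place_fragment(price_list))    # head_office
```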
19
Distributed Database Design
Data Fragmentation:
Allows a single object to be broken into two or more segments or fragments
Each fragment can be stored at any site on the network
Data fragmentation information is stored in the distributed data catalog (DDC), from which it is accessed by the TP to process user requests
20
Distributed Database Design
Types of Data Fragmentation:
Horizontal
Vertical
Mixed
21
Distributed Database Design
Types of Data Fragmentation:
Horizontal
The division of a relation into subsets (fragments) of tuples (rows)
Each fragment is stored at a different node, and each fragment has unique rows
Every fragment has the same attributes (columns); only the rows are divided among the fragments
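As a rough illustration of horizontal fragmentation, the Python sketch below splits a small table by the value of one column. The CUSTOMER rows, column names, and the `fragment_horizontally` helper are made up for the example and are not taken from the lecture figures.

```python
from collections import defaultdict

# Hypothetical CUSTOMER rows; every row has the same attributes.
customers = [
    {"cus_num": 10, "cus_name": "Alpha Co.", "cus_state": "TN", "cus_balance": 1400.0},
    {"cus_num": 11, "cus_name": "Beta Ltd.", "cus_state": "FL", "cus_balance": 0.0},
    {"cus_num": 12, "cus_name": "Gamma Inc.", "cus_state": "TN", "cus_balance": 320.5},
    {"cus_num": 13, "cus_name": "Delta LLC", "cus_state": "GA", "cus_balance": 75.0},
]


def fragment_horizontally(rows, key):
    """Group whole rows by the value of `key`; each group is one fragment."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)


fragments = fragment_horizontally(customers, "cus_state")

# Each row lands in exactly one fragment, so the union of the fragments
# reconstructs the original table.
reconstructed = sorted(
    (row for frag in fragments.values() for row in frag),
    key=lambda row: row["cus_num"],
)

for state, rows in fragments.items():
    print(state, [row["cus_num"] for row in rows])
```

Each fragment could then be stored at the site that serves that state.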
22
Distributed Database Design
Example of horizontal fragmentation
[Figure: original structure and fragmented structure, split by state]
23
Distributed Database Design
Example of horizontal fragmentation
[Figure: resulting fragmented structure, split by state]
24
Distributed Database Design
Types of Data Fragmentation:
Vertical
The division of a relation into subsets of attributes (columns)
Each subset is stored at a different node, and each fragment has unique columns, with the exception of the key column, which is common to all fragments
Transaction issues arise here because the same record may need to be inserted into two tables (part of the record into one table and the rest into another). If only one insert succeeds, the data end up inconsistent.
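The consistency problem described above can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: a single in-memory database stands in for two sites, and the table and column names are hypothetical. In a real DDBMS the two fragments would live at different sites and the all-or-nothing behaviour would need a distributed commit protocol such as two-phase commit.

```python
import sqlite3

# One in-memory database stands in for two sites (an assumption made so the
# sketch runs anywhere); each table plays the role of one vertical fragment.
# Both fragments repeat the key column cus_num.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust_service (cus_num INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE cust_collection (cus_num INTEGER PRIMARY KEY, cus_balance REAL NOT NULL)"
)
conn.commit()


def insert_customer(conn, cus_num, cus_name, cus_balance):
    """Insert both parts of one logical record, or neither part."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("INSERT INTO cust_service VALUES (?, ?)", (cus_num, cus_name))
            conn.execute("INSERT INTO cust_collection VALUES (?, ?)", (cus_num, cus_balance))
        return True
    except sqlite3.IntegrityError:
        return False


def fragment_counts(conn):
    """Return the number of rows in each fragment."""
    service = conn.execute("SELECT COUNT(*) FROM cust_service").fetchone()[0]
    collection = conn.execute("SELECT COUNT(*) FROM cust_collection").fetchone()[0]
    return service, collection


ok = insert_customer(conn, 10, "Example Co.", 1400.0)  # both inserts succeed
bad = insert_customer(conn, 11, "Broken Ltd.", None)   # second insert violates NOT NULL
print(ok, bad, fragment_counts(conn))
```

Because the failed second insert rolls back the first one, the two fragments keep the same set of keys and can still be rejoined on `cus_num`.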
25
Distributed Database Design
[Figure: original structure and fragmented structure, split by location]
26
Distributed Database Design
Example of Vertical Fragmentation
[Figure: original structure and fragmented structure, split by location]
27
Distributed Database Design
Types of Data Fragmentation:
Mixed
A combination of horizontal and vertical strategies
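One way to picture the combination is to split the rows first and then split each row group by columns. The Python sketch below does this for a few hypothetical CUSTOMER rows; the column groups ("service", "collection") and all helper names are assumptions made for the example.

```python
def split_rows(rows, column):
    """Horizontal step: group whole rows by the value of one column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row)
    return groups


def project(rows, key, columns):
    """Vertical step: keep the key column plus the listed columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def fragment_mixed(rows, key, split_on, column_groups):
    """Return {(row_group_value, column_group_name): fragment_rows}."""
    result = {}
    for value, subset in split_rows(rows, split_on).items():
        for group_name, columns in column_groups.items():
            result[(value, group_name)] = project(subset, key, columns)
    return result


# Hypothetical data and column groups.
customers = [
    {"cus_num": 10, "cus_name": "Alpha Co.", "cus_state": "TN", "cus_balance": 1400.0},
    {"cus_num": 11, "cus_name": "Beta Ltd.", "cus_state": "FL", "cus_balance": 0.0},
    {"cus_num": 12, "cus_name": "Gamma Inc.", "cus_state": "TN", "cus_balance": 320.5},
]
column_groups = {"service": ["cus_name"], "collection": ["cus_balance"]}

mixed = fragment_mixed(customers, "cus_num", "cus_state", column_groups)
for name, rows in mixed.items():
    print(name, rows)
```

Every fragment keeps the key column, so the vertical pieces of each row group can be rejoined before the row groups are unioned.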
28
Distributed Database Design
Example of Mixed Fragmentation:
29
Distributed Database Design
Example of Mixed Fragmentation:
30
Data Replication
Storage of data copies at multiple sites served by a computer network
Fragment copies can be stored at several sites to serve specific information requirements
Can enhance data availability and response time
Can help to reduce communication and total query costs
31
Replication Scenarios
Fully replicated database:
Stores multiple copies of each database fragment at multiple sites
Can be impractical due to the amount of overhead
Partially replicated database:
Stores multiple copies of some database fragments at multiple sites
Most DDBMSs are able to handle the partially replicated database well
Unreplicated database:
Stores each database fragment at a single site
No duplicate database fragments
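The three scenarios differ only in how many fragments have more than one copy, which the short Python sketch below makes explicit. The fragment names, site names, and the `replication_scenario` helper are hypothetical, and the function assumes every fragment is stored at one site or more.

```python
def replication_scenario(placement):
    """Classify a placement given as {fragment name: set of sites holding a copy}."""
    copies = [len(sites) for sites in placement.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"          # each fragment at a single site
    if all(n > 1 for n in copies):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments have multiple copies


# Hypothetical placements of two fragments over three sites.
unreplicated = {"CUST_TN": {"nashville"}, "CUST_FL": {"miami"}}
partial = {"CUST_TN": {"nashville", "atlanta"}, "CUST_FL": {"miami"}}
full = {"CUST_TN": {"nashville", "atlanta"}, "CUST_FL": {"miami", "atlanta"}}

for placement in (unreplicated, partial, full):
    print(replication_scenario(placement))
```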
32
Data Allocation
Deciding where to locate data
Allocation strategies:
Centralized data allocation: the entire database is stored at one site
Partitioned data allocation: the database is divided into several disjoint parts (fragments) that are stored at several sites
Replicated data allocation: copies of one or more database fragments are stored at several sites
Data distribution over a computer network is achieved through data partition, data replication, or a combination of both
33
Distributed Database Design
How is a distributed database managed?
Distributed Data Catalog (DDC)
Contains the description of the entire database as seen by the DBA
Is used to translate user requests into sub-queries (remote requests) that will be processed by different DPs
The DDC is itself distributed and replicated at the network nodes (the locations of the database fragments)
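A minimal Python sketch of how a catalog lookup could turn one request into per-site sub-queries is given below. The catalog layout, fragment names, site names, and the `plan_subqueries` helper are all hypothetical; a real DDC records far more (schemas, replicas, statistics) and is not a simple list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentEntry:
    fragment: str  # name of the fragment table
    site: str      # site whose DP stores the fragment
    state: str     # value of cus_state held by this horizontal fragment


# Hypothetical catalog for a CUSTOMER table fragmented horizontally by state.
CATALOG = [
    FragmentEntry("CUST_TN", "nashville", "TN"),
    FragmentEntry("CUST_FL", "miami", "FL"),
    FragmentEntry("CUST_GA", "atlanta", "GA"),
]


def plan_subqueries(catalog, states=None):
    """Return (site, sub-query) pairs for a request on the logical CUSTOMER table.

    states: set of cus_state values the request needs, or None for all rows.
    Fragments that cannot hold matching rows are skipped.
    """
    plan = []
    for entry in catalog:
        if states is None or entry.state in states:
            plan.append((entry.site, f"SELECT * FROM {entry.fragment}"))
    return plan


print(plan_subqueries(CATALOG, {"FL"}))
print(len(plan_subqueries(CATALOG)))
```

The user names only the logical table; the fragment names and locations come from the catalog, which is what keeps fragmentation transparent to the user.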
34
Examples of Distributed Databases
Banking
Account data distributed at each local branch
Loan data distributed at each local branch
Corporate data at head office (summarized branch information)
Insurance
Policy data with each branch
Corporate data at head office
35
Examples of Distributed Databases
Retail
Inventory data distributed at each local store
Employee Scheduling data at each store
Corporate data at head office (summarized store information)
Payroll data at head office
Utilities
Utility monitoring data at each location (e.g., nuclear station monitoring of air, water, etc. at each location)
Corporate data at head office
36
Distributed Database vs. Client/Server
Client/server is an architecture that models a computerized solution based on the distribution of functions between servers and clients. A client requests specific services from a server, and a server provides the requested services to clients.
Distributed processing could be one aspect of a client/server architecture, with the data “centralized”.
The DDBMS distributes data to different locations and could be used in a client/server architecture.
37
Distributed Database Design
Steps:
1. Always start with a centralized view design
2. Consider horizontal fragmentation of a centralized database
3. Consider vertical fragmentation of a horizontally fragmented database
4. Re-consider PK for all fragments of the database
5. Define data replication rules (scenarios)
6. Complete Design
38