Database System Architectures

• Client-server Database System
• Parallel Database System
• Distributed Database System
Wei Jiang
Centralized Systems
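[Figure: a centralized computer system. The CPU, disk controller, printer controller, and tape-drive controller are attached to a system bus, which connects through a memory controller to memory.]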
Databases are designed in two ways, depending on the kind of computer system they run on:
• Single-user system
-- A desktop unit used by a single person
-- Only one CPU and one or two disks
• Multi-user system
-- Serves a large number of users who are connected to the system via terminals
-- More disks, more memory, and multiple CPUs
Database
Designed for single-user systems:
• Do not support concurrency control
• Provisions for crash recovery are either absent or primitive
• Do not support SQL, but provide a simpler language, such as QBE
Designed for multi-user systems:
• Support full transactional features
Current development:
• Databases designed for single-processor machines provide multitasking
Client-Server Systems
General structure of a client-server System
[Figure: several clients, each connected to a single server.]
Front-end and back-end functionality
[Figure: front-end and back-end functionality.
Front-end: SQL user interface, forms interface, report writer, graphical interface.
Interface between the two: SQL + API.
Back-end: SQL engine.]
Server System Architectures
• Transaction-server systems: provide an interface to which clients can send requests to perform an action; in response, they execute the action and send the results back to the client. Requests may be specified by using SQL or through a specialized application program interface.
• Data-server systems: allow clients to interact with the servers by making requests to read or update data, in units such as files or pages. For example, file servers provide a file-system interface where clients can create, update, read, and delete files.
Transaction Server Process Structure
• Server processes: receive user queries, execute them, and send the results back
• Lock manager process: implements lock manager functionality, which includes lock grant, lock release, and deadlock detection
• Database writer process: outputs modified buffer blocks back to disk
• Log writer process: outputs log records from the log record buffer to stable storage
• Checkpoint process: performs periodic checkpoints
• Process monitor process: monitors other processes and takes recovery actions for any process that fails
The shared memory holds:
Buffer pool, lock table, log buffer, cached query plans
Mutual exclusion rule: no two processes may access the same shared data structure at the same time
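The mutual exclusion rule on the shared lock table can be illustrated with a small sketch. It is only an illustration: Python threads stand in for the server processes, and the `LockTable` class, its method names, and the item name "page-7" are made up for this example, not taken from any real system.

```python
import threading


class LockTable:
    """Toy shared lock table; a mutex guards every access to it."""

    def __init__(self):
        self._mutex = threading.Lock()  # enforces mutual exclusion on the table
        self._owners = {}               # data item -> transaction holding it

    def try_lock(self, item, txn):
        """Grant an exclusive lock on item if it is free or already held by txn."""
        with self._mutex:
            holder = self._owners.get(item)
            if holder is None or holder == txn:
                self._owners[item] = txn
                return True
            return False

    def release(self, item, txn):
        """Release the lock on item, but only if txn is the holder."""
        with self._mutex:
            if self._owners.get(item) == txn:
                del self._owners[item]


def run_demo(num_workers=8):
    """Several workers race for the same item; exactly one is granted the lock."""
    table = LockTable()
    winners = []
    winners_mutex = threading.Lock()

    def worker(txn):
        if table.try_lock("page-7", txn):
            with winners_mutex:
                winners.append(txn)

    threads = [threading.Thread(target=worker, args=(f"T{i}",))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

Because the check and the update of the table happen while the mutex is held, two workers can never both see the item as free.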
Data Servers
• Page shipping as prefetching: when an item (a table or a tuple) is requested, the server also sends back other items that are likely to be used in the near future
• Lock de-escalation: if the client does not need a prefetched item, it can transfer the locks on that item back to the server, and the locks can then be allocated to other clients
• Data caching and coherency: even if a transaction finds cached data, it must make sure that those data are up to date. A message must still be exchanged with the server to check the validity of the data and to acquire a lock on the data.
• Lock caching with callbacks: if a client requests a lock from the server, the server must call back all conflicting locks on the data item from any other clients that have cached the locks.
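The callback step in lock caching can be sketched as follows. This is a simplified illustration that treats every lock as exclusive, so any lock cached by another client conflicts with the request; the `LockServer` class and its attribute names are hypothetical.

```python
class LockServer:
    """Toy server that tracks which clients have cached the lock on each item."""

    def __init__(self):
        self.cached_by = {}   # data item -> set of client ids caching its lock
        self.callbacks = []   # (item, client) pairs the server had to call back

    def request_lock(self, item, client):
        """Call back conflicting cached locks, then grant the lock to client.

        Returns the clients whose cached locks were called back.
        Assumption: all locks are exclusive, so every other holder conflicts.
        """
        holders = self.cached_by.get(item, set())
        conflicting = sorted(holders - {client})
        for other in conflicting:
            self.callbacks.append((item, other))
        self.cached_by[item] = {client}
        return conflicting
```

For instance, if client "A" caches the lock on "p1" and client "B" then requests it, the server records one callback to "A" before granting the lock to "B". A request from the client that already caches the lock triggers no callback.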
Parallel Systems
• Purpose: improve processing and I/O speeds
• Two kinds:
1) a coarse-grain parallel machine
2) a massively parallel machine
• Measures of performance:
1) throughput: the number of tasks that can be completed in a given time interval
2) response time: the amount of time it takes to complete a single task from the time it is submitted
• Two important issues: speedup and scaleup
Speedup
Def: running a given task in less time by increasing the degree of parallelism.
Goal: to process the task in time inversely proportional to the number of processors and disks allocated.
Calculation: speedup = Ts / Tn, where Ts is the execution time of the task on the smaller system and Tn is its execution time on a larger system that has N times the resources of the smaller one.
Type: 1) linear speedup: speedup = N
2) sublinear speedup: speedup < N
[Figure: speed plotted against resources, showing a linear speedup line and a sublinear speedup curve.]
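A short numeric sketch of the speedup calculation, using made-up timings. The function names are hypothetical, and the "superlinear" branch is only there so the function returns something sensible for inputs outside the two cases the slide lists.

```python
import math


def speedup(t_small, t_large):
    """Speedup = Ts / Tn: time on the smaller system over time on the larger one."""
    if t_small <= 0 or t_large <= 0:
        raise ValueError("execution times must be positive")
    return t_small / t_large


def classify_speedup(t_small, t_large, n):
    """Compare the measured speedup with N, the resource ratio of the two systems."""
    s = speedup(t_small, t_large)
    if math.isclose(s, n):
        return "linear"
    if s < n:
        return "sublinear"
    return "superlinear"  # not covered in the slide; kept for completeness
```

With N = 4, a task that drops from 120 s to 30 s shows linear speedup (120 / 30 = 4), while a drop from 120 s to 40 s is sublinear (120 / 40 = 3).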
Scaleup
• Def: the ability to process larger tasks in the same amount of time by providing more resources.
• Calculation: scaleup = Ts / Ta
(Ts: execution time of task Q on a machine of size Ms;
Ta: execution time of task N*Q on a machine of size N*Ms)
• Type: 1) linear scaleup: Ts = Ta
2) sublinear scaleup: Ta > Ts
[Figure: Ts/Ta plotted against problem size, showing a flat linear scaleup line and a falling sublinear scaleup curve.]
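The scaleup ratio can be tabulated for growing problem sizes, which is what the figure plots. The sketch below uses invented measurements; `scaleup_curve` and `is_linear_scaleup` are hypothetical helper names.

```python
import math


def scaleup_curve(t_small, scaled_times):
    """Return {N: Ts / Ta} for each measured scale factor N.

    t_small      -- Ts, time for task Q on a machine of size Ms
    scaled_times -- {N: Ta}, time for task N*Q on a machine of size N*Ms
    """
    return {n: t_small / t_a for n, t_a in sorted(scaled_times.items())}


def is_linear_scaleup(t_small, t_scaled):
    """Linear scaleup means Ts = Ta, i.e. the ratio stays at 1."""
    return math.isclose(t_small / t_scaled, 1.0)
```

For Ts = 100 s and measured Ta values of 100 s, 110 s, and 125 s at N = 1, 2, and 4, the ratio falls from 1.0 to 0.8, which is sublinear scaleup.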
Scaleup (continued)
Two kinds of scaleup
• Batch scaleup: applies to large jobs whose runtime depends on the size of the database.
• Transaction scaleup: applies when the rate at which transactions are submitted to the database depends on the size of the database.
Factors that work against efficient parallel operation
• Startup costs: the startup time can overshadow the actual processing time, affecting speedup adversely
• Interference: each new process competes with existing processes for commonly held resources
• Skew: it is often difficult to divide a task into exactly equal-sized parts, and the largest part determines how long the whole task takes
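The effect of skew and startup cost on speedup can be seen in a small calculation. It is a simplified model under stated assumptions: each part runs on its own processor, run time is proportional to part size, and interference is ignored. The function name and the numbers are invented for illustration.

```python
def parallel_speedup(part_sizes, startup=0.0):
    """Speedup of a task split into parts that run in parallel.

    Sequential time is the sum of the parts; parallel time is the startup
    cost plus the largest part, because the slowest part finishes last.
    """
    if not part_sizes:
        raise ValueError("need at least one part")
    sequential_time = sum(part_sizes)
    parallel_time = startup + max(part_sizes)
    return sequential_time / parallel_time
```

Splitting 100 units of work evenly over four processors gives a speedup of 4. A skewed split of 40, 20, 20, 20 gives only 100 / 40 = 2.5, and adding a startup cost of 5 units to the even split lowers its speedup to 100 / 30.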
Interconnection Networks
Bus: all components send and receive data over a single communication bus. It can handle communication from only one component at a time. It works well for small numbers of processors, but does not scale well with increasing parallelism.
Mesh: the number of communication links grows as the number of components grows, so it scales better with increasing parallelism.
Hypercube: the components are numbered in binary, and each of the n components is connected to log2(n) other components. The communication delays are lower than in a mesh.
[Figure: bus, mesh, and hypercube interconnection networks.]
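In a hypercube, two components are connected exactly when their binary numbers differ in one bit, which is why each of the n components has log2(n) neighbours. A minimal sketch of that numbering rule, with hypothetical function names:

```python
def hypercube_neighbors(node, dimensions):
    """Neighbours of node in a hypercube of 2**dimensions components.

    Flipping each of the `dimensions` bits of the node number in turn
    yields the numbers of the directly connected components.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node number out of range")
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a, b):
    """Minimum number of links between a and b: the count of differing bits."""
    return bin(a ^ b).count("1")
```

With 8 components (3 dimensions), component 5 (binary 101) is connected to components 4, 7, and 1, and a message between components 0 and 7 crosses 3 links, which is the maximum for that size.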
Parallel Database Architectures
Shared Memory:
--Def: the processors and disks have access to a common memory via a bus or through an interconnection network.
--strength: extremely efficient communication between processes
--weakness: keeping the processors' caches coherent adds overhead as the number of processors grows
Shared Disk:
--Def: all processors can access all disks directly, but each processor has private memory.
--strength: 1) because each processor has its own memory, the memory bus is not a bottleneck.
2) if a processor fails, the other processors can take over its tasks.
--weakness: the interconnection to the disk subsystem is now a bottleneck.
Parallel Database Architectures (continued)
Shared Nothing:
--Def: each node of the machine consists of a processor, memory, and one or more disks.
--strength: more scalable; can easily support a large number of processors.
--weakness: higher cost of communication and of access to disks on other nodes
Hierarchical:
--Def: combines the characteristics of the above three architectures
Distributed System
Def: the database is stored on several computers, which communicate with each other through communication media.
The main differences between a distributed database and a shared-nothing parallel database:
1) the databases are geographically separated, so the interconnection is slower
2) transactions are either local or global: a local transaction accesses data only at the site where it was initiated, while a global transaction accesses data at a different site or at several sites
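The distinction between local and global transactions can be written as a one-line check. The sketch assumes the usual definition (a local transaction touches only the site where it started); the function name and site names are invented.

```python
def classify_transaction(origin_site, accessed_sites):
    """Return "local" if every accessed site is the origin site, else "global"."""
    return "local" if set(accessed_sites) <= {origin_site} else "global"
```

A transaction started at site "S1" that reads only from "S1" is local; one that also touches "S2", or that touches only "S2", is global.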
Reasons for building a distributed system:
1) sharing data
2) autonomy
3) availability
The end