2515 - Distributed Databases

G063 - Distributed Databases
Learning Objectives:
By the end of this topic you should be able to:
• explain how databases may be stored in more than one physical location
• explain the methods by which this distribution may be carried out
• explain reasons why distribution would be carried out
• explain the security issues of distributed databases
A Distributed Database is:
• a database not stored in its entirety at a single location
• a database spread across a number of computers
– in multiple locations
– computers connected by a data communications link
▪ LAN and/or WAN
Why distribute a database:
Since the database is not in a single location:
• allows faster local queries
– faster searching
• reduces network traffic
– speeds up other network operations
▪ because some data queries are handled locally
• improved reliability
– data may be replicated at multiple sites
▪ if one copy is corrupted or lost, another copy exists
• allows for modular growth of the database
– can easily add new sites
3 Types of Distributed Database
• Replicated
• Centralised
• Partitioned
Replicated Database
• a copy of the complete database at each site
• exact copy of the database stored & accessed locally
• replicated versions are usually read-only
– changes made at each centre are recorded in transaction files
• updates are made on a master database
– a ‘new’, updated copy of the database is sent to each centre
▪ at regular intervals
Replicated Database
Advantages:
• reliability
– data is always available locally
– not reliant on the network or central server
• fast response to searches
– local access will be faster than WAN access
Replicated Database
Disadvantages:
• data integrity issues
– local copies of data may differ from each other
▪ if replicated data is not updated simultaneously
• additional local storage space requirements
• additional time required for update operations
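The replicated arrangement described above (read-only local copies, transaction files sent to a master, and a periodic refresh) can be sketched in a few lines of Python. This is only an illustrative model, not a real database product: the class names `MasterDatabase` and `ReplicaSite`, the record key `item-1` and the stock figures are all assumptions made for the example. It also shows the integrity issue: a site keeps reading old data until its next refresh.

```python
import copy


class MasterDatabase:
    """Hypothetical master copy; the only place where updates are applied."""

    def __init__(self, records):
        self.records = dict(records)

    def apply_transactions(self, transactions):
        # Each transaction is a (key, new_value) pair sent in by a site.
        for key, value in transactions:
            self.records[key] = value

    def snapshot(self):
        # A complete copy of the database, as sent out to every site.
        return copy.deepcopy(self.records)


class ReplicaSite:
    """Hypothetical site holding a read-only copy of the whole database."""

    def __init__(self, name, master):
        self.name = name
        self.data = master.snapshot()
        self.transaction_file = []

    def read(self, key):
        # Local read: no network access is needed.
        return self.data[key]

    def record_change(self, key, value):
        # The local copy is not modified; the change is only logged.
        self.transaction_file.append((key, value))

    def send_transactions(self, master):
        master.apply_transactions(self.transaction_file)
        self.transaction_file = []

    def refresh(self, master):
        # Replace the local copy with a new copy of the master.
        self.data = master.snapshot()


master = MasterDatabase({"item-1": 10})
site_a = ReplicaSite("A", master)
site_b = ReplicaSite("B", master)

site_a.record_change("item-1", 7)
site_a.send_transactions(master)

stale_value = site_b.read("item-1")   # site B has not been refreshed yet

site_a.refresh(master)
site_b.refresh(master)
fresh_value = site_b.read("item-1")   # site B now matches the master
```

Between `send_transactions` and `refresh`, the master and the local copies disagree, which is the data integrity issue listed under Disadvantages.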
Centralised Database
• single database held centrally (possibly at Head Office)
• an index to the central database is held locally
– speeds up queries/transactions
• each site accesses database through a WAN
Example: Booking systems
• need distributed access to a central database
– sharing up-to-date information is important
▪ avoids double bookings
Centralised Database
Advantages:
• better security of data
– one copy rather than several (replicated copies)
– security of data handled centrally
• good data integrity
– one copy rather than several
▪ all sites always share the same data
• data always up-to-date
– data is updated in real time
• centralised backup
– can be automated
Centralised Database
Drawbacks:
• a virus in the central system could spread
– throughout all sites
• possibility of update clashes
– two sites trying to modify the same record at the same time
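The booking example and the update-clash drawback can be illustrated together. The sketch below assumes a hypothetical `CentralBookingDatabase` class, a seat label `12A` and two made-up site names; it uses a lock as one simple way of making sure that only one site at a time can check and modify a record. Real database systems use their own locking or transaction mechanisms, which are not shown here.

```python
import threading


class CentralBookingDatabase:
    """Hypothetical single central copy shared by every site."""

    def __init__(self, seats):
        self._bookings = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def book(self, seat, customer):
        # The lock makes "check then update" one indivisible step,
        # so two sites cannot both book the same seat.
        with self._lock:
            if self._bookings[seat] is not None:
                return False          # already booked by another site
            self._bookings[seat] = customer
            return True

    def holder(self, seat):
        with self._lock:
            return self._bookings[seat]


central = CentralBookingDatabase(["12A"])
results = {}


def site_request(site_name):
    # Each site tries to book the same seat at the same time.
    results[site_name] = central.book("12A", site_name)


threads = [
    threading.Thread(target=site_request, args=(name,))
    for name in ("Site 1", "Site 2")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

successful_sites = [site for site, ok in results.items() if ok]
```

Whichever site reaches the database first gets the seat; the other request is refused, so there is no double booking.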
Partitioned Database
• not every site needs to have all the data
– so give each site just the data that is relevant to that site
• database is split into sections
• each site on the network stores local data
– the section of the database that relates to that site
▪ e.g. the section of the database that relates to a single supermarket’s stock is stored at that supermarket
• other (global) data is held centrally
– changes to central data can be dealt with overnight
– by a batch update from the sites
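The supermarket example above can be modelled as a minimal sketch: each site works on its own partition during the day and the central copy is only brought up to date by a batch update. The names `SitePartition`, `CentralStore`, `Store 1` and the stock figures are assumptions made for illustration only.

```python
class CentralStore:
    """Hypothetical central database holding the global view of stock."""

    def __init__(self):
        self.stock_levels = {}   # (site name, item) -> quantity

    def batch_update(self, site_name, local_stock):
        # Overnight job: copy the site's current figures to the centre.
        for item, quantity in local_stock.items():
            self.stock_levels[(site_name, item)] = quantity


class SitePartition:
    """Hypothetical partition holding only one site's own stock data."""

    def __init__(self, name, stock):
        self.name = name
        self.stock = dict(stock)

    def sell(self, item, quantity):
        # Day-to-day work uses local data only; no network access.
        if self.stock[item] < quantity:
            raise ValueError("not enough stock")
        self.stock[item] -= quantity


central_store = CentralStore()
store_1 = SitePartition("Store 1", {"milk": 20})
central_store.batch_update(store_1.name, store_1.stock)

store_1.sell("milk", 5)
central_before = central_store.stock_levels[("Store 1", "milk")]

central_store.batch_update(store_1.name, store_1.stock)   # overnight run
central_after = central_store.stock_levels[("Store 1", "milk")]
```

Until the batch update runs, the central figure is out of date, which is acceptable for stock reporting but not for applications where every site must see changes instantly.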
Partitioned Database
Advantages:
• speed:
– faster access to local data
▪ less network access required
• local control over local data
• scalability
– can add new sites as required
• not reliant on network or server for day-to-day tasks
• each partition can have its own transaction log
– local reporting (access/sales)
Partitioned Database
Drawbacks:
• data inconsistency
– data held centrally may differ from the data held on a partition
– regular batch update required to maintain consistency
• unsuitable for certain applications
– where data changes at one node must be seen instantly by all nodes
▪ e.g. holiday bookings
• high network usage during update process
– will slow down other network processes
Partitioned Database
Two types:
• Horizontal partitioning
• Vertical partitioning
Horizontal partitioning
Example:
• branch offices deal mostly with a set of local customers
– e.g. Euston Road branch stores the fragment where contents of
the Branch field = 'Euston Road'
So:
• split the table into a number of smaller tables
– on the basis of rows (records)
▪ i.e. selected by the contents of a specific field
• each site (branch) stores just the table relevant to them
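Splitting a table by rows can be shown with a short Python sketch. The table contents below are invented for illustration (only the Boldmere branch name and the `Branch` field appear in the notes; `Branch B`, `Branch C`, `PropertyID` and `Price` are assumptions), and `partition_horizontally` is a hypothetical helper, not part of any database system.

```python
from collections import defaultdict

# Hypothetical estate agency table: one dictionary per row (record).
properties = [
    {"PropertyID": 1, "Branch": "Boldmere", "Price": 150000},
    {"PropertyID": 2, "Branch": "Branch B", "Price": 210000},
    {"PropertyID": 3, "Branch": "Boldmere", "Price": 185000},
    {"PropertyID": 4, "Branch": "Branch C", "Price": 99000},
]


def partition_horizontally(rows, field):
    """Split a table into fragments, one per distinct value of `field`.

    Every fragment keeps all the columns but only some of the rows.
    """
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[field]].append(row)
    return dict(fragments)


fragments = partition_horizontally(properties, "Branch")

# The Boldmere server would store only this fragment.
boldmere_rows = fragments["Boldmere"]
```

Each fragment has the same fields as the original table; together the fragments contain every record exactly once.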
Horizontal partitioning
• this table represents the database for an estate agency with 3 branches
[Table not reproduced in this transcript]
Horizontal partitioning
• the database is horizontally partitioned
– so that the data for each branch is stored on the server in that branch
– this will speed up local queries
▪ e.g. Boldmere staff searching for properties in Boldmere
Horizontal partitioning
• this means that the data is stored like this:
[Figure not reproduced in this transcript: the table split into one smaller table per branch]
Vertical partitioning
• data is separated across sites based upon fields
– dividing the table based on the different columns
• different columns of a table located at different sites
– e.g. stock descriptions
▪ item descriptions & prices at the sales outlet
▪ item’s country of origin & supplier name at Head Office
• each site can search locally for its own data
– can also perform a global search to find data stored at other sites
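The stock example above can be sketched in Python as follows. The sample rows, the field names (`ItemID`, `Description`, `Price`, `CountryOfOrigin`, `Supplier`) and the helpers `partition_vertically` and `global_search` are all assumptions made for illustration. The sketch also assumes that every fragment is stored against the key field, so that a global search can put a complete record back together.

```python
# Hypothetical stock table: one dictionary per row (record).
stock_table = [
    {"ItemID": 1, "Description": "Tea", "Price": 2.50,
     "CountryOfOrigin": "India", "Supplier": "Supplier A"},
    {"ItemID": 2, "Description": "Coffee", "Price": 3.75,
     "CountryOfOrigin": "Brazil", "Supplier": "Supplier B"},
]


def partition_vertically(rows, key, columns):
    """Build a fragment holding only `columns`, indexed by the key field."""
    return {row[key]: {column: row[column] for column in columns}
            for row in rows}


# Fragment stored at the sales outlet.
outlet_fragment = partition_vertically(
    stock_table, "ItemID", ["Description", "Price"])

# Fragment stored at Head Office.
head_office_fragment = partition_vertically(
    stock_table, "ItemID", ["CountryOfOrigin", "Supplier"])


def global_search(item_id):
    """Rebuild a full record by combining both fragments on the key."""
    return {
        "ItemID": item_id,
        **outlet_fragment[item_id],
        **head_office_fragment[item_id],
    }


local_result = outlet_fragment[1]    # local search at the sales outlet
full_record = global_search(1)       # needs data from both sites
```

The local search touches only the outlet's own columns, while the global search has to fetch the remaining columns from Head Office.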
Vertical partitioning
Advantages:
• faster searching
– reduced amount of data being sent between sites
• helps to conform to the Data Protection Act (DPA)
– personal information kept separate from sales records
• access rights
– ensures that only certain people see certain fields
– e.g. financial matters not revealed to all
Vertical partitioning
Disadvantages:
• regular backups essential since there is no data replication
• potential exists for inconsistency in the data stored
• complex & time consuming to set up and modify