Systems Analysis & Design with Databases

Download Report

Transcript Systems Analysis & Design with Databases

Information Retrieval and Use
De-normalisation and Distributed
database systems
Geoff Leese September 2008,
revised October 2009
1
Mapping the logical model
onto physical design

Entities become tables





2
More often than not!
Attributes become fields (columns)
Unique identifiers become primary keys
Relationships implemented by foreign key
columns
Resolve M:N relationships by inserting
intersection table
Mapping considerations
Independence
 Privacy
 Efficiency of queries

3
Denormalisation
Joins take time!
 Split or merge normalised entities
based on frequent associated use

 Remove
redundant relationships
 Merge entities with 1:1 relationships
 Use summary fields
 Use summary tables and views
4
Using summary field(1)

Consider running a query “give the
total value of all orders for customer
X”
How many joins?
5
Using summary field (2)

Note summary field in Orders table
How many joins now?
6
Distributed database systems

7
Special rules apply!
The traditional model
One centralised database
 Terminals at remote locations
 Disadvantages

 Networks
are slow (esp WANS!)
 Central machine does all processing
 If central machine fails, database is
down (Integrity, redundancy and
disaster recovery considered in later
lectures!)
8
The Client/Server model
Client – application – “front end”
 Server – DBMS – “back end”
 Still dependent on central database

9
Client responsibilities





10
Manages user interface
Accepts user data
Has local processing capability within the
application
Generates database requests and
transmits them via network to server
Receives results from server and formats
them as required by application
Server responsibilities
Accepts database requests from
client
 Processes database requests

 Handles
security issues
 Deals with concurrency issues
 Optimizes queries
 Handles recovery/rollback issues

11
Returns results to client
Distributed database architecture
A collection of logically related
“sites”, connected together so that
the users view is that of a single
database at a single location.
 Each site is a database in it’s own
right
 Not necessarily physically or
geographically separated, but often
are – and are logically separated.

12
Advantages
Organisations are distributed, why
shouldn’t their data be?
 Improved efficiency

 Store
13
data close to where it’s used
Types of DDS
Homogenous – same type of RDBMS
at each site (easy!)
 Heterogeneous – different types of
DBMS at each site (not so easy!)

14
Implementation methods (1)

Fragmentation – splitting data
between sites
– row based – e.g. store all
employee records for a location at that
location
 Vertical – column based – e.g. store all
payroll columns in payroll department,
all other employee data in HR
 Horizontal

15
Either way, fragments must be able
to be put back together!
Implementation methods (2)

Replication
 Controlled
duplication of data at more
than one site

16
Update propagation?
Objectives (1)

Local autonomy
 Local
data locally owned and managed
– minimal data requirements from
remote sites.
No reliance on central site
 Continuous operation

 Reliability
 Availability
17
Objectives (2)

Location independence
 From

user’s view, all data is at their site.
Fragmentation independence
 Needs
joins and unions to put
fragments back together

18
Replication independence
Objectives (3)
Distributed query processing
 Distributed transaction management

 Transactions
carried out by “agents” at
distributed sites
 Two-phase commit
 Locking issues (later lecture)
19
Objectives (4)
Hardware independence
 Operating system independence
 Network independence
 DBMS independence

20
DDS issues

Query processing
 Optimisation

even more important
Catalogue (data dictionary)
management
 Centralised?
 Fully
replicated?
 Partitioned?
 Combination of first and third?
21
DDS issues

Update propagation
 An
issue where replication is used.
 “Primary copy” system

Recovery
 Two-phase

Recovery
 Locking
22
commit
strategies
Summary





Mapping the logical model
Denormalisation
Traditional database architecture
Client/server model
Distributed Database systems




23
Advantages
Objectives
Implementation methods
Issues
Further reading
Rolland chapter 10
 Hoffer chapters 12


24
Denormalisation - click to follow the link!