04CM20145Lecture7 - Department of Computer Science
Download
Report
Transcript 04CM20145Lecture7 - Department of Computer Science
Dr Alwyn Barry
Dr Joanna Bryson
CM20145
Database Design
Lecture Plan
1. Basic Concepts
2. Data, Information & Knowledge
3. Data Models (The E-R Model)
4. The Relational Algebra
5. Introduction to SQL
6. Further SQL (Joins, RA Equivalences)
7. Database Design
8. Further DB Design – Normalisation
9. Architectures and Implementations
10. Integrity and Security
Lecture Plan
`Reading’ (PROJECT) Week
11. Ethics and Professional Conduct
12. Legal Issues
13. Transactions
14. Recovery
15. Concurrency Control
16. Storage and File Structure
17. Indexing and Hashing
18. Query Processing & Optimisation
January… Review Session
Overview
Database Design and Software
Engineering.
Example from real life.
Views and Users.
Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
What Is Relational Algebra For?
Fundamental Operators:
Unary
Binary
Fundamental
Restriction
Projection
Cartesian Product
Union
Difference
Useful Operators:
Not
Primitive
Join
Intersection
Division
File Problems DBs Should Solve
Data redundancy & inconsistency
Duplication of information; Different formats
Data isolation
Different formats; Different locations
Limited access
Need new program for each query
Cannot support “Business Rules”
Consistency
Validity
Atomicity
Failures leave data in inconsistent state; Two users
cannot update the database at once
Security
Needs bespoke security in each application
Database Design
DB design usually covers formal
methods for trying to ensure
databases meet all these goals.
No formal system is perfect:
Formal methods take a long time,
may be computationally intractable.
Human error: mistakes in proofs.
Formally correct databases can still
have errors (will be examples.)
Design is an art; but science can
help.
E-R Model – System Design
From Elmasri & Navathe, 2003, pg 51
Problem
domain
DBMS Dependentt
DBMS Independent
Requirements
Collection & Analysis
Functional Requirements
Data Requirements
Functional Analysis
Conceptual Design
High-level Transaction Spec.
Application Program
Design
Conceptual Schema
Logical Design
(Data model mapping)
Logical Schema
Transaction
Implementation
Physical Design
Internal Schema
Application Programs
Software Development Iterates
http://www.extremeprogramming.org/
Waterfall Feigning Iteration
This is not agile SE.
People don’t want to know, but
iteration happens.
The question is how well you cope.
Software Development Iterates
Models change as programmers
understand the problem better.
Requirements change as users
understand possibilities better.
More resources become available.
So:
Save and maintain all modelling and
planning tools.
Interact with users frequently.
Learn rules to recognize common
failures.
Overview
Database Design and Software
Engineering.
Example from real life.
Views and Users.
Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Where Do Models Come From?
Demand – What do the users
want to see?
Data – Don’t throw anything
away.
Design – Never store anything
twice.
Efficiency.
Integrity.
Clarity.
Security.
Automating Trading Operations
Automating Trading Operations
Data from:
trading cards,
market prices,
underlying
values.
Demand from:
Knowing what is
owned for
determining risk.
Knowing what
individuals did for
determining pay.
Entities and attributes
Trades
Instrument, trade,
price, 2 traders, 2
clearing firms.
Traders
Positions, trades.
Trading cards
Trades, trader,
number, time.
Instruments
Daily values,
volatilities, expiration
dates.
(draw on board)
New Requirement: Reconciling
Business process – dual entry:
Operations: primary concerns are
risk, execution on trading floor,
relations with individual traders.
Clearing: primary concerns are
accounting, banks, law, relations
with other clearing firms.
The same trading cards resulted
in different trade quantities.
The same trade has different
dates!
Models will change!
Design – Find out models are
inefficient or clumsy to maintain.
Data – Discover new categories,
salient values.
Demand – Users see new
potential.
Overview
Database Design and Software
Engineering.
Example from real life.
Views and Users.
Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Views: Why they’re important.
The same data may be seen by
different users in different ways.
Shorthand for frequent joins,
formulas – may be more efficient.
Automate / enforce security –
make access to tables and views
depend on user’s function.
Keep reports logically
independent from underlying
representation – protect the
users!
Why Limit / Protect Users?
Databases contain all
data for a company.
Limits on access eliminate
suspects for errors, crime.
Information overload.
Smart, authoritative users
still need to find things
quickly.
Some users really are
naïve.
Programmers are users.
Overview
Database Design and Software
Engineering.
Example from real life.
Views and Users.
Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Software Development Iterates
Models change as programmers
understand the problem better.
Requirements change as users
understand possibilities better.
More resources become available.
So:
Save and maintain all modelling and
planning tools.
Interact with users frequently.
Learn rules to recognize common
failures.
Software Development Iterates
Models change as programmers
understand the problem better.
Requirements change as users
understand possibilities better.
More resources become available.
So:
Save and maintain all modelling and
planning tools.
Interact with users frequently.
Learn rules to recognize common
failures.
Pitfalls in Relational DB Design
A bad design may lead to:
redundant information,
difficulty in representing certain
information, or
difficulty in checking integrity
constraints.
Design goals:
Avoid redundant data.
Ensure that relationships among
attributes are represented.
Facilitate the checking of updates
for violation of integrity constraints.
©Silberschatz, Korth and Sudarshan
Modifications & additions by S Bird, Melbourne
Example of Bad Design
Consider the relation schema:
Lending-schema = (branch-name,branch-city,assets,
customer-name,loan-number,amount)
Redundant Information:
Data for branch-name, branch-city, assets are
repeated for each loan that a branch makes.
Wastes space and complicates updates, introducing
possibility of inconsistency of assets value.
Difficulty representing certain information:
Cannot store info. about a branch if no loans exist.
Can use null values, but they are difficult to handle.
Solution: Decomposition
Break up redundant tables into
multiple tables - this operation is
called decomposition.
E.g. Lending-schema = (branchname, branch-city, assets, customername, loan-number, amount)
Branch-schema = (branch-name,
branch-city,assets)
Loan-info-schema = (customer-name,
loan-number, branch-name, amount)
Lossless-Join Decomposition:
Want to ensure that the original
data is recoverable.
1.All attributes of the original schema
(R) must appear in the decomposition
(R1, R2), i.e. R = R1 R2
2.Decomposition must be a lossless-join
decomposition.
Definition: R1,R2 is a lossless-join
decomposition of R
if, for all possible relations r(R)
r = R1 (r) ⋈ R2 (r)
Bad Decomposition Example
A Non Lossless-Join Decomposition
R = (A, B) R1 = (A), R2 = (B)
A B
A
B
A
B
1
2
A(r)
B(r)
1
2
1
2
1
2
1
r
A (r) ⋈ B (r)
Thus, r is different to A (r) ⋈ B (r)
So A,B is not a lossless-join
decomposition of R.
Overview
Database Design and Software
Engineering.
Example from real life.
Views and Users.
Redundancy and
Decomposition.
Loosely based on A. Keller, 2002, USCU
Summary
Database design is an ongoing,
iterative process.
Requirements come from data, user
demands, design issues.
Change occurs:
Corporations & technologies grow.
Programmers & users learn.
Views / security.
Lossless-join decomposition
Next: Science for improving design.
Reading & Exercises
Reading
Connolly & Begg Chapter 9, (13, 14)
Silberschatz Chapter 7.
Much of 7 will be in the next lecture!
Exercises:
C&B: 9.2, 9.3/11, 9.9, 9.10
Silberschatz:7.1, 7.2, 7.16
These need functional dependencies,
which are covered next lecture.