Guest lecture on MSc course - Product Design at the University of

Download Report

Transcript Guest lecture on MSc course - Product Design at the University of

Databases in Context
Wendy Moncur
Department of Computing Science,
University of Aberdeen
Databases in Context

Database design in a major bank

Database management

6000-table Personnel database
Who am I ?

Wendy Moncur

DataBase Administrator (DBA) at one of UK’s largest banks.

Designed databases for high performance & availability.

19 years industry experience.

Platform: DB2 & SQL

Largest database: 6000 tables
Why listen?

DBA Average Minimum Salary £41,896

DBA Average Maximum Salary £47,147

Source: http://www.itjobswatch.co.uk
What does a DBA do?

Database design & optimisation

Quality assurance of SQL

Performance management

Database administration
Case study: the monster database
Case study: the monster database
• 6000+ tables
• 18000+ indexes
Part1: Challenges






“One size fits all”
External supplier
6000+ tables
18000+ indexes
1 tablespace
Short timescale
Challenges: “one size fits all”?

One size does not fit all.

Performance of SQL statements dependent on:

Database design
Index design

The DATA

Challenges: “one size fits all”?

Every company has different requirements.

Customers demand high performance... and control the budget.

Service Level Agreements (SLAs) dictate …





Minimum transaction speed
Number of concurrent users
Number of remote locations
Daily system availability
Database must be tailored to achieve site-specific SLAs.
Challenges: external supplier

Software package & database from external supplier.

Cannot change this.
Challenges: 6,000+ tables

Cannot change tables: no denormalisation allowed.

Supplied program code demands these tables exist.

Cannot change supplied program code unless essential.
Challenges: 18,000+ indexes

Can change indexes:

Unique indexes

Clustering indexes

Secondary indexes
Unique index

Defines what makes a row unique.

Components of the index cannot be changed.

Order of components can be changed.
Unique index
E.g. – for Table “EMPLOYEE”
Unique index = DateOfBirth, Firstname, Surname.
Most queries ask for data where only Surname, Firstname are known.
SELECT Surname, Firstname, DateOfBirth
From Employee
Where Surname = “Jenkins” And Firstname = “Malcolm” ;
Recommendation: Change order of unique index to
Surname, Firstname, DateOfBirth.
Clustering indexes

Defines the physical order in which rows of data
should be stored.

Components of the index can be changed.

Order of components can be changed.
Clustering indexes
E.g. – Table “EMPLOYEE”
Clustering index = DateOfBirth
Yet most queries order by EmploymentStartDate
SELECT EmploymentStartDate, Surname, Firstname
From Employee
Where Surname = “Jenkins” And Firstname = “Malcolm” ;
Order by EmploymentStartDate;
Recommendation: Change clustering index to use
EmploymentStartDate.
Secondary indexes

Not unique.

Do not dictate how the data is to be held.

Created to improve performance of queries and updates.

Increases cost of insert and update, as must be created and maintained along
with the table.
Recommendation: Drop superfluous secondary indexes.
Challenges: One tablespace
Many tablespaces


Create many new tablespaces.
Split the tables between them, according to table function.
Challenge: Short timescale

At least 4 test environments:
Vanilla

Unit test
System test
96,000 objects!
((6,000 tables + 18,000 indexes) * 4 environments)

3 months
Pre-live
Tools

Use tools to…

Check performance of each SQL statement

Manage change process
Check performance

“EXPLAIN”

Evaluates route to data for every SQL statement.

Identifies what indexes are used

Doesn’t identify redundant indexes

Doesn’t identify indexes that need to be changed.
Manage change process

Rigorous control needed

Achieved through…




Consistent naming standards
Detailed record of every change
Consistent route through environments, no short cuts
DBA tools
Part1: Recap of challenges
Can’t change:

“One size fits all”
 External supplier

6000+ tables

Can change:

18000+ indexes

1 tablespace

Short timescale
Part2: The Production Database

Does it perform?

Can the right people use it?

If disaster strikes, can the data be recovered?
Does the database perform?

Database performance monitored against Service Level
Agreements (SLAs).

Regular health checks carried out:



Data stored in sequence?
Enough space?
If sub-standard performance, further database design work
done.
Can the right people access the data?
PERSONNEL
database
Can the right people access the data?
PERSONNEL
database
Personnel team
Query & update data at
individual or regional level
Can the right people access the data?
DBA
Backup/ restore data
Reorganise data
Change database definitions
Update statistics on data
PERSONNEL
database
Personnel team
Query & update data at
individual or regional level
Can the right people access the data?
DBA
Chief executive
Backup/ restore data
Employee statistics
Reorganise data
Change database definitions
Update statistics on data
PERSONNEL
database
Personnel team
Query & update data at
individual or regional level
Can the right people access the data?
DBA
Chief executive
Backup/ restore data
Employee statistics
Reorganise data
Change database definitions
Update statistics on data
PERSONNEL
database
Staff member
Their own data
Personnel team
Query & update data at
individual or regional level
Can the right people use the database?

Different people, different information needs.

Sensitive data – salary, health, discipline…

Solution


VIEWS
Transaction Management
If disaster strikes,
can the data be recovered?

Robust backup & recovery strategies for:


Hardware failure
Software failure
Part2: Recap of Production
Database issues

Database must perform to acceptable level.

Only the right people should have access to any data item.

No matter what, the data must be recoverable.