Chapter 9 Physical Database Design Methodology

Download Report

Transcript Chapter 9 Physical Database Design Methodology

Chapter 16-17
Physical Database Design Methodology
•
•
•
•
•
Software & Hardware
Mapping Logical Design to DBMS
Physical Implementation
Security Implementation
Monitoring and Tuning DBMS
Software & Hardware
• Define user’s requirement
• Select software based on user’s requirement
• Select hardware based on software
requirement
Mapping Logical Design to DBMS
• Create base relations
–
–
–
–
–
–
Name of relation
Attributes (domain & defaults)
Primary key (not null)
Foreign key (referential integrity)
Alternative keys
Indexes
• Integrity rules
– Enterprise
– Referential
– Entity
Physical Implementation
• Efficiency measurement
– Throughput
– Response time
• Factors for efficiency measurement
–
–
–
–
–
Main memory
CPU
Disk I/O (os, database, index, recovery log)
Network (traffic, collisions)
File organization & access method
Physical Implementation
•
•
•
•
•
Analysis transactions
Choose file organizations
Add secondary indexes
Balance flexibility and performance
Estimate disk space
Analyze Transactions
• Frequency of the transaction - logical access map
– max
– avg
– min
• Access structure - attributes used in “where”
conditions
– Relational operations (join, select, project)
– Data operations (I, D, U, R)
• User response time requirement
– Day
– Time
File Organization Selection Criteria
• Heap
–
–
–
–
Bulk-loading
A few pages
Retrieve most record
Primary & secondary indexes
• Indexed sequential
– Exact key match
– Direct & sequential accesses
– Update deterioration
File Organization Selection Criteria
• Direct (hash or random)
–
–
–
–
Direct access
Primary and secondary indexes
Update deterioration
Not for pattern match or range of values
• B+ tree
–
–
–
–
Similar to indexed sequential
No overflow problem (deterioration)
Pattern match or range of values
Dynamic growth
Secondary Indexes
• Overhead consideration
– Spaces
– Update
– Query performance
• Selection criteria
– Do
• Primary key
• Heavily used attributes or foreign keys
– Don’t
•
•
•
•
Small relations
Frequently updated relations
Long character strings
Retrieve most records
Balance Flexibility and Performance
• Denormalization
– Low update rate
– High query rate
– Cross-referencing transactions and relations
table
• Steps
– Derived data
– Redundancy
Derived Data
• Storage costs
• Calculation costs
• Response time
Redundancy - Reduce Join
• Combining one-to-one relationship (company,
store)
• Duplicating nonkey attributes or foreign keys in
one-to-many relationship (po, supplier)
• Duplicating attributes in man-to-many relationship
(student, class)
• Introducing repeating group
– Static
– Maximum no.
– less than 12
Redundancy - Reduce Join
• Reference table (type, description)
• Creating extract tables or relations
– Multi-relations (static data and not current or accurate)
– Large amount of derived data
Estimate Disk Space
•
•
•
•
•
•
•
Software (DBMS)
Hardware (disk spaces)
Record size
Relation size
Storage area for indexes
System overhead
File organization
Security Implementation
• User views
• Access rules
– Authorization
– Privileges
Monitoring and Tuning DBMS
• Correct inappropriate design
• Implement new minor requirement
• Produce better Response time and
throughput
• Increase user’s satisfaction
Assignment
• Review chapter 5-6, 11-17, 24-26
• Read chapter 18
• Project (enhanced e-r & normalization)
– Due date:
• Project (SQL part)
– Due date: