Lecture Note 10

Download Report

Transcript Lecture Note 10

ITEC 3220M
Using and Designing Database Systems
Instructor: Prof. Z.Yang
Course Website:
http://people.math.yorku.ca/~zyang/ite
c3220m.htm
Office: TEL 3049
Concurrency Control
with Time Stamping Methods
• Assigns a global unique time stamp to each
transaction
• Produces an explicit order in which transactions
are submitted to the DBMS
• Uniqueness
– Ensures that no equal time stamp values can exist
• Monotonicity
– Ensures that time stamp values always increase
2
Wait/Die and Wound/Wait
Schemes
• Wait/die
– Older transaction waits and the younger is
rolled back and rescheduled
• Wound/wait
– Older transaction rolls back the younger
transaction and reschedules it
3
Wait/Die and Wound/Wait
Concurrency Control Schemes
4
Example
T1
T2
R(A)
W(A)
R(B)
R(C)
W(B)
5
Concurrency Control
with Optimistic Methods
• Optimistic approach
– Based on the assumption that the majority
of database operations do not conflict
– Does not require locking or time stamping
techniques
– Transaction is executed without
restrictions until it is committed
– Phases are read, validation, and write
6
Better Performance than Locking
7
Example
T1
T2
R(A)
W(A)
R(B)
R(B)
commit
8
Database Recovery Management
• Database recovery
– Restores database from a given state, usually
inconsistent, to a previously consistent state
– Based on the atomic transaction property
• All portions of the transaction must be treated as a
single logical unit of work, in which all operations must
be applied and completed to produce a consistent
database
– If transaction operation cannot be completed,
transaction must be aborted, and any changes to the
database must be rolled back (undone)
9
Transaction Recovery
• Deferred write
– Transaction operations do not immediately update
the physical database
– Only the transaction log is updated
– Database is physically updated only after the
transaction reaches its commit point using the
transaction log information
• Write-through
– Database is immediately updated by transaction
operations during the transaction’s execution, even
before the transaction reaches its commit point
10
Example
• Describe the restart work if transaction T1
committed after the checkpoint but prior to
the failure. Assume that the recovery manager
uses
– the deferred update approach
– the write though approach
Backup
Checkpoint
Failure
T1
11
Review
• Transaction property
• Transaction log
• Potential problems in multiuser
environments
• Different locking methods and how they
work
• Database recovery management
12
Chapter 13
The Data Warehouse
Transaction Processing Versus
Decision Support
• Transaction processing allows organizations
to conduct daily business in an efficient
manner
– Operational database
• Decision support helps management provide
medium-term and long-term direction for an
organization
14
Decision Support System (DSS)
Components
15
Operational vs. Decision Support
Data
• Operational data
– Relational, normalized database
– Optimized to support transactions
– Real time updates
• DSS
– Snapshot of operational data
– Summarized
– Large amounts of data
• Data analyst viewpoint
– Timespan
– Granularity
– Dimensionality
16
The DSS Database Requirements
• Database schema
– Support complex (non-normalized) data
– Extract multidimensional time slices
• Data extraction and filtering
• End-user analytical interface
• Database size
– Very large databases (VLDBs)
– Contains redundant and duplicated data
17
Data Warehouse
• Integrated
– Centralized
– Holds data retrieved from entire organization
• Subject-Oriented
– Optimized to give answers to diverse questions
– Used by all functional areas
• Time Variant
– Flow of data through time
– Projected data
• Non-Volatile
– Data never removed
– Always growing
18
Data Marts
• Single-subject data warehouse subset
• Decision support to small group
• Can be tested for exploring potential
benefits of Data warehouses
• Address local or departmental problems
19
Data Warehouse Versus Data Mart
20
Star Schema
• Data-modeling technique
• Maps multidimensional decision support into
relational database
• Yield model for multidimensional data analysis
while preserving relational structure of
operational DB
• Four Components:
–
–
–
–
Facts
Dimensions
Attributes
Attribute hierarchies
21
Simple Star Schema
22
Slice and Dice View of Sales
23
Star Schema Representation
• Facts and dimensions represented by
physical tables in data warehouse DB
• Fact table related to each dimension table
(M:1)
• Fact and dimension tables related by
foreign keys
• Subject to the primary/foreign key
constraints
24
Star Schema for Sales
25
Example
Canadian financial organization is interested in building a data
warehouse to analyze customers’ credit payments over time, location
where the payments were made, customers, and types of credit
cards. A customer may use the credit card to make a payment in
different locations across the country and abroad. If a payment is
made abroad it can be based on domestic currency and then
converted into Canadian dollars based on currency rate.
•Time is described by Time_ID, day, month, quarter and year.
•Location is presented by Location_ID, name of the organization
billing the customer, city and country where the organization is
located, domestic currency.
•A credit card is described by credit card number, type of the credit
account, and customer’s credit rate. The customer’s rate depends on
the type of the credit account.
•A customer is described by ID, name, address, and phone.
26
Performance-Improving Techniques
for Star Schema
• Normalization of dimensional tables
• Multiple fact tables representing
different aggregation levels
• Denormalization of the fact tables
• Table partitioning and replication
27
Normalization Example
• Normalize the star schema that you developed
for Canadian financial organization on page
26 into 3NF.
28
More Example
A supermarket chain is interested in building a data
warehouse to analyze the sales of different products in
different supermarkets at different times using different
payment method.
– Each supermarket is presented by location_ID, city,
country, and domestic currency.
– Time can be measured in time_ID, day, month,
quarter, and year.
– Each product is described by product_ID,
product_name, and vendor.
– Payment method is described by payment_ID,
payment_ type.
Design a star schema for this problem and then
normalize the star schema that you developed into 3NF.
29