Transcript LN7
CPT-S 580-06
Advanced Databases
Yinghui Wu
EME 49
1
Temporal databases
2
Temporality in Databases
Modelling temporal data
Temporal operations
Temporal integrity constraints
Temporal Queries
3
Modelling temporal data
4
From Data to Big Data to Long Data
Data: current raw facts
Temporal data: Data anchored at a time
Big data: Gartner’s 3V’s
– High volume
– High velocity
– High variety
Long data: Data with a “massive historical sweep”
5
Temporal DBs – Motivation
Conventional databases represent the state of an
enterprise at a single moment of time
Many applications need information about the past
– Financial (payroll)
– Medical (patient history)
– Government
Temporal DBs: a system that manages time
varying data
Comparison
Conventional DBs:
–
–
–
–
Evolve through transactions from one state to the next
Changes are viewed as modifications to the state
No information about the past
Snapshot of the enterprise
Temporal DBs:
– Maintain historical information
– Changes are viewed as additions to the information stored in
the database
– Incorporate notion of time in the system
– Efficient access to past states
Time
Time points: <t2, e1>, <t3, e2>, …
Intervals: <(t2, t5), e1>, <(t5, t6), e2>, <[t6, now], e3>
Temporal element: union of intervals designating the
time of events
– <(t2, t5) UNION (t7, t9), e1>
Three types of data
– stepwise constant: salary – valid for certain period
of time, changes stepwise
– Valid as time instance – sales
8
– Continuous data - voice
Relations as Cubes
A complete history
9
Taxonomy of time
Valid time
– Time, a data value/fact becomes effective –Valid
Transaction time
– Transaction time is the time when a fact is stored in the
database
User defined time:
– A user defined and interpreted time
Example
Sales example: data about sales are stored
at the end of the day
Transaction time is different than valid time
Valid time can refer to the future also!
– Credit card: 03/01-04/06
Time in relation
Example of a temporal relation:
Temporal DBMS
13
Types of Temporal databases
Snapshot Databases
– A database with no time support
Valid Time Database (historical database)
– Added valid time dimension (current and past
snapshots)
Transaction time database (rollback, immutable)
– Only new insertions; added transaction time
Bi-temporal databases: validation and transaction
time
14
Transaction Time DBs
Time evolves discretely, usually is associated with
the transaction number:
T1 -> T2 -> T3 -> T4 ….
A record R is extended with an interval [t.start, t.end).
When we insert an object at t1 the temporal attributes
are updated -> [t1, now)
Updates can be made only to the current state!
– Past cannot be changed
– “Rollback” characteristics
Transaction Time DBs
Deletion is logical (never physical deletions!)
– When an object is deleted at t2, its temporal attribute
changes from [t1, now) [t1, t.t2) (lifetime)
– Object is “alive” from insertion to deletion time, ex. t1 to t2. If
“now” then the object is still alive
eid
salary start
end
10
20K
9/14
10/15
20
50K
4/14
*
33
30K
5/14
6/15
10
50K
1/14
*
time
Transaction Time DBs
1 2
4
8
10
15 16 17
25
28
30
33
41 42
45
u
b
f
c
d
g
id
p
j
k
i
m
e
Database evolves through insertions and deletions
47 48
51
53
Transaction Time DBs
Requirements for index methods:
– Store past logical states
– Support addition/deletion/modification changes on the
objects of the current state
– Efficiently access and query any database state
Transaction Time DBs
Queries:
– Timestamp (timeslice) queries: ex. “Give me all
employees at 05/94”
– Range-time slice: “Find all employees with id
between 100 and 200 that worked in the company
on 05/94”
– Interval (period) queries: “Find all employees with
id in [100,200] from 05/94 to 06/96”
Valid Time DBs
Time evolves continuously
Each object is a line segment representing its time
span (eg. Credit card valid time)
Support full operations on interval data:
– Deletion at any time
– Insertion at any time
– Value change (modification) at any time (no
ordering)
Valid Time DBs
Deletion is physical:
– No way to know about the previous states of
intervals
The notion of “future”, “present” and “past” is relative
to a certain timestamp t
Valid Time DBs
Requirements for an Index method:
– Store the latest collection of intervalobjects
– Support add/del/mod changes to this
collection
– Efficiently query the intervals in the
collection
• Timestamp query
• Interval (period) query
Bitemporal DBs
A transaction-time Database, but each record
is an interval (plus the other attributes of the
record)
Keeping the evolution of a dynamic collection
of interval-objects
At each timestamp, it is a valid time database
Bitemporal DBs
C(t1)
t1
Ix
t3
t2
v
Iy
C(t3)
C(t2)
Iy
Ix
Iz
Iy
Ix
Iz
Iw
t
t5
t4
v
v
C(t5)
C(t4)
v
Iy
v
Iy
Iw
Ix
Iw
Ix
Bitemporal DBs
Requirements for access methods:
– Store past/logical states of collections of objects
– Support add/del/mod of interval objects of the
current logical state
– Efficient query answering
Temporal Queries
26
Temporal Queries
Snapshot query: Query at time instant
Temporal query: Query at every time instant
Time travel: Going back to a previous time
instant to view the database
Temporal integrity constraints
27
Temporal integrity constraints
Single state integrity constraints
Multi-state integrity constraints
– Synchronous multi-state integrity constraint
• Salary should be larger than X – all states
– Asynchronous multi-state integrity constraint
• Salary is non-decreasing
28
Time Specification in SQL-92
date: four digits for the year (1--9999), two digits for the month
(1--12), and two digits for the date (1--31).
time: two digits for the hour, two digits for the minute, and two
digits for the second, plus optional fractional digits.
timestamp: the fields of date and time, with six fractional digits
for the seconds field.
Times are specified in the Universal Coordinated Time,
abbreviated UTC (from the French); supports time with time
zone.
interval: refers to a period of time (e.g., 2 days and 5 hours),
29
without specifying a particular time when this period starts; could
more accurately be termed a span.
Temporal Query Languages
Predicates precedes, overlaps, and contains on time intervals.
Intersect can be applied on two intervals, to give a single (possibly
empty) interval; the union of two intervals may or may not be a single
interval.
A snapshot of a temporal relation at time t consists of the tuples that
are valid at time t, with the time-interval attributes projected out.
Temporal selection: involves time attributes
Temporal projection: the tuples in the projection inherit their time-
intervals from the tuples in the original relation.
Temporal join: the time-interval of a tuple in the result is the
intersection of the time-intervals of the tuples from which it is derived. It
intersection is empty, tuple is discarded from join.
30
Temporal Query Languages (Cont.)
Functional dependencies must be used with care: adding a time
field may invalidate functional dependency
31
A temporal functional dependency x Y holds on a relation
schema R if, for all legal instances r of R, all snapshots of r
satisfy the functional dependency X Y.
Temporal projection
Temporal projection is similar to standard projection,
except that the restriction applies to only the nontemporal attributes. Both timestamp columns cannot
be excluded in the resultant history.
After temporal projection, folding is enforced in order
that adjoining intervals should be merged into a
single interval in the resultant relation.
32
Temporal selection
Adds the following new construct to standard SQL:
selection based on temporal comparisons of
timepoints and intervals using terms in a WHEN
clause.
The WHEN clause is used to express the temporal
part of a query.
The temporal comparison in the WHEN clause has
the following form:
WHEN a interval_compare_operator b
where a,b are intervals and
interval_compare_operator can be one of the
keywords: BEFORE, AFTER, DURING,
EQUIVALENT, ADJACENT, OVERLAPS,
PRECEDES, and FOLLOWS.
33
Temporal join
This join has the most special semantics: the valid-
time intervals of the resultant table are created
from the intersection of the overlapping valid-time
elements of the tables specified in the join.
Assumption: The valid time component in each
temporal table must be well-defined before
performing such joins.
34
Temporal join
To perform joining two temporal tables:
– first assemble the non-temporal columns. The
columns are assembled by generating the cross
product of the non-temporal columns from the
operand tables, and then excluding rows that do not
satisfy the conditions in the WHERE and WHEN.
– Then, examine the source tuples for each candidate
tuple in the reduced cross product to see if their valid
time periods overlaps.
- If they overlap, the candidate tuple is included and
the result of the intersection of two valid time periods is
used as the valid time period of the new tuple.
35
- otherwise excluded from the result of the join.
An example of temporal join
PROBLEMLIST
Patient
Problem
TS
TE
---------------------------------------------------------------------J. Smith
P1
14/Feb/2015
1/Mar/2015
J. Smith
P2
10/Mar/2015
Now
P. Jones
P3
1/Apr/2015
12/May/2015
R. Franks
P3
13/Feb/2015
1/Jun/2015
DRUGS
Patient
Drug
VS
VE
----------------------------------------------------------------------J. Smith
D1
20/Mar/2015
12/May/2015
P. Jones
D1
1/Apr/2015
6/Jun/2015
R. Franks
D2
4/Feb/2015
14/May/2015
“Show all problem and drug combinations for patient”
36
TEMPORAL
SELECT T1.Patient, T1.Problem, T2.Drug
FROM PROBLEMLIST AS T1, DRUGS AS T2
WHERE T1.Patient = T2.Patient
The resultant table:
Patient Problem Drug
TS
TE
-----------------------------------------------------------------------------J. Smith P2
D1
20/Mar/2015
12/May/2015
P. Jones P2
D1
1/Apr/2015
12/May/2015
R. Franks P3
D2
13/Feb/2015
14/May/2015
37