Transcript LN7

CPT-S 580-06
Advanced Databases
Yinghui Wu
EME 49
1
Temporal databases
2
Temporality in Databases
 Modelling temporal data
 Temporal operations
 Temporal integrity constraints
 Temporal Queries
3
Modelling temporal data
4
From Data to Big Data to Long Data
 Data: current raw facts
 Temporal data: Data anchored at a time
 Big data: Gartner’s 3V’s
– High volume
– High velocity
– High variety
 Long data: Data with a “massive historical sweep”
5
Temporal DBs – Motivation
 Conventional databases represent the state of an
enterprise at a single moment of time
 Many applications need information about the past
– Financial (payroll)
– Medical (patient history)
– Government
 Temporal DBs: a system that manages time
varying data
Comparison
 Conventional DBs:
–
–
–
–
Evolve through transactions from one state to the next
Changes are viewed as modifications to the state
No information about the past
Snapshot of the enterprise
 Temporal DBs:
– Maintain historical information
– Changes are viewed as additions to the information stored in
the database
– Incorporate notion of time in the system
– Efficient access to past states
Time
 Time points: <t2, e1>, <t3, e2>, …
 Intervals: <(t2, t5), e1>, <(t5, t6), e2>, <[t6, now], e3>
 Temporal element: union of intervals designating the
time of events
– <(t2, t5) UNION (t7, t9), e1>
 Three types of data
– stepwise constant: salary – valid for certain period
of time, changes stepwise
– Valid as time instance – sales
8
– Continuous data - voice
Relations as Cubes
A complete history
9
Taxonomy of time
 Valid time
– Time, a data value/fact becomes effective –Valid
 Transaction time
– Transaction time is the time when a fact is stored in the
database
 User defined time:
– A user defined and interpreted time
Example
 Sales example: data about sales are stored
at the end of the day
 Transaction time is different than valid time
 Valid time can refer to the future also!
– Credit card: 03/01-04/06
Time in relation
 Example of a temporal relation:
Temporal DBMS
13
Types of Temporal databases
 Snapshot Databases
– A database with no time support
 Valid Time Database (historical database)
– Added valid time dimension (current and past
snapshots)
 Transaction time database (rollback, immutable)
– Only new insertions; added transaction time
 Bi-temporal databases: validation and transaction
time
14
Transaction Time DBs
 Time evolves discretely, usually is associated with
the transaction number:
T1 -> T2 -> T3 -> T4 ….
 A record R is extended with an interval [t.start, t.end).
When we insert an object at t1 the temporal attributes
are updated -> [t1, now)
 Updates can be made only to the current state!
– Past cannot be changed
– “Rollback” characteristics
Transaction Time DBs
 Deletion is logical (never physical deletions!)
– When an object is deleted at t2, its temporal attribute
changes from [t1, now)  [t1, t.t2) (lifetime)
– Object is “alive” from insertion to deletion time, ex. t1 to t2. If
“now” then the object is still alive
eid
salary start
end
10
20K
9/14
10/15
20
50K
4/14
*
33
30K
5/14
6/15
10
50K
1/14
*
time
Transaction Time DBs
1 2
4
8
10
15 16 17
25
28
30
33
41 42
45
u
b
f
c
d
g
id
p
j
k
i
m
e
Database evolves through insertions and deletions
47 48
51
53
Transaction Time DBs
 Requirements for index methods:
– Store past logical states
– Support addition/deletion/modification changes on the
objects of the current state
– Efficiently access and query any database state
Transaction Time DBs
 Queries:
– Timestamp (timeslice) queries: ex. “Give me all
employees at 05/94”
– Range-time slice: “Find all employees with id
between 100 and 200 that worked in the company
on 05/94”
– Interval (period) queries: “Find all employees with
id in [100,200] from 05/94 to 06/96”
Valid Time DBs
 Time evolves continuously
 Each object is a line segment representing its time
span (eg. Credit card valid time)
 Support full operations on interval data:
– Deletion at any time
– Insertion at any time
– Value change (modification) at any time (no
ordering)
Valid Time DBs
 Deletion is physical:
– No way to know about the previous states of
intervals
 The notion of “future”, “present” and “past” is relative
to a certain timestamp t
Valid Time DBs
 Requirements for an Index method:
– Store the latest collection of intervalobjects
– Support add/del/mod changes to this
collection
– Efficiently query the intervals in the
collection
• Timestamp query
• Interval (period) query
Bitemporal DBs
 A transaction-time Database, but each record
is an interval (plus the other attributes of the
record)
 Keeping the evolution of a dynamic collection
of interval-objects
 At each timestamp, it is a valid time database
Bitemporal DBs
C(t1)
t1
Ix
t3
t2
v
Iy
C(t3)
C(t2)
Iy
Ix
Iz
Iy
Ix
Iz
Iw
t
t5
t4
v
v
C(t5)
C(t4)
v
Iy
v
Iy
Iw
Ix
Iw
Ix
Bitemporal DBs
 Requirements for access methods:
– Store past/logical states of collections of objects
– Support add/del/mod of interval objects of the
current logical state
– Efficient query answering
Temporal Queries
26
Temporal Queries
 Snapshot query: Query at time instant
 Temporal query: Query at every time instant
 Time travel: Going back to a previous time
instant to view the database
 Temporal integrity constraints
27
Temporal integrity constraints
 Single state integrity constraints
 Multi-state integrity constraints
– Synchronous multi-state integrity constraint
• Salary should be larger than X – all states
– Asynchronous multi-state integrity constraint
• Salary is non-decreasing
28
Time Specification in SQL-92
 date: four digits for the year (1--9999), two digits for the month
(1--12), and two digits for the date (1--31).
 time: two digits for the hour, two digits for the minute, and two
digits for the second, plus optional fractional digits.
 timestamp: the fields of date and time, with six fractional digits
for the seconds field.
 Times are specified in the Universal Coordinated Time,
abbreviated UTC (from the French); supports time with time
zone.
 interval: refers to a period of time (e.g., 2 days and 5 hours),
29
without specifying a particular time when this period starts; could
more accurately be termed a span.
Temporal Query Languages
 Predicates precedes, overlaps, and contains on time intervals.
 Intersect can be applied on two intervals, to give a single (possibly
empty) interval; the union of two intervals may or may not be a single
interval.
 A snapshot of a temporal relation at time t consists of the tuples that
are valid at time t, with the time-interval attributes projected out.
 Temporal selection: involves time attributes
 Temporal projection: the tuples in the projection inherit their time-
intervals from the tuples in the original relation.
 Temporal join: the time-interval of a tuple in the result is the
intersection of the time-intervals of the tuples from which it is derived. It
intersection is empty, tuple is discarded from join.
30
Temporal Query Languages (Cont.)
 Functional dependencies must be used with care: adding a time
field may invalidate functional dependency

31
A temporal functional dependency x  Y holds on a relation
schema R if, for all legal instances r of R, all snapshots of r
satisfy the functional dependency X Y.
Temporal projection
 Temporal projection is similar to standard projection,
except that the restriction applies to only the nontemporal attributes. Both timestamp columns cannot
be excluded in the resultant history.
 After temporal projection, folding is enforced in order
that adjoining intervals should be merged into a
single interval in the resultant relation.
32
Temporal selection
 Adds the following new construct to standard SQL:
selection based on temporal comparisons of
timepoints and intervals using terms in a WHEN
clause.
 The WHEN clause is used to express the temporal
part of a query.
 The temporal comparison in the WHEN clause has
the following form:
WHEN a interval_compare_operator b
where a,b are intervals and
interval_compare_operator can be one of the
keywords: BEFORE, AFTER, DURING,
EQUIVALENT, ADJACENT, OVERLAPS,
PRECEDES, and FOLLOWS.
33
Temporal join
 This join has the most special semantics: the valid-
time intervals of the resultant table are created
from the intersection of the overlapping valid-time
elements of the tables specified in the join.
 Assumption: The valid time component in each
temporal table must be well-defined before
performing such joins.
34
Temporal join
 To perform joining two temporal tables:
– first assemble the non-temporal columns. The
columns are assembled by generating the cross
product of the non-temporal columns from the
operand tables, and then excluding rows that do not
satisfy the conditions in the WHERE and WHEN.
– Then, examine the source tuples for each candidate
tuple in the reduced cross product to see if their valid
time periods overlaps.
- If they overlap, the candidate tuple is included and
the result of the intersection of two valid time periods is
used as the valid time period of the new tuple.
35
- otherwise excluded from the result of the join.
An example of temporal join
PROBLEMLIST
Patient
Problem
TS
TE
---------------------------------------------------------------------J. Smith
P1
14/Feb/2015
1/Mar/2015
J. Smith
P2
10/Mar/2015
Now
P. Jones
P3
1/Apr/2015
12/May/2015
R. Franks
P3
13/Feb/2015
1/Jun/2015
DRUGS
Patient
Drug
VS
VE
----------------------------------------------------------------------J. Smith
D1
20/Mar/2015
12/May/2015
P. Jones
D1
1/Apr/2015
6/Jun/2015
R. Franks
D2
4/Feb/2015
14/May/2015
“Show all problem and drug combinations for patient”
36
TEMPORAL
SELECT T1.Patient, T1.Problem, T2.Drug
FROM PROBLEMLIST AS T1, DRUGS AS T2
WHERE T1.Patient = T2.Patient
The resultant table:
Patient Problem Drug
TS
TE
-----------------------------------------------------------------------------J. Smith P2
D1
20/Mar/2015
12/May/2015
P. Jones P2
D1
1/Apr/2015
12/May/2015
R. Franks P3
D2
13/Feb/2015
14/May/2015
37