PanSTARRS Seminar - UH Institute for Astronomy
Relational Databases
Narayan Raja
Pan-STARRS Seminar
Introduction to Databases
Why databases?
Requirements of a database system
Data Models
Relational data model
“Object” data model
Relational vs. Object
Questions
Why Databases
Astronomer’s perspective – just use files!
Existing issues:
Scalability ?
Performance ?
Ad-hoc queries ?
Security ?
Concurrency ?
Integrity ?
Therefore – use databases
Requirements of a Database
Management System (DBMS)
Data Integrity
Scalability
Security
Performance
Transaction capability
Atomicity
Consistency
Isolation
Durability
“ACID” test of a database
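Atomicity, the first of the four ACID properties, can be sketched with Python's built-in sqlite3 module. The table name, column names and values below are hypothetical and chosen only for illustration; the point is that a batch which fails part-way leaves no partial result behind.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (obs_id INTEGER PRIMARY KEY, mag REAL NOT NULL)"
)
conn.execute("INSERT INTO observations VALUES (1, 12.5)")
conn.commit()


def insert_batch(conn, rows):
    """Insert all rows or none of them (atomicity)."""
    try:
        # As a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.executemany("INSERT INTO observations VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


# The second row reuses primary key 1, so the whole batch is rolled back,
# including the first row, which was valid on its own.
ok = insert_batch(conn, [(2, 13.1), (1, 99.0)])
count = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
print(ok, count)
```

With plain files, the equivalent guarantee would have to be hand-written by every program that touches the data.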
Data Models
Network
Historical interest only, though maybe not
Hierarchical
Historical interest only, though maybe not
Relational
Dominant kind today
Object
Emerging / controversial for 10 years
“Object-Relational”: a compromise
Relational Data Model
Practical Terms
Tables, nothing but tables
No pointers
Specific Integrity Constraints
• Unique Rows
• Primary & Unique Keys
• Foreign Keys
Specific Operators only
• “Select”
• “Project”
• “Join”
• etc
SQL!
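The constraints and operators listed above can be tried out in SQL through Python's built-in sqlite3 module. This is a minimal sketch: the stars and measurements tables, their columns and their rows are hypothetical, not taken from any real catalogue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE stars (
    star_id   TEXT PRIMARY KEY,          -- primary key: no duplicate stars
    star_type TEXT NOT NULL
);
CREATE TABLE measurements (
    meas_id INTEGER PRIMARY KEY,
    star_id TEXT NOT NULL REFERENCES stars(star_id),  -- foreign key
    band    TEXT NOT NULL
);
INSERT INTO stars VALUES ('S1', 'G2V'), ('S2', 'M5V');
INSERT INTO measurements VALUES (1, 'S1', 'V'), (2, 'S1', 'R'), (3, 'S2', 'V');
""")

# "Select": keep only the rows that satisfy a condition.
selected = conn.execute(
    "SELECT * FROM measurements WHERE band = 'V' ORDER BY meas_id"
).fetchall()

# "Project": keep only some columns (DISTINCT removes duplicate rows).
projected = conn.execute(
    "SELECT DISTINCT band FROM measurements ORDER BY band"
).fetchall()

# "Join": combine rows of two tables that agree on a shared key.
joined = conn.execute("""
    SELECT m.meas_id, s.star_type
    FROM measurements m JOIN stars s ON m.star_id = s.star_id
    ORDER BY m.meas_id
""").fetchall()

# The foreign key rejects a measurement of a star that does not exist.
try:
    conn.execute("INSERT INTO measurements VALUES (4, 'NoSuchStar', 'V')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(selected, projected, joined, fk_enforced)
```

Note that no pointers appear anywhere: rows in different tables are linked purely by matching values.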
Relational Data Model
Theoretical Terms
Mathematically based
•Set theory
•Predicate Calculus
A “Table” is nothing but a particular kind of Set,
i.e. a “Relation” (hence the term “relational”)
Example: Stars
STAR-ID            Type   Luminosity   Distance
Alpha Centauri A   G2V    Null         4.22
Sirius B           A2-S   Null         8.6
More about Relations
A Relation is a Set of true assertions about the world.
The Header of a Relation, e.g. “Stars”, is a predicate
function (i.e. a truth-valued function):
“STAR-ID is of type TYPE, has luminosity
LUMINOSITY and is at distance DISTANCE.”
Each “tuple” (row of the table) is an instantiation of the
above Predicate function.
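The idea that a relation is a set of tuples, each one an instantiation of the header's predicate, can be sketched in plain Python. The Star type and the assertion function are hypothetical names introduced here; the rows reuse the values from the Stars example, with None standing in for Null.

```python
from typing import NamedTuple, Optional


class Star(NamedTuple):
    """Header of the relation: one field per column of the Stars table."""
    star_id: str
    star_type: str
    luminosity: Optional[float]  # None plays the role of Null
    distance: float


def assertion(t: Star) -> str:
    """Instantiate the header's predicate with the values of one tuple."""
    return (
        f"{t.star_id} is of type {t.star_type}, has luminosity "
        f"{t.luminosity} and is at distance {t.distance}."
    )


# A relation is a set: listing the same tuple twice adds nothing.
stars = {
    Star("Alpha Centauri A", "G2V", None, 4.22),
    Star("Sirius B", "A2-S", None, 8.6),
    Star("Alpha Centauri A", "G2V", None, 4.22),  # duplicate, collapses
}

for t in sorted(stars, key=lambda s: s.star_id):
    print(assertion(t))
```

Because the container is a set, the "Unique Rows" constraint of the relational model holds automatically in this sketch.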
More about Relations, cont’d
The operators provided are set-level (“Relational”)
operators, with the following important properties:
Closure (operands and result are all relations)
They take one or more consistent sets of truthful
assertions about the world, and produce as an
output, another set of truthful assertions.
They are at a high level of abstraction (“What” rather
than “how”)
Therefore ….
Relational operators can be nested to an arbitrary level of
complexity.
We can be confident about their output: true in
→ true out.
We leave it to the DBMS to figure out how exactly to
implement our (arbitrarily complex) Relational
expression efficiently. No need for pointer-chasing at the user level.
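Closure is what makes this nesting possible: every operator takes relations and returns a relation, so any result can be fed straight into another operator. The toy implementation below is only a sketch under stated assumptions. The helper names (row, select, project, join) and the sample data are hypothetical, a relation is modelled as a frozenset of rows, and no attempt is made at the efficiency a real DBMS would provide.

```python
def row(**attrs):
    """One tuple of a relation, stored as sorted (attribute, value) pairs."""
    return tuple(sorted(attrs.items()))


def select(rel, pred):
    """Keep the rows for which pred(row_as_dict) is true."""
    return frozenset(r for r in rel if pred(dict(r)))


def project(rel, *names):
    """Keep only the named attributes; duplicate rows collapse in the set."""
    return frozenset(tuple((k, v) for k, v in r if k in names) for r in rel)


def join(left, right):
    """Natural join: pair up rows that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[k] == rd[k] for k in ld.keys() & rd.keys()):
                out.add(tuple(sorted({**ld, **rd}.items())))
    return frozenset(out)


# Hypothetical sample relations.
observations = frozenset({
    row(obs_id=1, telescope="UH88", band="V"),
    row(obs_id=2, telescope="UH88", band="R"),
})
telescopes = frozenset({row(telescope="UH88", observatory="MKO")})

# Operators nest freely because each result is itself a relation.
result = project(
    select(join(observations, telescopes), lambda r: r["band"] == "V"),
    "obs_id", "observatory",
)
print(result)
```

The nested expression says what is wanted (V-band observations with their observatory), not how to loop over the data; in a real system that second part is the optimizer's job.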
Example: an observation at UH 88
Observations
Timestamp   Observatory   Telescope   Band   Observer   Magnitude   Error
2004Nov05   MKO           UH88        V      Meech      …           …

Observations-Data
OBS-ID   MAG   ERR
Plus several additional tables:
Observations Metadata
Telescopes
Observatories
Filters
Observers
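One way the single wide Observations row could be split across several tables, and then put back together with joins, is sketched below using Python's sqlite3 module. The exact split, the column names and the key choices are assumptions made for illustration; the slide does not spell them out. The magnitude and error are elided on the slide, so they are left as NULL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Hypothetical schema: only some of the tables named above are modelled.
conn.executescript("""
CREATE TABLE observatories (code TEXT PRIMARY KEY);
CREATE TABLE telescopes (
    name        TEXT PRIMARY KEY,
    observatory TEXT NOT NULL REFERENCES observatories(code)
);
CREATE TABLE observations_metadata (
    obs_id    INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    telescope TEXT NOT NULL REFERENCES telescopes(name),
    band      TEXT NOT NULL,
    observer  TEXT NOT NULL
);
CREATE TABLE observations_data (
    obs_id INTEGER PRIMARY KEY REFERENCES observations_metadata(obs_id),
    mag    REAL,   -- value elided on the slide, left NULL
    err    REAL    -- value elided on the slide, left NULL
);
INSERT INTO observatories VALUES ('MKO');
INSERT INTO telescopes VALUES ('UH88', 'MKO');
INSERT INTO observations_metadata VALUES (1, '2004Nov05', 'UH88', 'V', 'Meech');
INSERT INTO observations_data VALUES (1, NULL, NULL);
""")

# Joins rebuild the original wide row from the smaller tables.
wide = conn.execute("""
    SELECT m.timestamp, t.observatory, m.telescope, m.band, m.observer,
           d.mag, d.err
    FROM observations_metadata m
    JOIN telescopes t        ON t.name = m.telescope
    JOIN observations_data d ON d.obs_id = m.obs_id
""").fetchone()
print(wide)
```

In this layout each fact, such as which observatory hosts UH88, is stored once rather than repeated in every observation row.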
“Object” Data Model
Outgrowth of OOD/OOP
Encapsulation
Inheritance
Polymorphism
State of the art in programming
No “programmer impedance mismatch”
BUT:
Not mathematically based
No ad-hoc querying
Data integrity?
Relational vs. Object
                          Relational        Object
Data Integrity            Yes               ?
Ad-hoc queries            Yes               No
Querying/DML              SQL               Programming needed
Software infrastructure   Yes               So-so
Personnel                 Yes               Meagre
Maturity                  Very mature       Evolving
Market Share              $12 billion       $300 million
“Naturalness”             No! .. Or maybe   Yes!!