PanSTARRS Seminar - UH Institute for Astronomy


Relational Databases
Narayan Raja
Pan-STARRS Seminar
Introduction to Databases
• Why databases?
• Requirements of a database system
• Data Models
• Relational data model
• “Object” data model
• Relational vs. Object
• Questions
Why Databases
• Astronomer’s perspective – just use files!
• Existing issues with files:
  - Scalability?
  - Performance?
  - Ad-hoc queries?
  - Security?
  - Concurrency?
  - Integrity?
• Therefore – use databases
Requirements of a Database Management System (DBMS)
• Data Integrity
• Scalability
• Security
• Performance
• Transaction capability
  - Atomicity
  - Consistency
  - Isolation
  - Durability
• The “ACID” test of a database
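
To make the "Atomicity" item concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table `observations`, the helper `insert_batch`, and all row values are hypothetical and chosen only for illustration. It shows a batch insert that either applies completely or not at all.

```python
import sqlite3

# In-memory database; table name and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (obs_id INTEGER PRIMARY KEY, mag REAL NOT NULL)"
)
conn.execute("INSERT INTO observations VALUES (1, 12.5)")
conn.commit()


def insert_batch(connection, rows):
    """Insert all rows or none of them (atomicity)."""
    try:
        # The connection context manager commits on success and
        # rolls back the open transaction if an exception is raised.
        with connection:
            connection.executemany(
                "INSERT INTO observations VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The second row reuses obs_id 1, violating the primary key,
# so the first row of the batch is rolled back as well.
ok = insert_batch(conn, [(2, 13.1), (1, 14.0)])
count = conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]
print(ok, count)
```

After the failed batch the table still holds only the original row, which is the behavior the "A" in ACID asks for.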
Data Models
• Network
  - Historical interest only, though maybe not
• Hierarchical
  - Historical interest only, though maybe not
• Relational
  - Dominant kind today
• Object
  - Emerging / controversial for 10 years
• “Object-Relational”
  - A compromise
Relational Data Model
• Practical Terms
  - Tables, nothing but tables
  - No pointers
  - Specific integrity constraints
    • Unique rows
    • Primary and unique keys
    • Foreign keys
  - Specific operators only
    • “Select”
    • “Project”
    • “Join”
    • etc.
  - SQL!
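
The constraints and operators listed above can be sketched in SQL through Python's built-in `sqlite3` module. The tables `targets` and `exposures` and every value in them are hypothetical placeholders, not part of the seminar. The sketch declares primary, unique, and foreign keys, then runs one query each for "select" (filter rows), "project" (pick columns), and "join".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE targets (
    target_id TEXT PRIMARY KEY,          -- primary key
    name      TEXT NOT NULL UNIQUE       -- unique key
);
CREATE TABLE exposures (
    exposure_id INTEGER PRIMARY KEY,
    target_id   TEXT NOT NULL REFERENCES targets(target_id),  -- foreign key
    band        TEXT NOT NULL
);
INSERT INTO targets VALUES ('T1', 'Target one'), ('T2', 'Target two');
INSERT INTO exposures VALUES (1, 'T1', 'V'), (2, 'T1', 'R'), (3, 'T2', 'V');
""")

# "Select": keep only the rows that satisfy a condition.
selected = conn.execute(
    "SELECT * FROM exposures WHERE band = 'V' ORDER BY exposure_id"
).fetchall()

# "Project": keep only some columns (DISTINCT keeps the result a set).
projected = conn.execute(
    "SELECT DISTINCT band FROM exposures ORDER BY band"
).fetchall()

# "Join": combine rows of two tables on a shared attribute.
joined = conn.execute("""
    SELECT e.exposure_id, t.name
    FROM exposures AS e
    JOIN targets AS t ON t.target_id = e.target_id
    ORDER BY e.exposure_id
""").fetchall()

# A row pointing at a non-existent target violates the foreign key.
try:
    conn.execute("INSERT INTO exposures VALUES (4, 'T9', 'V')")
    fk_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    fk_rejected = True

print(selected, projected, joined, fk_rejected)
```

Note that no row is ever reached by following a pointer; every connection between tables is expressed by matching values.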
Relational Data Model
• Theoretical Terms
  - Mathematically based
    • Set theory
    • Predicate calculus
  - A “Table” is nothing but a particular kind of Set, i.e. a “Relation” (hence the term “relational”)
Example: Stars

STAR-ID           Type  Luminosity  Distance
Alpha Centauri A  G2V   Null        4.22
Sirius B          A2-S  Null        8.6
More about Relations
• A Relation is a Set of true assertions about the world.
• The Header of a Relation, e.g. “Stars”, is a predicate function (i.e. a truth-valued function):
  - “STAR-ID is of type TYPE, has luminosity LUMINOSITY, and is at distance DISTANCE.”
• Each “tuple” (row of the table) is an instantiation of the above predicate function.
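
As a rough illustration of "a relation is a set of tuples", the Stars example can be modelled directly with a Python `set`. The class `Star`, its field names, and the function `assertion` are assumptions made for this sketch; the rows are the two from the Stars table, with `None` standing in for Null.

```python
from typing import NamedTuple, Optional


class Star(NamedTuple):
    """Header of the relation: one field per column of the Stars table."""
    star_id: str
    star_type: str
    luminosity: Optional[float]  # None stands in for the table's Null
    distance: float


# The relation itself is a set of tuples.
stars = {
    Star("Alpha Centauri A", "G2V", None, 4.22),
    Star("Sirius B", "A2-S", None, 8.6),
}


def assertion(row: Star) -> str:
    """Instantiate the header's predicate for one tuple."""
    return (
        f"{row.star_id} is of type {row.star_type}, has luminosity "
        f"{row.luminosity} and is at distance {row.distance}."
    )


# Because a relation is a set, stating the same fact twice changes nothing.
size_before = len(stars)
stars.add(Star("Sirius B", "A2-S", None, 8.6))
size_after = len(stars)

assertions = sorted(assertion(row) for row in stars)
for line in assertions:
    print(line)
```

The duplicate insert leaving the size unchanged is the "unique rows" constraint falling out of set semantics for free.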
More about Relations (cont’d)
• The operators provided are set-level (“Relational”) operators, with the following important properties:
  - Closure (operands and result are all relations)
  - They take one or more consistent sets of truthful assertions about the world and produce, as output, another set of truthful assertions.
  - They are at a high level of abstraction (“what” rather than “how”)
Therefore…
• Relational operators can be nested to an arbitrary level of complexity.
• We can be confident about their output: true in → true out.
• We leave it to the DBMS to figure out exactly how to implement our (arbitrarily complex) Relational expression efficiently. No need for pointer-chasing at the user level.
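
Closure and nesting can be sketched with a toy implementation of three relational operators over Python frozensets. The function names (`relation`, `select`, `project`, `join`) and the sample rows are assumptions for illustration only; a real DBMS implements these operators very differently and far more efficiently.

```python
def relation(*rows):
    """Build a relation: a frozenset of rows, each a frozenset of (attr, value)."""
    return frozenset(frozenset(row.items()) for row in rows)


def select(rel, predicate):
    """Keep the rows for which predicate(row_as_dict) is true."""
    return frozenset(row for row in rel if predicate(dict(row)))


def project(rel, attrs):
    """Keep only the named attributes; duplicate rows collapse automatically."""
    return frozenset(
        frozenset((k, v) for k, v in row if k in attrs) for row in rel
    )


def join(left, right):
    """Natural join: merge rows that agree on all shared attributes."""
    out = set()
    for l_row in left:
        for r_row in right:
            l, r = dict(l_row), dict(r_row)
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.add(frozenset({**l, **r}.items()))
    return frozenset(out)


# Hypothetical sample data.
observations = relation(
    {"obs_id": 1, "telescope": "UH88", "band": "V"},
    {"obs_id": 2, "telescope": "UH88", "band": "R"},
)
telescopes = relation({"telescope": "UH88", "observatory": "MKO"})

# Every operator returns a relation, so calls nest freely (closure).
result = project(
    select(join(observations, telescopes), lambda row: row["band"] == "V"),
    {"obs_id", "observatory"},
)
print([dict(row) for row in result])
```

Because each function takes relations and returns a relation, the nested expression is well-formed at every level, and a result built only from true input rows contains only true rows.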
Example: an observation at UH 88
Observations

Timestamp  Observatory  Telescope  Band  Observer  Magnitude  Error
2004Nov05  MKO          UH88       V     Meech     …          …

Observations-Data

OBS-ID  MAG  ERR

• Plus several additional tables:
  - Observations Metadata
  - Telescopes
  - Observatories
  - Filters
  - Observers
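
One possible way to split the observation above across tables is sketched below with Python's built-in `sqlite3` module. The column names, key choices, and the `obs_id` value are assumptions, since the slide does not give a schema; the Filters and Observers tables are omitted for brevity. The magnitude and error are elided on the slide, so they are stored as NULL here rather than invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
conn.executescript("""
CREATE TABLE observatories (
    observatory TEXT PRIMARY KEY
);
CREATE TABLE telescopes (
    telescope   TEXT PRIMARY KEY,
    observatory TEXT NOT NULL REFERENCES observatories(observatory)
);
CREATE TABLE observations_metadata (
    obs_id        INTEGER PRIMARY KEY,
    obs_timestamp TEXT NOT NULL,
    telescope     TEXT NOT NULL REFERENCES telescopes(telescope),
    band          TEXT NOT NULL,
    observer      TEXT NOT NULL
);
CREATE TABLE observations_data (
    obs_id INTEGER PRIMARY KEY REFERENCES observations_metadata(obs_id),
    mag    REAL,
    err    REAL
);
INSERT INTO observatories VALUES ('MKO');
INSERT INTO telescopes VALUES ('UH88', 'MKO');
INSERT INTO observations_metadata VALUES (1, '2004Nov05', 'UH88', 'V', 'Meech');
-- Magnitude and error are elided in the source, so they stay NULL.
INSERT INTO observations_data VALUES (1, NULL, NULL);
""")

# Joining the pieces reconstructs the wide "Observations" row.
wide = conn.execute("""
    SELECT m.obs_timestamp, t.observatory, m.telescope,
           m.band, m.observer, d.mag, d.err
    FROM observations_metadata AS m
    JOIN telescopes        AS t ON t.telescope = m.telescope
    JOIN observations_data AS d ON d.obs_id    = m.obs_id
""").fetchall()
print(wide)
```

Storing the observatory once per telescope, rather than once per observation, is what lets the DBMS guarantee that the two never disagree.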
“Object” Data Model
• Outgrowth of OOD/OOP
• Encapsulation
• Inheritance
• Polymorphism
• State of the art in programming
• No “programmer impedance mismatch”
BUT:
• Not mathematically based
• No ad-hoc querying
• Data integrity?
Relational vs. Object
                         Relational         Object
Data Integrity           Yes                ?
Ad-hoc queries           Yes                No
Querying/DML             SQL                Programming needed
Software infrastructure  Yes                So-so
Personnel                Yes                Meagre
Maturity                 Very mature        Evolving
Market Share             $12 billion        $300 million
“Naturalness”            No! ... Or maybe   Yes!!
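
The "Ad-hoc queries" and "Querying/DML" rows can be illustrated by answering the same question both ways. This is only a sketch: the class `Observation`, the function `count_by_band`, and all rows are hypothetical, and a plain Python list stands in for an object store.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Observation:
    obs_id: int
    band: str
    observer: str


# Hypothetical sample data shared by both approaches.
objects = [
    Observation(1, "V", "Observer A"),
    Observation(2, "R", "Observer A"),
    Observation(3, "V", "Observer B"),
]


# Object side: each new question needs new hand-written traversal code.
def count_by_band(observations):
    counts = {}
    for obs in observations:
        counts[obs.band] = counts.get(obs.band, 0) + 1
    return counts


object_counts = count_by_band(objects)

# Relational side: the question is stated declaratively, on the spot.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations "
    "(obs_id INTEGER PRIMARY KEY, band TEXT, observer TEXT)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [(o.obs_id, o.band, o.observer) for o in objects],
)
sql_counts = dict(
    conn.execute(
        "SELECT band, COUNT(*) FROM observations GROUP BY band"
    ).fetchall()
)

print(object_counts, sql_counts)
```

Both approaches give the same answer; the difference is that the next unforeseen question costs another function on the object side and only another query string on the relational side.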