Some Fundamental Questions in Databases

Download Report

Transcript Some Fundamental Questions in Databases

Some Fundamental Questions in
Databases
Nick Rossiter
(with Michael Heather)
21st May 2004
Informatics Research Conference,
Northumbria University
1
Questions
•
•
•
•
•
Why so many database models?
What is their rationale?
Are there undiscovered new models?
Is there an ultimate data model?
Do all models have to be reductionist?
21st May 2004
Informatics Research Conference,
Northumbria University
2
Meaning of Database Model
• A model is a representation of reality
according to some perceived view.
• A database model for representing this view
has:
– a structure
– a manipulation language
– rules for controlling the structure
21st May 2004
Informatics Research Conference,
Northumbria University
3
Classical Database Models 1
Model
Maths
Manipulation
Relational
Sets
SQL
Nonprocedural
Status
Des- Cite
ign
Alg/Calc/Bool Yes
theoretical
??
Bags
Calc/Bool
Yes
dominant
Hierarchical
Trees
Traversal
No
legacy
Addon
EER
??
Network
Small
graphs
Traversal
No
legacy
??
Functional
Functions
Compos-ition
No
theoretical
??
21st May 2004
Informatics Research Conference,
Northumbria University
Codd
(1970)
IBM
(1976)
IBM
(1968)
DBTG
(1971)
Shipman
(1981)
4
Classical Database Models 2
Maths
ObjectOriented
Categories Calculus/
/ monoids Traversal/
Composition
??
Calc/Alg/
Bool
niche
Yes
(OQL)
Sets
Yes
Objectrelational
Nestedrelational
Semistructured
Small
graphs/
recursion
21st May 2004
Manipulation
Nonprocedural
Model
Calc/Alg/
Bool
Traversal
Yes
No
Status
experimental
Design
Cite
Addon
UML
Atkinson et al
(1995)
Addon
EER
subsumed ??
(o-o/o-r)
popular
for data
??
exchange
(XML)
Informatics Research Conference,
Northumbria University
SQL3 (1999)
Roth
(1987)
Abiteboul/
Widom (1997)
5
Classical Relationships
• Relationships are often performed in a separate process:
– Entity-Relationship Modelling
– Unified Modelling (UML)
• Normalisation is needed to verify schema design:
– particularly to relate key and non-key attributes.
• The levels, mappings and relationships all have to be
integrated in a consistent database design.
21st May 2004
Informatics Research Conference,
Northumbria University
6
Developing non-classical Areas
• The new developing areas are:
– quantum computation, exploiting quantum mechanics
principles in physics,
– nanoscale chemistry,
– bio- and molecular-computing processing as in genetics
• Collectively referred to as natural computing
• Natural computing is:
– Real-world processing
• does not rely on any model.
– Data can be input neat
• without any reductionist pre-processing.
21st May 2004
Informatics Research Conference,
Northumbria University
7
Non-classical Database Design
• Not layered (Theory of Categories)
• Use Dolittle approach (push me-pull you creature of High
Lofting):
• A database design is a topos -- a Dolittle diagram subsuming the
pullback/pushout relationships as:
X
+
Cartesian Closed
Category
f*
21st May 2004
Informatics Research Conference,
Northumbria University
8
What is f*?
• f* is an examination and re-indexing functor
– organises the data into a key for storage and applies a
query for interrogation of the database.
– puts together a key by concatenation as in the relational
model.
– looks up information for retrieval by inspecting the key.
• In quantum theory:
– the key (X) is entanglement,
– the colimit (+) is superposition,
• In genetics it is a DNA strand.
21st May 2004
Informatics Research Conference,
Northumbria University
9
Enriched Pullback
• In terms of the Dolittle diagram:
– f* is the same operation in classical and natural
computation.
• What then corresponds to the database schema in
natural computing?
• The pullback diagram contains many more arrows
than in the Dolittle diagram.
• This enriched diagram satisfies our needs.
21st May 2004
Informatics Research Conference,
Northumbria University
10
Pullback of S and M in Context of IMG
ls x m
S
s
ls x m

S XIMG M

rs x m 
s x m *m
M
s
(s)-1
m
(m)-1
W/IMG
Figure 2: Pullback showing fuller collection of arrows
S = source, M = medium, IMG = image, W = world
21st May 2004
Informatics Research Conference,
Northumbria University
11
Contents of Enriched Dolittle
• The Dolittle diagram relates binary categorial limits (X)
and colimits (+) for types
• Logic is Heyting -- intuitionistic logic.
• Godement calculus can be used for composing arrows
across levels
• pullback functor (f* or ):
– emulates the join operation of databases
• Other arrows represent:
– projection ; membership 
– existential  or  quantification
– universal  or  quantification
21st May 2004
Informatics Research Conference,
Northumbria University
12
Higher-level Arrows for
Semantics/DB Design
• Originality with the unit of adjunction 
– Example:  gives properties of relationship onto limit
–  = 0, no creativity, mapping S to S XIMG M is 1:1
–  = 1, maximum creativity, mapping S to S  IMG M is from S to
cartesian product of S  M.
• Style with the counit  of adjunction.
– Example:  gives properties of relationship onto colimit
–  = 1, preservation of style, each S is found exactly once in S XIMG
M
–  = 0, loss of style, each S occurs in S  IMG M maximum number
of times (S  M).
21st May 2004
Informatics Research Conference,
Northumbria University
13
Normalisation and Defeasance
• Not captured at the data model level
– except for 1NF in some models
• A complex topic jargon-wise
• Perhaps related to defeasibility or
nonmonotonic reasoning:
– A kills B implies A has committed murder
– A kills B AND B attacked A implies A has
committed manslaughter
– strengthening of antecedent changes conclusion
21st May 2004
Informatics Research Conference,
Northumbria University
14
Integration of Normalisation
within the model
• Many normalisation definitions say:
– A table is normalised if
• condition X holds
• and additional conditions B, C,, D etc do not hold
• Normalisation failure is due to defeasance
• Exploring incorporation of defeasance into
topos
21st May 2004
Informatics Research Conference,
Northumbria University
15
Non-classical Database Model
Model
Maths
Manipulation
Nonprocedural
Design
Cite
Topos
Categories
Composition/
Intuitionistic/
Godement
Yes
Integrated
(topos,
defeasance)
Nelson &
Rossiter
(1996);
Rossiter &
Heather
(2003,
2004)
21st May 2004
Informatics Research Conference,
Northumbria University
16
Questions
•
•
•
•
•
Why so many database models?
What is their rationale?
Are there undiscovered new models?
Is there an ultimate data model?
Do all models have to be reductionist?
21st May 2004
Informatics Research Conference,
Northumbria University
17