Simplicity
as applied to Relational Databases
David Livingstone
IMLab, CEIS
May 2007
A Dangerous Topic
Suppose the talk is :
– difficult to understand ?
– boring ?
Simple things are ‘obvious’ with hindsight.
Simple things are of limited use.
Academics don’t ‘do’ simple things.
They do difficult things.
Simplicity
Assumed to be good and worthwhile.
In research :
• Inherently important in the concepts dealt with;
• Important objective to achieve in the results.
Brought home to me in the feedback from my PhD
Mid-Point Progression.
Need to deal with simplicity explicitly and clearly.
Overview
• My own introduction to simplicity
What about Simplicity ?
• What simplicity is NOT
• Importance of simplicity
‘Model’ of Simplicity.
“Open Database Project”.
My PhD project.
Simple Computer Languages
• APL
• Unix Shell Languages
1 means of holding data; 1 means of executing processes.
Completely generalised ‘means’.
High level of abstraction; algebra programming style.
Higher programming productivity (1 order of magnitude).
“No. of lines of valid code written per day is independent of
the language”.
Power was the motivation for use, not simplicity.
• Relational DBs with Relational Algebra.
Simplicity in Physics
• Raise level of abstraction.
• Raise level of generality.
Example : electricity & magnetism are special cases
of electromagnetism.
Example : unify the 4 forces of nature into one.
(Needs 10 & 26 dimensions !)
Example : gravitons inferred from gravitational waves.
Beauty : simplicity, elegance, symmetry.
(Hyperspace by Michio Kaku).
Einstein developed relativity because symmetry was
more fundamental than Newtonian space-time.
⇒ reform space-time to fit symmetry.
What Simplicity is NOT (1)
NOT Minimalism
Minimalism provides simplicity by limiting explicit
functionality.
Minimalism ≠ Essentiality.
Essentiality maintains functionality.
Codd used ‘Essentiality’ to create relational DBs.
• Only one essential data construct, the relation.
• Earlier database models had 2 or more data
constructs, but only the functionality of relations.
⇒ greater complexity.
(NB Each construct requires its own operators).
What Simplicity is NOT (2)
NOT (necessarily) intuitive.
“Intuition is simply a state of subconscious knowledge
that comes about after extended practice”.
“Difficult tasks will always have to be taught. The trick is
to ensure the technology is not part of the difficulty”.
(Donald Norman).
“Although a programming language is unlikely to
contribute directly to a solution, it may obstruct
solution, even contributing to errors and oversights”.
(Petre)
‘Intuition’ ≈ Skill. The “2-year programming
experience” Catch-22.
‘Wrong’ experience may require ‘un-learning’.
Proponents of OO Programming insist on the need to
‘think in OO terms’ to be able to program effectively.
“The use of COBOL cripples the mind; its teaching
therefore should be regarded as a criminal offence”.
“It is practically impossible to teach good programming
to students that have had a prior exposure to BASIC ... they
are mentally mutilated beyond hope of regeneration”.
(Dijkstra).
“The tools we use have a profound (and devious !)
influence on our thinking habits and therefore on
our thinking ability”.
(Dijkstra)
Desire for Simplicity
“Everything should be made as simple as possible, but
not simpler”.
(Albert Einstein).
“Entities should not be multiplied without necessity”.
(Ockham’s Razor).
“The aim of science is always to reduce complexity to
simplicity”.
(William James).
“Great engineering is simple engineering”.
(James Martin).
“Perfection is achieved, not when there is nothing
more to add, but when there is nothing left to take
away”.
(Antoine de Saint-Exupéry).
Note there’s an Irreducible Minimum
Practical Importance of Simplicity
“If projects or programmes are overly complex, there is
a good chance they are simply wrong”.
(Brian Jones, IBM).
“Complexity leads to design problems & greater risk of
error”.
(Martyn Thomas, Praxis MD).
“There are no complex systems that are secure.
Complexity .. almost always comes in the form of
features or options”.
(Ferguson & Schneier).
“.. that a product with fewer features might be more
usable, more functional, & superior .. is considered
blasphemous”.
(Donald Norman).
Overview (2)
What about Simplicity ?
‘Model’ of Simplicity.
• The Simplicity Required
• Simplification Principles
“Open Database Project”.
My PhD project.
What Kind of Simplicity is Required ?
Mental Model Principle - “People understand & interact
with systems & environments based on mental
representations developed from experience”.
(Universal Principles of Design).
“A good conceptual model is .. fundamental to good design”.
“Good designers present explicit conceptual models for users”.
“Start with a simple, cohesive conceptual model and use it to
direct all aspects of the design”.
(D. Norman - “Design of Everyday Things”).
The simplicity required of a software product is
that of its conceptual model.
Conceptual Integrity
Fred Brooks - “The Mythical Man-Month”.
• “Because ease of use is the purpose, the ratio of
function to conceptual complexity is the ultimate test
of system design.
Neither function alone nor simplicity alone defines
a good design”.
• “Simplicity and straightforwardness proceed from
conceptual integrity. Every part must reflect the
same philosophies & the same balancing of
desiderata .. the same techniques in syntax &
analogous notions in semantics”.
Simplification Principles
• Parsimony of concepts.
• Simplicity. Straightforward concepts. Terseness.
Elegance.
• Generality. No limitations/exceptions. Exceptions /
limitations, & ways round them, need to be modelled
⇒ complexity & less functionality.
• Orthogonality. Each concept is independent of
every other concept, so that they can be combined in
any arbitrary way.
• Uniformity. Consistency, regularity, naturalness.
Obtaining Simplicity in an Application
• Raise the level of abstraction as much as possible.
• Derive a conceptual model that has conceptual
integrity.
Use the simplification principles to achieve this.
Use ‘essentiality’ to achieve simplicity & elegance.
• Implement the model with as much automation as is
feasible. (Defaults can be useful).
• Separate the model from its implementation.
May need complex software to get a simple
conceptual model.
Overview (3)
• What about Simplicity ?
• ‘Model’ of Simplicity.
• “Open Database Project”.
• My PhD project.
• The Third Manifesto
• What about SQL ?
• Produce a Proof of Concept of TTM.
Today’s Relational DB ‘Problem’
How to handle more sophisticated
data in a simple manner ?
[Figure: relation R with attributes A1, A2, A3 and A4]
TTM ‘cleans up’ :
relational aspects,
confusion between
scalars and containers,
scalar typing.
⇒ greater simplicity and functionality.
The Third Manifesto
by
Chris Date & Hugh Darwen
• “A formal proposal for a solid foundation for data &
database management systems”.
• Based on
- the relational model,
- type theory.
• Aim : to solve the problem of how to support new kinds
of data (e.g. pictures, music, maps) in relational DBs.
• Concerns principles; derives a logical model.
• Implementation of the logical model is a separate matter.
Physical Data Independence.
SQL Has Problems !
SQL is now > 30 years old.
SQL doesn’t fully apply relational theory. For example :
• Doesn’t fully apply set theory (duplicate/sequenced
rows, sequenced columns).
SQL sometimes contradicts relational theory. For
example :
• Implementation pointers appear among logical data.
Poor language design. For example :
• Many ad hoc constraints on applicable expressions.
Complications in adding data types, nested containers.
SQL is unnecessarily complex & limited.
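The set-theory point on this slide can be shown in a few lines. The following is a minimal sketch in Python rather than SQL, using a hypothetical `employees` table invented for illustration: a bag-style projection (the behaviour of an SQL `SELECT` without `DISTINCT`) keeps duplicate rows, whereas a relational projection returns a set.

```python
from collections import Counter

# Hypothetical sample data: (name, department) rows.
employees = [
    ("Ann", "Sales"),
    ("Bob", "Sales"),
    ("Cy", "Research"),
]

def project_bag(rows, index):
    """Bag-style projection: duplicates survive, with a count per value."""
    return Counter(row[index] for row in rows)

def project_set(rows, index):
    """Relational projection: the result is a set, so duplicates vanish."""
    return {row[index] for row in rows}

bag_result = project_bag(employees, 1)  # "Sales" is counted twice
set_result = project_set(employees, 1)  # "Sales" appears once
```

The two results differ only in whether "Sales" is repeated, which is exactly the information a set-based model treats as meaningless.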
Open DB Project : TTM Proof of Concept
Uses RAQUEL (= Relational Algebra Query, Update &
Executive Language) :
• One means of holding data, relations.
• One means of deriving relational values, operators.
• One means of manipulating relational variables,
assignments/actions.
Unlike traditional programming languages, DBs need multiple
‘assignments’; e.g. retrieve, insert, delete.
These have been generalised, and include assigning integrity
constraints. (Concept developed in earlier research).
Includes sublanguages to :
• handle DB aspects (cf. relational) : schemas and storage;
• scalar data types.
These have the same structure and style as the relational
sublanguage.
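The three "one means of" ideas on this slide can be imitated in ordinary Python. The sketch below is an analogy only and is not RAQUEL syntax: the helper names (`relation`, `restrict`, `project`, `join`), the `db` dictionary and the sample data are all assumptions made for illustration. Relations are the only data construct, operators take relations and return relations, and insertion and deletion are expressed as assignments to relational variables.

```python
def relation(*rows):
    """Build a relation value: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)

def restrict(rel, predicate):
    """Keep only the tuples for which the predicate holds."""
    return frozenset(t for t in rel if predicate(dict(t)))

def project(rel, attrs):
    """Keep only the named attributes; duplicate tuples collapse automatically."""
    return frozenset(frozenset((a, v) for a, v in t if a in attrs) for t in rel)

def join(left, right):
    """Natural join: merge tuples that agree on all shared attributes."""
    result = set()
    for lt in left:
        l = dict(lt)
        for rt in right:
            r = dict(rt)
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                result.add(frozenset({**l, **r}.items()))
    return frozenset(result)

# Relational variables (hypothetical data), held in one place.
db = {
    "emp": relation({"name": "Ann", "dept": "Sales"},
                    {"name": "Cy", "dept": "Research"}),
    "dept": relation({"dept": "Sales", "city": "Leeds"},
                     {"dept": "Research", "city": "York"}),
}

# Insertion and deletion written as assignments to a relational variable.
db["emp"] = db["emp"] | relation({"name": "Bob", "dept": "Sales"})
db["emp"] = db["emp"] - restrict(db["emp"], lambda t: t["name"] == "Cy")

# Retrieval: operators nest freely because every result is again a relation.
sales_cities = project(
    join(restrict(db["emp"], lambda t: t["dept"] == "Sales"), db["dept"]),
    {"name", "city"},
)
```

Because every operator returns a relation, expressions compose without special cases. The sketch does not attempt the generalised assignments or integrity constraints mentioned on the slide.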
Overview (4)
What about Simplicity ?
‘Model’ of Simplicity.
“Open Database Project”.
My PhD project.
• Nested containers & their complexity
• Removing their complexity without loss of functionality
10 Kinds of Container Type
[Figure: relation R with attributes A1, A2, A3 and A4]
PhD ‘cleans up’ kinds
of container type.
• Relations
• Records/structs
• Sets
• Dictionaries
• Bags
• Lists
• Queues
• Stacks
• Arrays
• Insertable arrays
Complexity Arising
From 1 kind of container type - the relation - to 10.
• 10 different structures,
• 10 different sets of operators.
Not always used in isolation, sometimes together.
⇒ master how to combine them.
n (n - 1) / 2 possible pairs ⇒ 45 possibilities.
n (n - 1) (n - 2) / 2 possible trios ⇒ 360 possibilities.
In practice, no SQL product currently provides more
than 4 different kinds of container type.
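The counts on this slide can be checked with a short calculation. The sketch below evaluates the slide's two formulas for n = 10 and, for comparison, the standard counts of unordered and ordered selections. The slide does not say how trios are meant to be counted, so the comparison figures are offered only as context.

```python
from math import comb, perm

n = 10  # kinds of container type listed earlier in the talk

# The formulas as given on the slide.
pairs_on_slide = n * (n - 1) // 2            # 45
trios_on_slide = n * (n - 1) * (n - 2) // 2  # 360

# Standard combinatorial counts, for comparison.
unordered_pairs = comb(n, 2)  # 45, matches the slide
unordered_trios = comb(n, 3)  # 120
ordered_trios = perm(n, 3)    # 720
```

The pair figure equals the number of unordered pairs. The trio figure of 360 lies between the unordered count (120) and the ordered count (720), so it presumably reflects a particular way of counting combinations of three kinds.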
Reduction to 3 Container Types
Relations (special implementations of relations) :
• Relations
• Records/structs
• Sets
• Dictionaries
Bags :
• Bags
Sequences (different versions of sequences) :
• Lists
• Queues
• Stacks
• Arrays
• Insertable arrays
Generalisation of Kinds of Container
Generalise the containers.
• Relation ≡ set of tuples.
• Bag ≡ bag of tuples.
• Sequence ≡ sequence of tuples.
• Simple mapping : bags & sequences ↔ sets / relations.
Generalise the operators.
• The corresponding operator is provided for each kind
of container (as far as possible).
• Operators provide closure (as for relations),
plus conversion operators.
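One way to picture the "simple mapping" and the conversion operators is sketched below in Python. The encoding is an assumption, since the slide does not specify it: a bag becomes a relation by adding a count attribute, and a sequence becomes a relation by adding a position attribute. All function names are hypothetical.

```python
from collections import Counter

def bag_to_relation(bag):
    """Bag of tuples -> relation: append a 'count' attribute to each distinct tuple."""
    return {(*row, count) for row, count in bag.items()}

def relation_to_bag(rel):
    """Inverse conversion: read the trailing 'count' attribute back as multiplicity."""
    return Counter({row[:-1]: row[-1] for row in rel})

def sequence_to_relation(seq):
    """Sequence of tuples -> relation: prefix each tuple with its 'position'."""
    return {(position, *row) for position, row in enumerate(seq)}

def relation_to_sequence(rel):
    """Inverse conversion: order by the leading 'position' attribute, then drop it."""
    return [row[1:] for row in sorted(rel, key=lambda row: row[0])]

# Hypothetical sample data.
orders_bag = Counter([("Ann", "tea"), ("Ann", "tea"), ("Bob", "coffee")])
events_seq = [("login",), ("query",), ("login",)]

orders_rel = bag_to_relation(orders_bag)
events_rel = sequence_to_relation(events_seq)
```

Under this encoding both conversions are lossless: converting to a relation and back returns the original bag or sequence, so relational operators can be applied to the converted form without losing the duplicate or ordering information.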
Exploit the Nesting of Containers
Nesting is orthogonal.
‘External’ as well as ‘internal’ containers can be
sets, bags or sequences.
[Figure: relation R with attributes A1, A2, A3 and A4]
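Orthogonal nesting can be illustrated with plain Python containers. This is a minimal sketch with made-up data: the same tuples, each holding a set-valued and a sequence-valued attribute, are placed in an outer set, an outer bag and an outer sequence. The immutable `frozenset` and `tuple` types stand in for the inner containers only so that the tuples remain hashable in Python.

```python
from collections import Counter

# Hypothetical tuples: a name, a set-valued attribute, a sequence-valued attribute.
ann = ("Ann", frozenset({"SQL", "APL"}), ("login", "query", "login"))
bob = ("Bob", frozenset({"Unix"}), ("login",))

# The choice of outer container is independent of the inner containers.
as_set = {ann, bob, ann}            # duplicates collapse
as_bag = Counter([ann, bob, ann])   # duplicates are counted
as_sequence = [ann, bob, ann]       # duplicates and order are kept
```

The inner containers are untouched by the choice of outer container, which is the sense in which nesting is orthogonal.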
Conclusion
• The simplicity required is that of the user’s
conceptual model of the software product.
• Simplicity maximises the ‘power to weight’ ratio of
the software for the user.
• The software implementation may need to be
(very) complex to achieve simplicity for the user.
( Apply principles of simplicity again
in a layered architecture ? )
Acknowledgements
Nick Rossiter (PhD Supervisor)
Paul Irvine (Open DB Project Group)
Chris Date, Hugh Darwen (Third Manifesto authors)
Paul Vickers, Akhtar Ali (Mid-Point Progression)
Alas, the mistakes are all mine.