Third-Generation Database
System Manifesto
The Committee for Advanced DBMS
Systems
Michael Stonebraker, Lawrence A.
Rowe, Bruce Lindsay, James Gray,
Michael Carey, Michael Brodie,
Philip Bernstein, David Beech
ACM SIGMOD Record, September 1990
First generation database systems
• The network and hierarchical databases of
the 1970’s
• The first systems to offer DBMS function in
a unified system
• e.g. CODASYL, IMS
Second generation database
systems
• Relational databases of the 1980’s
• Data independence and a non-procedural data
manipulation language
• e.g. DB2, INGRES, NON-STOP SQL,
ORACLE, Rdb/VMS
• Focused on business data processing
Third generation database
systems
• Problems with 2nd generation DBS
– inadequate for a broader class of applications (than business data
processing)
– e.g. CAD, CASE, hypertext
– storing text segments, graphics, etc. is usually difficult in 2nd gen.
systems
– complex data (folders) is not supported
• Most vendors are working on functional enhancements on
their 2nd gen. systems
• Surprising degree of consensus on these features
• 3rd gen. systems include the desired capabilities of next
generation database systems
The tenets of third generation DBMS
Tenet 1
• Besides traditional data management services,
third generation DBMS will provide support for
richer object structures and rules
• Richer object structures mean the capability to
store and manipulate, e.g., text and spatial data
• The designer should be able to specify a set of
rules about data elements, records and
collections
The tenets of third generation DBMS
Tenet 2
• Third generation DBMSs must subsume second
generation DBMSs
• The major contribution of 2nd gen. DBMS:
– non-procedural access
– data independence
• A query language is an absolute requirement
• Data independence has dramatically lowered the
amount of program maintenance that must be done
by applications and should not be abandoned
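The data independence point can be illustrated with a minimal sketch, assuming Python's standard sqlite3 module as a stand-in for a relational DBMS. The table `emp`, its rows, and the index name are hypothetical. The application's query text does not change when the physical design changes, and it returns the same answer.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Ann", "toy", 120), ("Bob", "shoe", 90), ("Cy", "toy", 100)],
)

# The application states what it wants, not how to find it.
QUERY = "SELECT name FROM emp WHERE dept = ? ORDER BY name"

before = conn.execute(QUERY, ("toy",)).fetchall()

# Change the physical design: add an index on dept.
conn.execute("CREATE INDEX emp_dept_idx ON emp (dept)")

# The same query text still works and gives the same answer.
after = conn.execute(QUERY, ("toy",)).fetchall()

print(before == after)  # True
```

Only the access path may differ after the index is added; the program text stays as it was, which is the maintenance saving the slide refers to.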
The tenets of third generation DBMS
Tenet 3
• Third generation DBMSs must be open to other subsystems
• Must have
– fourth generation language
– various decision support tools
– friendly access from many programming languages
– interfaces to business graphics packages
– the ability to run the application on a different machine than the
DBMS
– ......
• Third generation DBMS must be open
• Must be willing to participate in future distributed DBMS
systems
Propositions concerning object
and rule management
• A third generation DBMS must have a rich type
system (see the list in the article)
• Inheritance is a good idea
• Functions including database procedures and
methods, and encapsulation are a good idea
• Unique identifiers for records should be assigned
by the DBMS only if a user defined primary key is
not available
• Rules (triggers, constraints) will become a major
feature in future systems. They should not be
associated with a specific function or collection
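As a rough sketch of rules that live in the DBMS rather than in application code, the following uses Python's standard sqlite3 module. All table, column, and trigger names are hypothetical. SQLite attaches each trigger to one table, so this only approximates the proposition that rules should not be tied to a specific function or collection. What it does show is that the constraint and the trigger apply no matter which program issues the update, and that a user-defined primary key makes a system-assigned identifier unnecessary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (
    emp_no INTEGER PRIMARY KEY,                  -- user-defined key
    name   TEXT NOT NULL,
    salary INTEGER NOT NULL CHECK (salary >= 0)  -- constraint rule
);
CREATE TABLE salary_audit (emp_no INTEGER, old_salary INTEGER, new_salary INTEGER);

-- Trigger rule: every salary change is logged by the DBMS itself.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF salary ON emp
BEGIN
    INSERT INTO salary_audit VALUES (OLD.emp_no, OLD.salary, NEW.salary);
END;
""")

conn.execute("INSERT INTO emp VALUES (1, 'Ann', 100)")
conn.execute("UPDATE emp SET salary = 120 WHERE emp_no = 1")
audit = conn.execute("SELECT * FROM salary_audit").fetchall()

# An update that violates the constraint is rejected by the DBMS.
try:
    conn.execute("UPDATE emp SET salary = -5 WHERE emp_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(audit, rejected)  # [(1, 100, 120)] True
```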
Propositions concerning
increasing DBMS function
• Essentially all programmatic access to a database
should be through a non-procedural, high-level
access language
• There should be at least two ways to specify
collections, one using enumeration of members
and one using the query language to specify
membership
• Updateable views are essential
• Performance indicators have almost nothing to do
with data models and must not appear in them
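The two ways of specifying a collection can be sketched as follows, assuming Python's standard sqlite3 module. The tables `emp` and `committee` and the view `chess_players` are hypothetical. An enumerated collection lists its members explicitly, while a query-defined collection (here a view) gains members automatically when new data satisfies its predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, name TEXT, hobby TEXT);
INSERT INTO emp VALUES (1, 'Ann', 'chess'), (2, 'Bob', 'golf'), (3, 'Cy', 'chess');

-- Enumerated collection: members are listed one by one.
CREATE TABLE committee (emp_no INTEGER REFERENCES emp (emp_no));
INSERT INTO committee VALUES (1), (2);

-- Query-defined collection: membership follows from a predicate.
CREATE VIEW chess_players AS
    SELECT emp_no, name FROM emp WHERE hobby = 'chess';
""")

# A new chess player joins the view without any extra bookkeeping.
conn.execute("INSERT INTO emp VALUES (4, 'Di', 'chess')")

enumerated = [r[0] for r in conn.execute(
    "SELECT emp_no FROM committee ORDER BY emp_no")]
by_query = [r[0] for r in conn.execute(
    "SELECT emp_no FROM chess_players ORDER BY emp_no")]

print(enumerated)  # [1, 2]
print(by_query)    # [1, 3, 4]
```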
Propositions that result from the
necessity of an open system
• Third generation DBMS must be accessible from
multiple HLLs
• Persistent X for a variety of Xs is a good idea. They
will be supported on top of a single DBMS by
compiler extensions and a (more or less) complex
run time system
• For better or worse, SQL is intergalactic dataspeak
• Queries and their resulting answers should be the
lowest level of communication between a client
and a server
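To make the last proposition concrete, here is a toy sketch in Python using the standard sqlite3 module. `ToyServer` is a hypothetical stand-in for a DBMS server, not a real client/server protocol, and each call to `query` models one message exchange. Sending a whole query needs one exchange, while a record-at-a-time style needs one exchange per record plus one to list the keys.

```python
import sqlite3


class ToyServer:
    """Hypothetical stand-in for a DBMS server; counts client requests."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
        )
        self.conn.executemany(
            "INSERT INTO emp VALUES (?, ?, ?)",
            [(1, "toy", 120), (2, "shoe", 90), (3, "toy", 100)],
        )
        self.requests = 0

    def query(self, sql, params=()):
        # One call models one client/server message exchange.
        self.requests += 1
        return self.conn.execute(sql, params).fetchall()


# Query-level interface: one request, one answer.
server = ToyServer()
total = server.query("SELECT SUM(salary) FROM emp WHERE dept = ?", ("toy",))[0][0]
query_requests = server.requests

# Record-at-a-time style: fetch the keys, then each record separately.
server = ToyServer()
nav_total = 0
for (emp_no,) in server.query("SELECT emp_no FROM emp"):
    dept, salary = server.query(
        "SELECT dept, salary FROM emp WHERE emp_no = ?", (emp_no,)
    )[0]
    if dept == "toy":
        nav_total += salary
nav_requests = server.requests

print(total, query_requests)    # 220 1
print(nav_total, nav_requests)  # 220 4
```

Both styles compute the same total, but the number of exchanges in the second grows with the number of records, which is one reason to keep the query as the unit of communication.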
Summary
• Agree with OODB enthusiasts on:
– rich type system, functions, inheritance,
encapsulation
• Disagree:
– too narrowly focused on object management
issues
– non-SQL, single language systems appealing to
a fairly narrow market
Summary cont.
• DBMS access should only occur through a query
language (20 years of history is convincing)
• Physical navigation by user programs and
functions should be avoided
• Automatic collections should be encouraged
• Persistence should be added to a variety of
programming languages (this has little to do with
the data model)
• Unique identifiers should be user defined or
system defined
Summary cont.
• A natural evolution from current RDBS to the ones with
the capabilities discussed in this paper
• Many of the features are already supported by “aggressive”
RDBS vendors
– inheritance, additional type constructors, persistent programming
languages must be supported
• Current ODBS are not faithful to any of the tenets and
some of the propositions
– query languages, rule system, SQL client/server support, views,
persistent programming languages must be supported
– must undo hard coded requirements for UID and discourage
navigation
– must build 4GL, support distributed databases, tune systems to
perform efficient data management
Research and development
challenges
• The design of persistent programming languages
for existing HLLs
• The inclusion of pleasing query language
constructs
• Logical and physical database design will get
more difficult with richer type systems and rules
• Optimisation of the execution of rules
• Tools to allow users to visualize and debug rule
oriented applications