Object Database Semantics:
the Stack-Based Architecture
Presentation prepared for
Object Database Technology Users and Vendors Roundtable
OMG Technical Meeting, Burlingame, CA, December 10-14, 2007
by
Prof. Kazimierz Subieta
Polish-Japanese Institute of Information Technology, Warsaw, Poland
www.ipipan.waw.pl/~subieta
SBA/SBQL pages: www.sbql.pl
K.Subieta. ODB Semantics: SBA, slide 1
December 2007
Topics
• Human and machine semantics – the motivation for SBA
• Machine aspects of database semantics
• Semantic quarks of OO store models
• Functionality, semantics & theories
• SBA – an approach to formal semantics
• Major topics that SBA deals with
• Some examples
• Current SBA implementation in the ODRA system
• SBA and interoperability
Targets of database semantics
• Human (database designer, programmer, database administrator)
– Humans perceive the semantics informally
– What matters most is practice, training and efficiency of work
– The informal semantics addressed to humans is influenced by many
side factors, including beliefs, aesthetics, the opinions of authorities, etc.
• Machine (systems that must implement the semantics through
interpreters, compilers, mappers, etc.)
– Machine semantics is always fully formal and deterministic
• Human semantics is a guide for developing machine semantics
• However, the machine semantics eventually determines the
human semantics
Human and machine semantics
• The programmer’s understanding of database semantics and the
machine’s interpretation of database semantics should coincide.
– The programmer must understand database structures on a high
abstraction level
– The programmer must understand semantics of queries addressing
database structures on a high abstraction level, too
• No essential details concerning database structures and query
semantics (e.g. updating) can be neglected or treated as
„implementation issues”.
– However, programmers need not be aware of all the machine
semantics details
– Programmers use a database programming environment intuitively
– The machine should precisely follow their thinking and imagination
Database models and SBA
• Database semantics depends on the assumed data model
• „Data model” is an ideological rather than a technical notion
– People either believe in it or they do not
– No mathematics, theory or experience can justify a particular ideology
for all future cases and applications
– Ideologies can be wrong, based on misleading, superficial rhetoric
• The relational database model is an ideology, supported by
(very limited) mathematical theories
• OO is an ideology, too
– SBA is a theory supporting OO database models
– SBA is incomparably more powerful than relational theories
• However, SBA is neutral to database models
– It can be applied to object-oriented, XML, relational and other models
Machine aspects of database semantics
• Machines deal with data structures rather than with data
models
• For this reason, in SBA we talk about data store models
– i.e. formal concepts determining the organization of data structures
• A store model is purely formal; it does not involve an „ideology”
– Features of a data model are reflected in data structures indirectly
– Some features of a store model appear as the result of orthogonality
criteria, beyond the data model
• Database queries and programs formally address formal data
structures
Semantic quarks of OO store models
• There are many object-oriented store models that are
significantly different and possess incompatible features
– Smalltalk, C++, Java, C#, CORBA, ODMG, SQL-99, XML, RDF, …
– The models tend to be complex and non-intuitive
– The same notions are understood differently
• Is it possible to unify them on some common ground?
• SBA reduces the models to „semantic quarks”: object
identifiers, atomic values and object names
– From these, object store models can be built with practically no limits
– Some principles (known for 40 years):
• Object relativism: each component of an object is an object
• Total identification: each run-time entity (e.g. object) should possess an
internal identifier (used as a reference to the entity)
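The two principles above can be illustrated with a minimal Python sketch. It assumes that every object is a triple of identifier, name and value, and that a complex object's value is a list of identifiers of its sub-objects. The names `Obj`, `Ref` and `Store` are hypothetical and are not taken from any SBA implementation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Ref:
    """A pointer value: holds the identifier of another object."""
    target: int


@dataclass
class Obj:
    oid: int        # internal identifier (total identification)
    name: str       # external name used in queries
    value: object   # atomic value, Ref, or list of sub-object identifiers


class Store:
    """Hypothetical object store: a map from identifiers to objects."""

    def __init__(self):
        self._ids = count(1)
        self.objects = {}
        self.roots = []   # identifiers of top-level objects

    def create(self, name, value, parent=None):
        oid = next(self._ids)
        self.objects[oid] = Obj(oid, name, value)
        if parent is None:
            self.roots.append(oid)
        else:
            # Object relativism: a component is an ordinary object,
            # registered inside its owner by identifier.
            self.objects[parent].value.append(oid)
        return oid


store = Store()
emp = store.create("Emp", [])
store.create("lName", "Doe", parent=emp)
store.create("sal", 2500, parent=emp)
dept = store.create("Dept", [])
store.create("dName", "Sales", parent=dept)
store.create("worksIn", Ref(dept), parent=emp)   # association as a pointer object
```

Attributes, sub-objects and associations all use the same representation here, which is the point of reducing a store model to a few primitive notions.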
[Figure: An SBA object]
Functionality of queries – where is the limit?
• All theories devoted to the relational model assume a limited
role for queries
– E.g. in the relational algebra even 2+2 is beyond its expressive power
• SQL: the limits of the relational algebra are not reasonable
– SQL-99 has the power of universal programming languages, far beyond
what the „relationally complete” languages can do
• A limited role of queries in database programming is assumed
in all proposals concerning OO query languages
– models, theories, OODBMS, ODMG standards, …
• SBA abandons this philosophy
SBA: no limit for applications of QLs
• Any functionality may require queries
• Queries take the role of expressions of programming languages
– Hence a new approach to semantic description and, in particular, to
the theories that are the basis for semantics
• SBA removes the border between querying and programming
• SBA theory is a continuation of programming language
theory rather than of database theories
Semantics of query & programming languages
• Usually, semantics is intuitively explained rather than formally
specified
– This is typical for programming languages
– Formal specification of semantics is complex, boring, full of difficult
concepts, and frequently contains bugs and unspecified parts
• Intuitive specification of semantics causes problems for
standards:
– Ambiguous specification => many incompatible implementations
– Contradictory specification => no correct implementation
– No reasoning on redundancy, equivalence or incompleteness of
language constructs
• No formal semantics => poor query optimization, difficulties
with strong typing, and other problems
• Even the smallest semantic problem is a very big problem
– Especially for standardization aiming at code portability
Precision, Functionality and Universality
• Precision
– Simple, non-ambiguous model of data structures being queried
– Non-ambiguous semantics of query and programming constructs
• Functionality and Universality
– A formal object model should cover (almost) all features of the current
object models, including UML, CORBA, XML, Java, C++, WSDL
– A complete query and programming language addressing the model
• What does „complete” mean?
• Complete = practically universal: the power of PLs + interoperability +
performance + client/server + transactions + database abstractions + …
• Mistake of current database models & theories: neglecting updates
• Non-redundancy
– Keep the model and the language as lean as possible
SBA – an approach to formal semantics
• Formal models of data stores:
– Based on the semantic quarks and assumed principles
• Models M0, M1, M2 and M3 cover the basic features of the majority of
object models, including complex objects, classes, inheritance,
roles and encapsulation
– Other features can be introduced by small variations of M1-M3
• Functionality of SBQL and its programming capabilities
address all features of M1-M3
• Semantics of SBQL is expressed in a way specific to
programming languages
• SBA non-algebraic operators – known from relational
languages, but defined in the spirit of programming languages
Approaches to formal semantics
• There are many approaches, in particular:
– Relational/object algebras (or calculi)
– 1st order logic
– Denotational semantics
– Operational semantics
– etc.
• In the specification of semantics three aspects are important:
1. Formal specification
2. Supporting all imaginable data structures and query functionalities
3. Communication with developers who must understand the semantics
to implement it
• Academic people usually address only the first aspect
• For other aspects only operational semantics is adequate
Abstract implementation as semantic specification
• It is a kind of operational semantics based on an abstract machine
that accomplishes query/program processing
• SBA abstract machine introduces three well-known structures:
– object store,
– environment stack (thus SBA),
– query result stack.
• These structures are fundamental for precise semantic
description of everything that may happen in database
query/programming languages.
– Classical query operators, such as selection, projection, joins and
quantifiers, can be generally and precisely specified
– Updating constructs, programming abstractions, database abstractions,
strong typing, etc. can be expressed in terms of abstract
implementation.
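How the three structures cooperate can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the ODRA implementation: the store is a plain dictionary, an environment section maps names to identifiers, and the helper names (`binders`, `nested`, `eval_where`, …) are hypothetical.

```python
# Object store: identifier -> (name, value). A list value holds sub-object ids.
STORE = {
    1: ("Emp", [2, 3]), 2: ("lName", "Doe"), 3: ("sal", 1000),
    4: ("Emp", [5, 6]), 5: ("lName", "Kim"), 6: ("sal", 2000),
}
ROOTS = [1, 4]


def binders(oids):
    """Build one environment section: name -> identifiers bound to that name."""
    section = {}
    for oid in oids:
        section.setdefault(STORE[oid][0], []).append(oid)
    return section


def nested(oid):
    """Interior of an object: binders of its sub-objects (empty for atomic objects)."""
    value = STORE[oid][1]
    return binders(value) if isinstance(value, list) else {}


envs = [binders(ROOTS)]   # environment stack; the bottom section holds database roots
qres = []                 # query result stack


def eval_name(name):
    """Bind a name: search the environment stack from top to bottom."""
    for section in reversed(envs):
        if name in section:
            qres.append(list(section[name]))
            return
    qres.append([])


def eval_literal(value):
    qres.append(value)


def deref(result):
    """Replace a single-reference result by the value stored under it."""
    if isinstance(result, list) and len(result) == 1:
        return STORE[result[0]][1]
    return result


def eval_equals(left, right):
    left()
    right()
    r, l = deref(qres.pop()), deref(qres.pop())
    qres.append(l == r)


def eval_where(left, predicate):
    """Selection: the right operand is evaluated once per element of the left result."""
    left()
    selected = []
    for oid in qres.pop():
        envs.append(nested(oid))   # open a scope with the object's interior
        predicate()
        if qres.pop():
            selected.append(oid)
        envs.pop()                 # close the scope
    qres.append(selected)


# Emp where sal = 1000
eval_where(lambda: eval_name("Emp"),
           lambda: eval_equals(lambda: eval_name("sal"),
                               lambda: eval_literal(1000)))
result = qres.pop()
print(result)   # prints [1]: the identifier of the selected Emp object
```

The result is a collection of references, not copies of objects, so the same query can also serve as the argument of an updating statement.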
SBA – power through orthogonality
• SBA discovers and employs semantic quarks of query languages
– Primitive queries: literals and names
– Semantics of a complex query is built from the semantics of its parts
– Environment and result stacks as a mechanism of query composition
– Semantics based on object references rather than object values
• Qualities of the orthogonality and semantic quarks
– Powerful theory and reasoning on features of languages
– Easier implementation
– Much shorter programmers’ manuals
– Powerful query optimization methods and a strong typing system
– Much easier teaching and developing general principles
– Supporting inventions and new ideas
The idea of SBA
• Unification of PL expressions and queries:
– 2, “Smith”
– salary, x, Employee
– 2+2 , (x+y)*z
– Employee where salary = 1000
– (Employee where salary = (x+y)*z).surname
• All such expressions/queries used as:
– arguments of imperative statements,
– parameters of procedures, functions or methods
– a return from a functional procedure (from a method)
• Expressions/queries + programming capabilities used for:
– Procedures, functions, classes, types, inheritance, roles, …
– Virtual updatable views
– Various other forms of database abstractions (triggers, business rules,
constraints, transactions,…)
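The unification can be sketched with one small recursive evaluator that treats literals, names, arithmetic, selection and navigation uniformly. This is an illustrative Python sketch under simplifying assumptions: objects are plain dictionaries, queries are nested tuples, and the data, the variable values and the function names are hypothetical.

```python
import operator

# Hypothetical data: each employee is a dict of attributes.
EMPLOYEES = [
    {"surname": "Smith", "salary": 1000},
    {"surname": "Jones", "salary": 1500},
]

# Environment stack: database names and program variables are bound the same way.
envs = [
    {"Employee": EMPLOYEES},        # database section
    {"x": 3, "y": 2, "z": 200},     # local variables of the running program
]

OPS = {"+": operator.add, "*": operator.mul, "=": operator.eq}


def bind(name):
    """Search the environment stack from top to bottom."""
    for section in reversed(envs):
        if name in section:
            return section[name]
    raise NameError(name)


def evaluate(q):
    kind = q[0]
    if kind == "lit":
        return q[1]
    if kind == "name":
        return bind(q[1])
    if kind in OPS:
        return OPS[kind](evaluate(q[1]), evaluate(q[2]))
    if kind in ("where", "dot"):
        out = []
        for element in evaluate(q[1]):
            envs.append(element)    # the element's attributes become the top scope
            try:
                r = evaluate(q[2])
            finally:
                envs.pop()
            if kind == "where":
                if r:
                    out.append(element)
            else:
                out.append(r)
        return out
    raise ValueError(kind)


# (Employee where salary = (x+y)*z).surname
query = ("dot",
         ("where", ("name", "Employee"),
          ("=", ("name", "salary"),
           ("*", ("+", ("name", "x"), ("name", "y")), ("name", "z")))),
         ("name", "surname"))
result = evaluate(query)
print(evaluate(("+", ("lit", 2), ("lit", 2))), result)   # prints 4 ['Smith']
```

Nothing in `evaluate` distinguishes a "query" from an "expression": `2+2` and the selection over `Employee` go through the same code path.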
Major topics that SBA deals with (1)
• General architecture of query processing
• Abstract models of object stores
• Syntax, semantics and pragmatics of query languages
• Semantics of algebraic and non-algebraic operators
• Classes, methods and static inheritance in query languages
• Dynamic object roles and dynamic inheritance in query languages
• Processing of irregular data structures (semi-structured data)
• Transitive closures and fixed-point equations
• Imperative (updating) constructs
• Procedures, functions and methods
Major topics that SBA deals with (2)
• Parameter passing for procedures, functions & methods
• Encapsulation
• Virtual updatable views
• Types, interfaces, schemas and metamodels
• Static (semi-) strong type checking of queries and programs
• Query optimization (rewriting, indices, caching, …)
• Query processing and optimization in distributed systems
• Data-intensive grids and P2P networks: integration of distributed,
heterogeneous, fragmented and redundant resources
• Aspect-oriented databases
• OMG MDA and executable UML + OCL
[Figure: UML-like schema (ODRA)]
SBQL queries
• Get all information on departments for employees named Doe:
    (Emp where lName = “Doe”).worksIn.Dept
• Get the name of Doe’s boss:
    (Emp where lName = “Doe”).worksIn.Dept.boss.Emp.lName
• Names and cities of employees working in departments managed by Kim:
    (Dept where (boss.Emp.lName) = “Kim”).employs.Emp.
    (lName, if exists(address) then address.city else “No address”)
• For each employee get the name and the percent of the annual budget of
his/her department that is consumed by his/her monthly salary:
    Emp . (lName as n, (((if exists(sal) then sal else 0) as s).
    ((s * 12 * 100)/(worksIn.Dept.budget))) as percentOfBudget)
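To make the intent of the last query concrete, here is a rough Python analogue over a small in-memory data set. The sample employees, salaries and budget are invented for illustration, and nested dictionaries only approximate the store that SBQL actually navigates.

```python
# Hypothetical sample data: "sal" is optional, as in the schema's sal[0..1].
sales = {"dName": "Sales", "budget": 1_200_000}
emps = [
    {"lName": "Doe", "sal": 3000, "worksIn": sales},
    {"lName": "Kim", "worksIn": sales},   # no salary recorded
]

# Emp . (lName as n, ((if exists(sal) then sal else 0) as s).
#        ((s * 12 * 100) / worksIn.Dept.budget) as percentOfBudget)
report = [
    {
        "n": e["lName"],
        "percentOfBudget": (e.get("sal", 0) * 12 * 100) / e["worksIn"]["budget"],
    }
    for e in emps
]
print(report)
```

The `e.get("sal", 0)` call plays the role of `if exists(sal) then sal else 0`: an absent attribute is handled inside the query rather than by a null value.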
SBQL programs (ODRA)
• For each employee having no salary, assign the minimal salary in his/her
department:
    for each (Emp where not exists(sal)) as e {
        e.changeSal( min(e.worksIn.Dept.employs.Emp.sal) )}
• A method:
    changeSal( newSal: real ): boolean {
        if (not exists(self.sal)){
            sal: real[0..1];
            self :< create sal(newSal);
        }
        else {
            if (self.sal > newSal) return false;
            else self.sal := newSal;
        }
        return true;
    }
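The behaviour of the program above can be approximated in Python as follows. This is only a sketch: an optional attribute is modelled as `None`, the sample employees are invented, and the guard against a department in which nobody has a salary is an added assumption that the SBQL text does not address.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Dept:
    dName: str
    employs: List["Emp"] = field(default_factory=list)


@dataclass(eq=False)
class Emp:
    lName: str
    worksIn: Dept
    sal: Optional[float] = None   # models sal: real[0..1]

    def changeSal(self, newSal: float) -> bool:
        if self.sal is None:
            self.sal = newSal     # corresponds to creating the sal sub-object
        elif self.sal > newSal:
            return False          # a salary is never lowered
        else:
            self.sal = newSal
        return True


sales = Dept("Sales")
doe = Emp("Doe", sales, 2000.0)
kim = Emp("Kim", sales, 1500.0)
lee = Emp("Lee", sales)           # no salary yet
sales.employs.extend([doe, kim, lee])
emps = [doe, kim, lee]

# for each (Emp where not exists(sal)) as e { e.changeSal(min(...)) }
for e in [e for e in emps if e.sal is None]:
    known = [c.sal for c in e.worksIn.employs if c.sal is not None]
    if known:                     # assumption: skip departments without any salary
        e.changeSal(min(known))

lowered = doe.changeSal(1000.0)   # rejected, because 2000 > 1000
print(lee.sal, lowered, doe.sal)  # prints 1500.0 False 2000.0
```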
[Figure: Current ODRA architecture]
SBA/SBQL in recent (ongoing) projects
• ODRA (Object Database for Rapid Applications) - queries, imperative
constructs, programming abstractions, classes, types, methods, inheritance,
modules, query optimization,...
• European project eGov Bus: integrating distributed resources under
the control of various European governmental institutions.
– SBQL as an embedded QL for application programming in Java.
– SBQL as a self-contained DBPL for application programming.
– Virtual repository based on SBQL virtual updatable OO views
• European project VIDE - developing a visual programming language
for the OMG MDA.
– OCL and other concepts related to Executable UML are implemented
• XML2XML mapper based on SBQL (more powerful than XSLT)
Current functionality of ODRA
• Object model: complex objects, collections, associations, classes,
inheritance, polymorphism, types and schemata
• ODRA IDE
• SBQL queries: many algebraic and all non-algebraic operators, transitive
closures, function, procedure and method calls
• Typing system: semi-strong static type checking, dynamic type checks
• Imperative (updating) statements and control statements
• Procedures, functions with parameters, recursive
• Virtual Updatable Views
• Transactions
• Multiple-client/multiple server architecture
• Accessing Java Libraries
• Accessing Web Services
• Wrapper to Relational DB
• XML Importer and Exporter
• RDF Wrapper
• Web Services Front End
• ODRA Web API (JSP + SBQL)
• ODRA JOBC
• ODRA Indexing
• ODRA Access Control
Interoperability with RDBMS
• The ORM problem is not (cannot be?) properly solved on the
basis of current object-oriented technologies, including Java.
• If the mapping between the models is complex, then:
– Performance can be unacceptable for very large databases (the SQL
query optimizer has no chance to work).
– Updating leads to non-trivial view updating problems.
• In practice, only limited mappings are acceptable
• The problem is much easier on the basis of SBA, due to
virtual updatable views:
– Because SBA is neutral to data models, it is possible to see a relational
database as a primitive object database queried by SBQL
– Then, SBQL virtual updatable views make it possible to map the
relational database to an object database with full algorithmic power
– This technology is implemented in ODRA.
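The general idea of a virtual updatable object over relational data can be sketched in Python with the standard `sqlite3` module. This is an analogy only and is not the ODRA wrapper or SBQL view syntax: the tables, the `EmpView` class and the `emp_objects` function are hypothetical, and each virtual object stores nothing but a seed (the primary key), delegating reads and updates to SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, lname TEXT, sal REAL,
                       dept_id INTEGER REFERENCES dept(id));
    INSERT INTO dept VALUES (1, 'Sales');
    INSERT INTO emp  VALUES (10, 'Doe', 2000, 1), (11, 'Kim', NULL, 1);
""")


class EmpView:
    """Virtual Emp object: holds only a seed (the primary key), no copied data."""

    def __init__(self, emp_id):
        self._id = emp_id

    @property
    def lName(self):
        return conn.execute(
            "SELECT lname FROM emp WHERE id = ?", (self._id,)).fetchone()[0]

    @property
    def sal(self):
        return conn.execute(
            "SELECT sal FROM emp WHERE id = ?", (self._id,)).fetchone()[0]

    @sal.setter
    def sal(self, value):
        # An update of the virtual object is mapped back to the stored row.
        conn.execute("UPDATE emp SET sal = ? WHERE id = ?", (value, self._id))

    @property
    def worksIn(self):
        # The foreign key is presented as navigation to the department.
        return conn.execute(
            "SELECT d.dname FROM dept d JOIN emp e ON e.dept_id = d.id "
            "WHERE e.id = ?", (self._id,)).fetchone()[0]


def emp_objects():
    """Generate one virtual object per row of the emp table."""
    return [EmpView(row[0]) for row in conn.execute("SELECT id FROM emp ORDER BY id")]


kim = emp_objects()[1]
kim.sal = 1500.0
print(kim.lName, kim.sal, kim.worksIn)   # prints Kim 1500.0 Sales
```

Because the object keeps no copy of the row, there is no state to synchronise; the cost is one SQL statement per access, which a real wrapper would have to optimise.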
Interoperability with popular programming
languages: Java, C++, Ruby, etc.
• A difficult problem with a lot of (partial) solutions:
– Bindings in the style of embedded SQL or ODMG
– An interface from a PL to SBQL in the style of ODBC/JDBC
– A generic gateway from SBQL to libraries written in other languages
– Generic middleware based e.g. on CORBA or Web Services
– Generic middleware based on a virtual repository
• The MDA case: CIM => PIM => PSM => code
– If the transformations between the models are to be done
automatically, the problem is difficult
– Manual transformations?
• Using native syntax (Java, Ruby, etc.) to query some external
resources (e.g. OO databases) – a very challenging problem.
– No general solution.
Conclusions
• To make a high-quality standard for object-oriented databases, a
specification of semantics is a must,
– to avoid the fate of the SQL-99 and ODMG standards
• SBA offers a unique method for constructing query languages and
specifying their semantics.
– SBA is a holistic database theory; it does not give up any feature (even
the most advanced) of current practical OO database QLs/PLs.
• Michi Henning, ZeroC: „No standard should be approved without a
reference implementation. …
– No one is brilliant enough to look at a specification and be certain that
it does not contain hidden flaws without actually implementing it.”
• SBA has been implemented more than 10 times, for different
systems and purposes.
– PJIIT can support OMG with a reference implementation of a new
object database standard.