Object-Oriented & Object-Relational DBMSs
CENG 553 Database Management Systems
Advanced Database Applications
• Computer-Aided Design/Manufacturing (CAD/CAM)
• Computer-Aided Software Engineering (CASE)
• Network Management Systems
• Office Information Systems (OIS) and Multimedia Systems
• Digital Publishing
• Geographic Information Systems (GIS)
• Interactive and Dynamic Web sites
• Other applications with complex and interrelated objects and procedural data.
Expected features for new applications
• Complex objects
• Behavioral data
• Meta knowledge
• Long duration transactions
Weaknesses of RDBMSs
• Poor representation of “Real World” entities
– Normalization leads to relations that do not correspond
to entities in “real world”.
• Semantic overloading
– Relational model has only one construct for
representing data and data relationships: the relation.
– Relational model is semantically overloaded
• Limited operations
– only a fixed set of operations which cannot be
extended.
Object-Oriented Concepts
• Abstraction, encapsulation, information hiding.
• Objects and attributes.
• Object identity.
• Methods and messages.
• Classes, subclasses, superclasses, and inheritance.
• Overloading.
• Polymorphism and dynamic binding.
Database Systems
First Generation DBMS: Network and Hierarchical
– Required complex programs for even simple queries.
– Minimal data independence.
– No widely accepted theoretical foundation.
Second Generation DBMS: Relational DBMS
– Helped overcome these problems.
Third Generation DBMS: OODBMS and ORDBMS.
Object-Oriented Data Model
There is no single agreed object data model. One definition:
Object-Oriented Data Model (OODM)
– Data model that captures the semantics of objects supported in
object-oriented programming.
Object-Oriented Database (OODB)
– Persistent and sharable collection of objects defined by an
OODM.
Object-Oriented DBMS (OODBMS)
– Manager of an OODB.
Commercial OODBMSs
• GemStone from Gemstone Systems Inc.,
• Objectivity/DB from Objectivity Inc.,
• ObjectStore from Progress Software Corp.,
• Ontos from Ontos Inc.,
• FastObjects from Poet Software Corp.,
• Jasmine from Computer Associates/Fujitsu,
• Versant from Versant Corp.
Advantages of OODBMSs
• Enriched Modeling Capabilities.
• Removal of Impedance Mismatch.
• More Expressive Query Language.
• Support for Schema Evolution.
• Support for Long Duration Transactions.
• Applicability to Advanced Database Applications.
Disadvantages of OODBMSs
• Lack of Universal Data Model.
• Lack of Experience.
• Lack of Standards.
• Query Optimization compromises Encapsulation.
• Object Level Locking may impact Performance.
• Complexity.
Alternative Strategies for Developing an
OODBMS
• Extend existing object-oriented programming language.
– GemStone extended Smalltalk.
• Provide extensible OODBMS library.
– Approach taken by Ontos, Versant, and ObjectStore.
• Embed OODB language constructs in a conventional host
language.
– Approach taken by O2, which has extensions for C++.
• Extend existing database language with object-oriented
capabilities.
– Approach being pursued by RDBMS and OODBMS vendors.
– Ontos and Versant provide a version of OSQL.
• Develop a novel database data model/language.
Single-Level v. Two-Level Storage Model
• With a traditional DBMS, programmer has to:
– Decide when to read and update objects.
– Write code to translate between application’s object model and the
data model of the DBMS.
– Perform additional type-checking when object is read back from
database, to guarantee object will conform to its original type.
• Conventional DBMSs have two-level storage
model: storage model in memory, and database
storage model on disk.
• In contrast, OODBMS gives illusion of single-level
storage model, with similar representation in both
memory and in database stored on disk.
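A minimal Python sketch of the two-level model described above, using the standard sqlite3 module as a stand-in for a conventional DBMS. The Staff class, the staff table, and the save/load helpers are hypothetical names introduced for illustration. It shows the three chores listed on the slide: deciding when to read and write, translating between the object and the row, and re-checking types on the way back.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Staff:                      # hypothetical application class
    staff_no: str
    salary: int

def save(conn, s):
    # Translate object -> row; the programmer decides when to write.
    conn.execute("INSERT OR REPLACE INTO staff VALUES (?, ?)",
                 (s.staff_no, s.salary))

def load(conn, staff_no):
    # Translate row -> object and re-check the types by hand.
    row = conn.execute(
        "SELECT staff_no, salary FROM staff WHERE staff_no = ?",
        (staff_no,)).fetchone()
    if row is None:
        return None
    no, salary = row
    if not isinstance(no, str) or not isinstance(salary, int):
        raise TypeError("row does not match Staff")
    return Staff(no, salary)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, salary INTEGER)")

s = Staff("SL21", 30000)
save(conn, s)
s.salary = 31000                  # in-memory change only; the database is unaffected
stale = load(conn, "SL21")        # still sees the old salary
save(conn, s)                     # an explicit write-back is required
fresh = load(conn, "SL21")
```

In a single-level model the assignment to s.salary alone would be expected to reach the stored object, with no save or load calls in application code.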
[Figure: Two-Level Storage Model for RDBMS]
[Figure: Single-Level Storage Model for OODBMS]
Object Data Management Group (ODMG)
• Established by vendors of OODBMSs to define standards.
• The ODMG Standard includes:
– Object Data Model (ODM).
– Object Definition Language (ODL).
– Object Query Language (OQL).
– C++, Smalltalk, and Java Language Bindings.
Main Idea: Host Language = Data Language
• Objects in the host language are mapped directly to
database objects
• Some objects in the host program are persistent. Changing
such objects (through an assignment to an instance variable
or with a method application) directly and transparently
affects the corresponding database object
• Accessing an object using its oid causes an “object fault”
similar to a page fault in operating systems. This
transparently brings the object into the memory and the
program works with it as if it were a regular object
defined, for example, in the host Java program
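The "object fault" idea can be sketched in Python with a proxy that loads its target on first access and writes changes through. This is only an illustration of the mechanism: the Database and PersistentRef classes, the oid string, and the stored attributes are all hypothetical and do not correspond to any ODMG binding.

```python
class Database:
    """Hypothetical stand-in for the object store: maps oid -> stored state."""
    def __init__(self, stored):
        self.stored = stored
        self.faults = 0               # counts how often an object was fetched

    def fetch(self, oid):
        self.faults += 1
        return dict(self.stored[oid])

class PersistentRef:
    """Proxy that loads its target on first access (an 'object fault')."""
    def __init__(self, db, oid):
        # Bypass our own __setattr__ while setting up the proxy's fields.
        object.__setattr__(self, "_db", db)
        object.__setattr__(self, "_oid", oid)
        object.__setattr__(self, "_state", None)

    def _load(self):
        if self._state is None:       # first touch: bring the object into memory
            object.__setattr__(self, "_state", self._db.fetch(self._oid))
        return self._state

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. persistent attributes.
        try:
            return self._load()[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Assignment transparently updates the in-memory copy and the store.
        self._load()[name] = value
        self._db.stored[self._oid][name] = value

db = Database({"oid1": {"name": "White", "salary": 30000}})
p = PersistentRef(db, "oid1")
faults_before = db.faults             # nothing has been loaded yet
first = p.salary                      # triggers the fault and loads the object
p.salary = 31000                      # plain assignment, no explicit save call
second = p.salary                     # served from memory, no new fault
```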
SQL Databases vs. ODMG
• In SQL: Host program accesses the database by
sending SQL queries to it (using JDBC, ODBC,
Embedded SQL, etc.)
• In ODMG: Host program works with database
objects directly
ODMG Data Model
• Distinguishes between objects and pure values (which
are called literals)
• Both can have complex internal structure, but only objects have oids
• Two kinds of classes: “ODMG classes” and “ODMG
interfaces”, similar to Java
– An ODMG interface: only signatures
• does not have its own objects
• cannot inherit from (be a subclass of) an ODMG class – only from
another ODMG interface
– An ODMG class:
• can have attributes, methods with code, own objects
• can inherit from (be a subclass of) other ODMG classes or interfaces
– can have at most one immediate superclass
ODMG Object Model – Built-in Collections
Set: unordered collections without duplicates.
Bag: unordered collections that do allow duplicates.
List: ordered collections that allow duplicates.
Array: 1D array of dynamically varying length.
Dictionary: unordered sequence of key-value pairs with
no duplicate keys.
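For readers who know Python, the built-in collections have rough analogues there. The mapping below is an informal comparison, not part of the ODMG standard, and the sample values are made up. Note that a Python dict happens to preserve insertion order, which the ODMG Dictionary does not promise.

```python
from collections import Counter

majors = {"CS", "Math", "CS"}            # Set: unordered, duplicates collapse
grades = Counter(["A", "B", "A"])        # Bag: unordered, duplicates are counted
waitlist = ["ann", "bob", "ann"]         # List: ordered, duplicates kept
scores = [70, 85]                        # Array: length can vary dynamically
scores.append(90)
phone_book = {"ann": "555-0100"}         # Dictionary: key-value pairs
phone_book["ann"] = "555-0199"           # reusing a key overwrites, never duplicates
```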
More on the ODMG Data Model
• Can specify keys
• Class extents have their own names – this is what
is used in queries
• Distinguishes between relationships and attributes
• Attribute values are literals
• Relationship values are objects
• Only binary relationships supported
• ODMG relationships have little to do with
relationships in the E-R model
ODL: ODMG’s Object Definition
Language
• ODL supports the semantic constructs of the ODMG object model
• ODL is independent of any programming
language
• ODL is used to create object specifications
(classes and interfaces)
• ODL is not used for database manipulation
ODL Examples (1)
A Very Simple Class
• A very simple, straightforward class
definition :
class Degree {
attribute string college;
attribute string degree;
attribute string year;
};
ODL Examples (2)
A Class With Key and Extent
• A class definition with “extent”, “key”, and more
elaborate attributes; still relatively straightforward
class Person (extent persons key ssn) {
attribute struct Pname {string fname …} name;
attribute string ssn;
attribute date birthdate;
…
short age();
};
ODL Examples (3)
A Class With Relationships
• Note extends (inheritance) relationship
• Also note “inverse” relationship
class Faculty extends Person (extent faculty) {
attribute string rank;
attribute float salary;
attribute string phone;
…
relationship Dept works_in inverse Dept::has_faculty;
relationship set<GradStu> advises inverse GradStu::advisor;
void give_raise (in float raise);
void promote (in string new_rank);
};
Referential Integrity
class STUDENT extends PERSON
( extent StudentExt )
{
attribute Set<String> Major;
relationship Set<COURSE> Enrolled
inverse COURSE::Enrollment;
}
class COURSE : Object
( extent CourseExt )
{
attribute Integer CrsCode;
attribute String Department;
relationship Set<STUDENT> Enrollment
inverse STUDENT::Enrolled;
}
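The inverse declarations above mean that Enrolled and Enrollment are two views of the same relationship, so the DBMS keeps them consistent. A minimal Python sketch of what that guarantee amounts to, with the bookkeeping done by hand in two hypothetical helper functions (enroll and drop are not ODMG operations):

```python
class Student:
    def __init__(self, name):
        self.name = name
        self.enrolled = set()         # mirrors STUDENT::Enrolled

class Course:
    def __init__(self, crs_code):
        self.crs_code = crs_code
        self.enrollment = set()       # mirrors COURSE::Enrollment

def enroll(student, course):
    # Update both directions together so neither side can dangle.
    student.enrolled.add(course)
    course.enrollment.add(student)

def drop(student, course):
    student.enrolled.discard(course)
    course.enrollment.discard(student)

joe = Student("Joe")
db_course = Course(553)
enroll(joe, db_course)
both_sides_set = db_course in joe.enrolled and joe in db_course.enrollment
drop(joe, db_course)
both_sides_cleared = not joe.enrolled and not db_course.enrollment
```

With an ODMG system the application would update only one side and rely on the DBMS to maintain the other.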
Object Query Language (OQL)
• OQL is ODMG’s query language
• Provides declarative access to object database using SQL-like syntax.
• Does not provide explicit update operators - leaves this to
operations defined on object types.
• Can be used as a standalone language and as a language
embedded in another language, for which an ODMG
binding is defined (Smalltalk, C++, and Java).
• Embedded OQL statements return objects that are
compatible with the type system of the host language
Object Query Language (OQL)
• An OQL query is a function that delivers an object whose
type may be inferred from the operators contributing to the query
expression.
• A query definition expression has the form:
DEFINE Q AS e
• Defines a query with name Q given the query expression e.
Example OQL: Extents & Traversal Paths
Get set of all faculty (with identity)
faculty
Get set of all enrollments(with identity)
CourseExt.Enrollment
Example schema
class Branch (extent branchOffices key branchNo)
{
attribute string branchNo;
….
relationship Manager ManagedBy
inverse Manager::Manages;
void takeOnPropertyForRent(in string propertyNo);
}
Example (cont.)
class Person {
attribute struct Pname {string fName, string lName}
name;
}
class Staff extends Person
(extent staff
key staffNo)
{
attribute string staffNo;
attribute date DOB;
….
short getAge();
}
Example (cont.)
class Manager extends Staff
(extent managers)
{
relationship Branch Manages
inverse Branch::ManagedBy;
}
Example OQL: Extents & Traversal Paths
Find all branches in London
DEFINE londonBranches AS
SELECT b.branchNo
FROM b IN branchOffices
WHERE b.address.city = "London";
This returns a literal of type bag<string>.
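The select-from-where query above reads much like a comprehension over the extent. A rough Python analogue, assuming made-up branch data and a Branch class that has an address with a city (the slide schema elides those attributes):

```python
from dataclasses import dataclass

@dataclass
class Address:
    city: str

@dataclass
class Branch:
    branchNo: str
    address: Address

# Hypothetical contents of the extent branchOffices.
branchOffices = [
    Branch("B003", Address("Glasgow")),
    Branch("B005", Address("London")),
    Branch("B007", Address("London")),
]

# SELECT b.branchNo FROM b IN branchOffices WHERE b.address.city = "London"
# A Python list keeps duplicates, which loosely matches the bag result type.
london_branches = [b.branchNo for b in branchOffices
                   if b.address.city == "London"]
```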
Example OQL: Extents & Traversal Paths
Find all staff who work at London branches.
londonBranches.Has
This returns set<SalesStaff>.
Example OQL: Use of structures
Get structured set (without identity) containing name,
sex, and age of all staff who live in London.
SELECT struct (lName:s.name.lName, sex:s.sex,
age:s.age)
FROM s IN staff
WHERE s.WorksAt.address.city = "London"
This returns a literal of type set<struct>.
Example OQL: Use of structures
Get structured set (without identity) containing
branch number and set of all Assistants at branches in
London.
SELECT struct (branchNo:x.branchNo, assistants:
(SELECT y FROM y IN x.WorksAt
WHERE y.position = "Assistant"))
FROM x IN (SELECT b FROM b IN branchOffices
WHERE b.address.city = "London")
This returns a literal of type set<struct>.
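The nested query above can be read as a comprehension inside a comprehension. The Python sketch below mirrors its shape under simplifying assumptions: the data is made up, city is flattened onto the branch instead of going through address, and WorksAt is used as the name of the branch-to-staff collection exactly as the query writes it.

```python
from collections import namedtuple

Staff = namedtuple("Staff", "name position")
Branch = namedtuple("Branch", "branchNo city WorksAt")
Row = namedtuple("Row", "branchNo assistants")    # plays the role of struct(...)

# Hypothetical contents of the extent branchOffices.
branchOffices = [
    Branch("B003", "Glasgow", [Staff("Ford", "Assistant")]),
    Branch("B005", "London", [Staff("White", "Manager"),
                              Staff("Lee", "Assistant")]),
]

# Inner FROM subquery: branches in London.
london = [b for b in branchOffices if b.city == "London"]

# Outer SELECT: one struct per branch, holding a nested collection.
result = [
    Row(x.branchNo, [y for y in x.WorksAt if y.position == "Assistant"])
    for x in london
]
```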
OQL - Creating Objects
A type name constructor is used to create an object
with identity.
Manager(staffNo: "SL21",
fName: "John", lName: "White",
address: "19 Taylor St, London",
position: "Manager", sex: "M",
DOB: date"1945-10-01", salary: 30000)
ORDBMS
Merging Relational and Object
Models
• Object-oriented models support interesting
data types --- not just flat files.
– Maps, multimedia, etc.
• The relational model supports very-high-level queries.
• Object-relational databases are an attempt to
get the best of both.
Evolution of DBMS’s
• Object-oriented DBMS’s failed because
they did not offer the efficiencies of well-entrenched relational DBMS’s.
• Object-relational extensions to relational
DBMS’s capture many of the advantages of
OO, yet retain the relation as the
fundamental abstraction.
ORDBMSs
• Vendors of RDBMSs conscious of threat and
promise of OODBMS.
• Agree that RDBMSs not currently suited to advanced
database applications, and added functionality is
required.
• Can remedy shortcomings of relational model by
extending model with OO features.
ORDBMSs - Features
• OO features being added include:
– user-extensible types,
– encapsulation,
– inheritance,
– polymorphism,
– dynamic binding of methods,
– complex objects including non-1NF objects,
– object identity.
[Figure: Stonebraker’s View]
Objects in SQL:1999
• Object-relational extension of SQL-92
• Includes the legacy relational model
• SQL:1999 database = a finite set of relations
• relation = a set of tuples (extends legacy relations)
OR a set of objects (completely new)
• object = (oid, tuple-value)
• tuple = tuple-value
• tuple-value = [Attr1: v1, …, Attrn: vn]
SQL:1999 Tuple Values
• Tuple value: [Attr1: v1, …, Attrn: vn]
– Attri are all distinct attributes
– Each vi is one of these:
• Primitive value: a constant of type CHAR(…),
INTEGER, FLOAT, etc.
• Reference value: an object Id
• Another tuple value
• A collection value
Only the ARRAY construct is supported: a fixed-size array.
SETOF and LISTOF are not supported.
Row Types
• The same as the original (legacy) relational tuple type.
However:
– Row types can now be the types of the individual attributes in
a tuple
CREATE TABLE PERSON (
Name CHAR(20),
Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5))
)
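A row-typed attribute behaves much like a nested record. The Python sketch below models PERSON with two hypothetical dataclasses (AddressRow and PersonRow are illustrative names, and the sample rows are made up) and shows the dotted path into the nested row, both for filtering and for updating one component.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class AddressRow:                 # stands in for ROW(Number, Street, ZIP)
    Number: int
    Street: str
    ZIP: str

@dataclass(frozen=True)
class PersonRow:                  # stands in for one tuple of PERSON
    Name: str
    Address: AddressRow

person = [
    PersonRow("John Doe", AddressRow(666, "Hollow Rd.", "66666")),
    PersonRow("J. Public", AddressRow(123, "Maple Dr.", "55555")),
]

# Path expression in a filter: P.Address.ZIP
names = [p.Name for p in person if p.Address.ZIP == "66666"]

# Updating one component of the nested row: SET Address.ZIP = '66666'
person = [
    replace(p, Address=replace(p.Address, ZIP="66666"))
    if p.Address.ZIP == "55555" else p
    for p in person
]
zips = [p.Address.ZIP for p in person]
```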
Row Types (Contd.)
• Use path expressions to refer to the components of row types:
SELECT P.Name
FROM PERSON P
WHERE P.Address.ZIP = '11794'
• Update operations:
INSERT INTO PERSON(Name, Address)
VALUES ('John Doe', ROW(666, 'Hollow Rd.', '66666'))
UPDATE PERSON
SET Address.ZIP = '66666'
WHERE Address.ZIP = '55555'
UPDATE PERSON
SET Address = ROW(21, 'Main St', '12345')
WHERE Address = ROW(123, 'Maple Dr.', '54321') AND Name = 'J. Public'
User Defined Types (UDT)
• UDTs allow specification of complex objects/tuples,
methods, and their implementation
• Like ROW types, UDTs can be types of individual
attributes in tuples
• UDTs can be much more complex than ROW types
(even disregarding the methods): the components of
UDTs do not need to be elementary types
A UDT Example
CREATE TYPE PersonType AS (
Name CHAR(20),
Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5))
);
CREATE TYPE StudentType UNDER PersonType AS (
Id INTEGER,
Status CHAR(2)
)
METHOD award_degree() RETURNS BOOLEAN;
CREATE METHOD award_degree() FOR StudentType
LANGUAGE C
EXTERNAL NAME 'file:/home/admin/award_degree'; -- file that holds the binary code
Using UDTs in CREATE TABLE
• As an attribute type:
CREATE TABLE TRANSCRIPT (
Student StudentType, -- a previously defined UDT
CrsCode CHAR(6),
Semester CHAR(6),
Grade CHAR(1)
)
• As a table type:
CREATE TABLE STUDENT OF StudentType;
Such a table is called a typed table.
Objects
• Only typed tables contain objects (i.e. tuples with oids)
• Compare:
CREATE TABLE STUDENT OF StudentType;
and
CREATE TABLE STUDENT1 (
Name CHAR(20),
Address ROW(Number INTEGER, Street CHAR(20), ZIP CHAR(5)),
Id INTEGER,
Status CHAR(2)
)
• Both contain tuples of exactly the same structure
• Only the tuples in STUDENT – not STUDENT1 – have oids.
• This disparity is motivated by the need to stay backward
compatible with SQL-92.
Querying UDTs
• Nothing special – just use path expressions
SELECT T.Student.Name, T.Grade
FROM TRANSCRIPT T
WHERE T.Student.Address.Street = 'Main St.'
Note: T.Student has the type StudentType. The attribute Name is
not declared explicitly in StudentType, but is inherited from
PersonType.
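The inheritance point in the note can be illustrated with Python dataclasses. This is an analogy only: the class names follow the UDTs above, Address is flattened to a single Street field for brevity, and the sample values are made up. Name is usable on a StudentType value even though the subtype does not declare it.

```python
from dataclasses import dataclass

@dataclass
class PersonType:
    Name: str
    Street: str                   # simplified stand-in for the Address row

@dataclass
class StudentType(PersonType):    # plays the role of UNDER PersonType
    Id: int
    Status: str

s = StudentType("Joe Public", "Main St.", 111111111, "G5")

declared_in_subtype = list(StudentType.__annotations__)   # only Id and Status
inherited_name = s.Name                                   # comes from PersonType
```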
Updating User-Defined Types
• Inserting a record into TRANSCRIPT:
INSERT INTO TRANSCRIPT(Student,CrsCode,Semester,Grade)
VALUES (????, 'CS308', '2000', 'A')
– The type of the Student attribute is StudentType. How
does one insert a value of this type (in place of ????)?
– Further complication: the UDT StudentType is
encapsulated, i.e., it is accessible only through public
methods, which we did not define
– Do it through the observer and mutator methods
provided by the DBMS automatically
Observer Methods
• For each attribute A of type T in a UDT, an SQL:1999 DBMS is supposed to
supply an observer method, A: ( ) → T, which returns the value of A (the notation
“( )” means that the method takes no arguments)
• Observer methods for StudentType:
• Id: ( ) → INTEGER
• Name: ( ) → CHAR(20)
• Status: ( ) → CHAR(2)
• Address: ( ) → ROW(INTEGER, CHAR(20), CHAR(5))
• For example, in
SELECT T.Student.Name, T.Grade
FROM TRANSCRIPT T
WHERE T.Student.Address.Street = 'Main St.'
Name and Address are observer methods, since T.Student is of type StudentType
Note: Grade is not an observer, because TRANSCRIPT is not part of a UDT
Mutator Methods
• An SQL:1999 DBMS is supposed to supply, for each attribute A
of type T in a UDT U, a mutator method
A: T → U
For any object o of type U, it takes a value t of type T
and replaces the old value of o.A with t; it returns the
new value of the object. Thus, o.A(t) is an object of type U
• Mutators for StudentType:
• Id: INTEGER → StudentType
• Name: CHAR(20) → StudentType
• Address: ROW(INTEGER, CHAR(20), CHAR(5)) → StudentType
Example: Inserting a UDT Value
INSERT INTO TRANSCRIPT(Student,CrsCode,Semester,Grade)
VALUES (
NEW StudentType( ) -- create a blank StudentType object
.Id(111111111) -- add a value for Id
.Status('G5') -- add a value for Status
.Name('Joe Public')
.Address(ROW(123, 'Main St.', '54321')), -- add a value for the Address attribute
'CS532',
'S2002',
'A'
)
'CS532', 'S2002', 'A' are primitive values for the attributes CrsCode, Semester, Grade
Example: Changing a UDT Value
UPDATE TRANSCRIPT
SET Student = Student.Address(ROW(21, 'Maple St.', '12345')) -- change Address
.Name('John Smith'), -- change Name
Grade = 'B'
WHERE Student.Id = 111111111 AND CrsCode = 'CS532' AND Semester = 'S2002'
• Mutators are used to change the values of the attributes Address
and Name
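The observer/mutator convention can be mimicked in Python to see why the calls chain. The class below is a hypothetical model, not how any DBMS implements UDTs: each attribute gets one method that acts as an observer when called with no argument and as a mutator when called with one, and the mutator returns the new value of the object, which is what makes the dotted chains in the INSERT and UPDATE examples work.

```python
class StudentType:
    def __init__(self, **values):
        # A "blank" object has every attribute unset.
        self._values = {"Id": None, "Name": None, "Status": None, "Address": None}
        self._values.update(values)

    def _access(self, attr, args):
        if not args:                          # observer  A: () -> T
            return self._values[attr]
        (new_value,) = args                   # mutator   A: T -> U
        return StudentType(**{**self._values, attr: new_value})

    def Id(self, *args):
        return self._access("Id", args)

    def Name(self, *args):
        return self._access("Name", args)

    def Status(self, *args):
        return self._access("Status", args)

    def Address(self, *args):
        return self._access("Address", args)

# Mirrors NEW StudentType().Id(...).Status(...).Name(...).Address(...)
s = (StudentType()
     .Id(111111111)
     .Status("G5")
     .Name("Joe Public")
     .Address((123, "Main St.", "54321")))

# Mirrors Student.Address(...).Name(...) in the UPDATE
s2 = s.Address((21, "Maple St.", "12345")).Name("John Smith")
```

In this sketch a mutator returns a fresh object and leaves the original untouched; that is a design choice of the illustration, chosen to match the value-like behaviour of a UDT stored in a column.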
Referencing Objects
• Consider again
CREATE TABLE TRANSCRIPT (
Student StudentType,
CrsCode CHAR(6),
Semester CHAR(6),
Grade CHAR(1)
)
• Problem: TRANSCRIPT records for the same student refer to distinct
values of type StudentType (even though the contents of these
values may be the same) – a maintenance/consistency problem
• Solution: use self-referencing column
– Bad design, which distinguishes objects from their references
– Not truly object-oriented
Self-Referencing Column
• Every typed table has a self-referencing column
– Normally invisible
– Contains explicit object Id for each tuple in the table
– Can be given an explicit name – the only way to enable
referencing of objects
CREATE TABLE STUDENT2 OF StudentType
REF IS stud_oid; -- self-referencing column
Self-referencing columns can be used in queries just like regular columns
Their values cannot be changed, however
Reference Types and Self-Referencing Columns
• To reference objects, use self-referencing columns + reference
types: REF(some-UDT)
CREATE TABLE TRANSCRIPT1 (
Student REF(StudentType) SCOPE STUDENT2, -- reference type; STUDENT2 is the typed table the values are drawn from
CrsCode CHAR(6),
Semester CHAR(6),
Grade CHAR(1)
)
• Two issues:
• How does one query the attributes of a reference type
• How does one provide values for the attributes of type REF(…)
– Remember: you can’t manufacture these values out of thin air – they are oids!
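A small Python model may help with both issues. It is a sketch under stated assumptions: TypedTable, insert, deref, and find_oid are hypothetical names, and the rows are made up. A typed table hands out system-generated oids, a REF column stores such an oid, and the only way to obtain one is to look it up in the typed table.

```python
import itertools

class TypedTable:
    """Hypothetical model of a typed table: every row gets a generated oid."""
    def __init__(self):
        self._rows = {}
        self._next_oid = itertools.count(1)

    def insert(self, row):
        oid = next(self._next_oid)        # oids are generated, never supplied
        self._rows[oid] = row
        return oid

    def deref(self, oid):
        return self._rows[oid]            # follow a reference to its object

    def find_oid(self, pred):
        # Look up the oid of the first row satisfying pred.
        return next(oid for oid, row in self._rows.items() if pred(row))

student2 = TypedTable()
student2.insert({"Id": 111111111, "Name": "Joe Public"})

# Values for a REF column come from a query over the typed table.
ref = student2.find_oid(lambda r: r["Id"] == 111111111)
transcript1 = [
    {"Student": ref, "CrsCode": "CS532", "Grade": "A"},
    {"Student": ref, "CrsCode": "CS308", "Grade": "B"},
]

# Both records share one student object, so one change is seen by both.
student2.deref(ref)["Name"] = "Joseph Public"
names = [student2.deref(t["Student"])["Name"] for t in transcript1]
```

This is the consistency benefit over embedding a separate StudentType value in every TRANSCRIPT row.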
Querying Reference Types
• Recall: Student REF(StudentType) SCOPE STUDENT2 in TRANSCRIPT1.
How does one access, for example, student names?
• SQL:1999 has the same misfeature as C/C++ (and which Java and
OQL do not have): it distinguishes between objects and references to
objects. To pass through a boundary of REF(…) use “->” instead of “.”
SELECT T.Student->Name, T.Grade
FROM TRANSCRIPT1 T
WHERE T.Student->Address.Street = 'Main St.'
– Crossing a REF(…) boundary (Student to Name, Student to Address): use “->”
– Not crossing a REF(…) boundary (Address to Street): use “.”
Inserting REF Values
• How does one give values to REF attributes, like Student in
TRANSCRIPT1?
• Use explicit self-referencing columns, like stud_oid in STUDENT2
• Example: Creating a TRANSCRIPT1 record whose Student attribute has
an object reference to an object in STUDENT2:
INSERT INTO TRANSCRIPT1(Student,CrsCode,Semester,Grade)
SELECT S.stud_oid, 'HIS666', 'F1462', 'D' -- stud_oid: explicit self-referencing column of STUDENT2
FROM STUDENT2 S
WHERE S.Id = 111111111
Modifications to support ORDBMS
• Parsing
– Type-checking for methods pretty complex.
• Query Rewriting
– Often useful to turn path exprs into joins!
• Optimization
– New algebra operators needed for complex types.
• Must know how to integrate them into optimization.
– WHERE clause exprs can be expensive!
• Selection pushdown may be a bad idea.
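One way to see why expensive WHERE clause expressions change the picture: when a predicate calls a costly method, the order in which predicates are tested matters. The Python sketch below orders predicates by a rank of (selectivity - 1) / cost, a heuristic from the research literature on expensive predicates. The function name, the cost and selectivity figures, and the sample rows are all assumptions made for illustration.

```python
def filter_rows(rows, predicates):
    """predicates: list of (function, estimated cost, estimated selectivity)."""
    def rank(p):
        _, cost, selectivity = p
        return (selectivity - 1.0) / cost     # most negative rank runs first

    ordered = sorted(predicates, key=rank)
    out = []
    for row in rows:
        # all() short-circuits, so later predicates are skipped on failure.
        if all(fn(row) for fn, _, _ in ordered):
            out.append(row)
    return out

calls = {"expensive": 0}

def cheap(row):
    return row["zip"] == "11794"

def expensive(row):                # stands in for a costly user-defined method
    calls["expensive"] += 1
    return len(row["image"]) > 2

rows = [
    {"zip": "11794", "image": "abcd"},
    {"zip": "00000", "image": "abcdef"},
    {"zip": "11794", "image": "a"},
]

# Listed expensive-first on purpose; the ranking moves the cheap test ahead.
result = filter_rows(rows, [(expensive, 1000.0, 0.5), (cheap, 1.0, 0.1)])
```

Here the costly predicate runs only on the two rows that survive the cheap test, rather than on all three.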
Modifications (Contd.)
• Execution
– New algebra operators for complex types.
– OID generation & reference handling.
– Dynamic linking.
– Support “untrusted” methods.
– Support objects bigger than 1 page.
Modifications (Contd.)
• Access Methods
– Indexes on methods, not just columns.
– Need indexes for new WHERE clause exprs
(not just <, >, =)!
• Data Layout
– Clustering of nested objects.
– Chunking of arrays.
OO/OR-DBMS Summary
• Traditional SQL is too limited for new apps.
• OODBMS: Persistent OO programming.
– Difficult to use, no query language.
• ORDBMS: Best (?) of both worlds:
– Catching on in industry and applications.
– Pretty easy for SQL folks to pick up.
– Still has growing pains (SQL-3 standard still a
moving target).
Summary (Contd.)
• ORDBMS offers many new features.
– But not clear how to use them!
– Schema design techniques not well understood
– Query processing techniques still in research
phase.
• A moving target for OR DBA’s!