Methods for Indicating Persistence

Persistence and Objects

Group presentation given by

Barry Myles

Gary Stewart

Stephanie Dunsire
Introduction

This presentation gives an in-depth
view of persistence.

Coupling the programming power of an object-oriented
language such as Java or C++ with the data storage and
manipulation mechanisms of a relational database gives rise to
difficulties.

One cause is the different characteristics of the two when handling data:

Programming languages work with transient data, whereas relational
databases tend to work more with persistent data.

Another cause is their differing modelling paradigms:

A relational model cannot be mapped directly to an OO model.
What is Persistence

Persistence has different meanings.

With respect to a single piece of data:


“With respect to a piece of data it means the length of time for
which that datum exists.” [Ref: Object Database and ODMG
approach - Richard Cooper]
With respect to an information system:

"A persistent programming language (PPL) is a programming language that
includes a persistent memory area (e.g., a heap of objects) that outlives the
execution of any individual program." [Ref: Working with Persistent
Objects - J. Eliot B. Moss]
Types of Persistence


How do programs write data to
secondary storage?
The three types of persistence are:



Session Persistence
File Persistence
Orthogonal Persistence
Types of Persistence



Session persistence allows the state of a program to be
resumed from a snapshot of the system.
It is used where the entire state must be restored and/or where the
workspace size is small.
Examples:
Windows hibernation mode
Smalltalk implementations
Machine emulators

These are able to save the state of the machine they are emulating
Types of Persistence


Session Persistence (contd.)
Problems:

It is not designed for multiple users modifying the same
system.
The user has no control over what is and is not saved.
Because it cannot allow concurrent users to
access the same workspace, this approach is
not suitable for database systems.
Types of Persistence

File Persistence



The most widely used format for storing data.
The traditional way of making programming
languages persistent.
Extracts the data from the program and then
writes it off to secondary storage.

The data will normally have to be converted again before it can be placed
back in the program
Types of Persistence

File Persistence (contd.)

Problems:

Conversion of data to and from the application can be expensive
in both processor time and designer time.
Open access to the file may lead to other programs corrupting it,

although access to a file in this manner is not always bad practice;
it can be useful.
Because there are multiple models for the data, one for
internal data and another for files, the
simplicity of a single-model view is lost.
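
As a rough illustration of the conversion step described above, the following sketch writes objects to a file and reads them back. The `Student` class, the JSON file format and the file name are assumptions made for this example only; the point is that the program has to translate between its own model and the file's model in both directions.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class Student:
    # Hypothetical in-memory model, used only for illustration.
    name: str
    course: str


def save_students(students, path):
    # Convert each object to the file's model (plain dictionaries) and write it.
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(s) for s in students], f)


def load_students(path):
    # Read the file's model and convert it back into program objects.
    with open(path, encoding="utf-8") as f:
        return [Student(**record) for record in json.load(f)]


path = os.path.join(tempfile.mkdtemp(), "students.json")
original = [Student("Ann", "Databases"), Student("Bob", "Networks")]
save_students(original, path)
restored = load_students(path)
```

The restored objects are equal in value to the originals but are new objects, and any other program that can open the file can read or alter it.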
Types of Persistence

Orthogonal Persistence

Concerned with storing the data in the same structure
on disk as it had in memory.

Allows any memory data to be stored without
transformation.

If a class is marked as persistent the program will also make all
the data in the lower classes persistent too. This maintains
referential integrity.
The same logical data structure exists in main memory as it
does on hard disk.
Allows any data value to be made persistent or
transient irrespective of type.
Types of Persistence

Orthogonal Persistence (contd.)




Only the areas of code or data that need to be
addressed are brought into memory; the rest
remain on disk until required.
The data model of the database can represent the
class structure.
The database itself represents the class hierarchy.
This results in the persistent store being a single
representation of the data.
Distinguishing Transient and
Persistent Data


Implementation varies considerably between the
different OODBMSs.
The two main criteria are:

How little effort it takes for the programmer to make
objects persistent
The efficiency of the storage of data in the system
A gain on one of these criteria may come at the
cost of performance or functionality on the
other.
Marking Persistence

An object can be marked as persistent in one of
two ways:

It is explicitly declared as persistent by the programmer or
programming language
It is reachable from another persistent object, and therefore must
persist; otherwise there would be a dangling reference
All objects that are not part of the set above are
transient.
Not all languages support persistence by reachability;
those that do not either force the programmer to check all
references or allow dangling references.
Marking Persistence—Explicitly

Multiple methods of marking persistence are used.

Some systems may allow more than one method.
There are seven main methods:







Named Root Objects
Persistent Classes
Persistence Declared at Object Creation
Persistent Root Class
Persistent Shadow Classes
Persistence By Explicit Storage
System–Provided Persistent Roots
Persistent Shadow Classes

A persistent version is created automatically for
every class that exists.

The user can then use either the persistent class or the
transient class to create an instance.
Once an object is initialised as one type it cannot be changed to
the other (though it should be possible to copy the
object across).
This ensures that all classes in the language are
persistence-capable.
Persistence By Explicit Storage

This provides a method that can be called to store an
object at any time, thereby making it persistent.

It is similar to file persistence but, unlike file persistence,
should be orthogonal.
It has the run-time cost of copying the requested memory into
the persistent store.
It is one of the most flexible systems, and should allow all objects to
be made persistent.
It supports the 'upgrading' of transient objects into persistent
objects.
System–Provided Persistent Roots

This provides a mechanism for adding objects to a
persistent store





The persistent store has some method allowing the user to
add an object to it
The object, by being in this store, is persistent
The store should allow all object types to be added hence all
objects can be made persistent at run-time and 'upgraded'
from transient objects
The System-Provided Persistent Roots should be globally
available
ODMG uses this for its Java and Smalltalk
implementations.
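
A minimal sketch of the idea, assuming a purely in-memory stand-in: `PersistentStore`, `add`, `contains` and `STORE` are hypothetical names invented for illustration and are not the ODMG interface, and nothing here is written to disk. It only shows how membership of a globally available store is what makes an object count as persistent.

```python
class PersistentStore:
    """Hypothetical stand-in for a system-provided persistent root."""

    def __init__(self):
        self._members = []

    def add(self, obj):
        # Adding an object is what 'upgrades' it from transient to persistent.
        if not self.contains(obj):
            self._members.append(obj)

    def contains(self, obj):
        # Membership is by identity, so any type of object can be added.
        return any(member is obj for member in self._members)


# One globally available store (an assumption of this sketch).
STORE = PersistentStore()

note = {"text": "created as an ordinary transient object"}
other = [1, 2, 3]

STORE.add(note)   # note is now a member of the store; other stays transient
```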
Persistence By Reachability



This means that if an object is persistent, all objects
that are referred to by that object must also be
made persistent, or references would become
invalid.
Explicitly persistent objects therefore dictate what is
made persistent from them; however, a cascading
effect may occur.
This should be handled entirely by the system, but the
system may require the programmer to confirm what is to be
made persistent.
Persistence By Reachability

Persistence by reachability provides:



Referential integrity in the system, by ensuring that objects
that are referred to by a persistent object are retained
A reduction of workload for the programmer, as they
no longer have to state explicitly which objects are to
be made persistent
It may, however, store unwanted data that will
increase the size of the database.

Programmers might be able to denote fields in objects
that are never to be made persistent (the transient
keyword in Java)
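
The cascading effect can be sketched as a walk over the object graph. The `Node` class, its `transient` set and the function `persistent_closure` are assumptions made for this example and do not belong to any particular OODBMS; the fields named in `transient` play the role that the transient keyword plays in Java.

```python
class Node:
    # Hypothetical object: a name, named references to other objects, and
    # the names of fields that are never to be made persistent.
    def __init__(self, name):
        self.name = name
        self.refs = {}
        self.transient = set()


def persistent_closure(roots):
    """Return the names of all objects that must persist with the roots."""
    seen = set()
    reached = []
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        reached.append(obj.name)
        for field, target in obj.refs.items():
            if field not in obj.transient:   # skip fields marked transient
                pending.append(target)
    return sorted(reached)


student = Node("student")
course = Node("course")
lecturer = Node("lecturer")
cache = Node("cache")

student.refs = {"course": course, "cache": cache}
student.transient = {"cache"}
course.refs = {"lecturer": lecturer}

stored = persistent_closure([student])
```

Only `student` is declared persistent, yet `course` and `lecturer` are stored with it, while `cache` is left out because the field that refers to it is marked transient.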
Storing Code and Data


Storing code and data is important because databases can hold complex
data.
Figure 1 shows four file systems; code is shown as ovals, while data
is shown as rectangles:
Figure 1





System (a) in Figure 1 is an unprotected file system; however, it offers a
flexible environment in which code and data can be freely mixed.
System (b) in Figure 1 is a traditional DBMS that controls part of the
file system, which it uses to store and protect the data. Application
code remains outside the control of the DBMS.
System (c) in Figure 1 is an orthogonally persistent language, which
allows code to be protected as well, and allows data and code to be freely
mixed.
System (d) in Figure 1 is an OODBMS file system that provides a
class structure in which the mixture of code and data is controlled.
Systems that exclude the program code from the structure of the
database will often fall into disuse, as they make the creation and
maintenance of applications exceptionally costly and error-prone.
Removing Data
An important facility a database must provide is a method of
removing unwanted data. There are various methods of removing data,
and which one is used depends on the system:


Explicit Deletion of Data: Systems can allow explicit deletion of
objects and classes. Some systems allow this because it is believed that the
programmer can achieve the most efficient storage of data. This
method can avoid integrity violations: in C++, for example, an
operation called a destructor can be used to
remove data explicitly without causing a deletion violation.
Garbage Collection: Systems can also provide an automatic method
of removing data called garbage collection. A popular use of garbage
collection is in an OODBMS that uses persistence through
reachability.
Types of Garbage Collection

Mark-and-Sweep

This involves two stages. Stage one, Mark: this starts at the persistent
root objects and marks everything referred to by the roots as
required data, and then recursively marks further objects
referred to by already marked objects until all reachable objects are
marked. Stage two, Sweep: this removes unmarked data so that the
space is freed and returned to the free pool for future
reallocation. The problem with the mark-and-sweep method of
garbage collection is that it has a tendency to fragment memory.
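
The two stages can be sketched as follows. This is a toy model under stated assumptions: the heap is a plain Python list, and `Obj`, `mark` and `sweep` are names invented for the example rather than part of any real collector.

```python
class Obj:
    # Hypothetical heap object with outgoing references and a mark bit.
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    # Stage one: mark everything reachable from the persistent roots.
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if not obj.marked:
            obj.marked = True
            pending.extend(obj.refs)


def sweep(heap):
    # Stage two: keep marked objects, drop the rest, and clear the marks
    # so that the next collection starts from a clean state.
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live


root, a, b, c = Obj("root"), Obj("a"), Obj("b"), Obj("c")
root.refs = [a]
b.refs = [c]      # b and c refer to each other but are unreachable
c.refs = [b]

heap = [root, a, b, c]
mark([root])
heap = sweep(heap)
survivors = [obj.name for obj in heap]
```

Because the survivors stay where they are, the freed slots are scattered between them, which is the fragmentation problem noted above.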

Stop-and-Copy

This method both collects garbage and de-fragments memory. The
memory is divided into two regions, an
active region and an inactive region. The stop aspect of the
method starts when the memory in the active region is
exhausted; the copy aspect of the method copies all of the live
objects from the active region to the inactive region. The final
step is to switch the active and inactive regions. As this method
only copies the live objects, any objects left behind are garbage and
are deleted. The problem with this method is the cost of
doubling the size of the heap (i.e. active and inactive regions).
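
A small sketch of the copy step, assuming each region is a Python list and each reference is an index into its region. The function name `stop_and_copy` and the dictionary layout of objects are hypothetical choices made for this example.

```python
def stop_and_copy(active, roots):
    """Copy live objects out of `active`; return (new_region, new_roots)."""
    inactive = []     # the region that becomes active after the switch
    forward = {}      # old index -> new index, for objects already copied

    def copy(old):
        if old not in forward:
            forward[old] = len(inactive)
            source = active[old]
            # References are copied as old indices and fixed during the scan.
            inactive.append({"name": source["name"], "refs": list(source["refs"])})
        return forward[old]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(inactive):
        obj = inactive[scan]
        obj["refs"] = [copy(r) for r in obj["refs"]]   # copy and redirect
        scan += 1
    return inactive, new_roots


active = [
    {"name": "g1", "refs": [2]},      # garbage
    {"name": "root", "refs": [3]},
    {"name": "g2", "refs": [0]},      # garbage
    {"name": "child", "refs": [1]},
]
region, roots = stop_and_copy(active, [1])
```

The live objects end up next to each other in the new region, and whatever was not copied is simply abandoned with the old one.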

Reference Counting

This method involves every object keeping a count of all
references made to it. Every time a new reference is made
this total is incremented, and every time a reference is removed
the total is decremented. If the total is zero the object is
unreachable, so it is garbage and can be removed. There are two
main problems with this method: the counts and their maintenance
take up space, and there is no guarantee that all unreachable
objects will have a reference count of zero (for example, objects in a
cycle that refer to each other).
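
Both the counting and the second problem can be seen in a small sketch. The class `Counted` and the helpers `add_ref`, `remove_ref` and `release` are hypothetical names for this example; the root is given a count of one by hand to stand for a reference held by the system.

```python
class Counted:
    # Hypothetical object that carries its own reference count.
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.refs = []


def add_ref(source, target):
    source.refs.append(target)
    target.count += 1


def release(obj, freed):
    # Drop one reference; free the object when no references remain.
    obj.count -= 1
    if obj.count == 0:
        freed.append(obj.name)
        for child in list(obj.refs):
            release(child, freed)
        obj.refs.clear()


def remove_ref(source, target, freed):
    source.refs.remove(target)
    release(target, freed)


root, a, b, x, y = (Counted(n) for n in ("root", "a", "b", "x", "y"))
root.count = 1            # assumed reference held by the system
add_ref(root, a)
add_ref(a, b)
add_ref(root, x)
add_ref(x, y)
add_ref(y, x)             # x and y form a cycle

freed = []
remove_ref(root, a, freed)    # a drops to zero, which in turn frees b
remove_ref(root, x, freed)    # x is still referenced by y, so nothing is freed
```

After the second removal, `x` and `y` are unreachable from the root, yet each still has a count of one and is never reclaimed.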

Mark-and-Compact

This method again consists of two phases. The first phase, mark,
starts at the persistent root objects and marks everything
referred to by the roots as required data, and then recursively
marks further objects referred to by already marked objects.
The second phase, compaction, collects
anything that remains as garbage and compacts the memory by
moving all the live objects into contiguous memory locations.
This method eliminates fragmentation and does not incur the
cost of additional space.
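
A sketch of the two phases, assuming the heap is a Python list that is compacted in place and that references are indices into it. The function name `mark_and_compact` and the dictionary layout of objects are assumptions of this example.

```python
def mark_and_compact(heap, roots):
    """Compact `heap` in place and return the new positions of the roots."""
    # Phase one: mark every slot reachable from the roots.
    marked = set()
    pending = list(roots)
    while pending:
        index = pending.pop()
        if index not in marked:
            marked.add(index)
            pending.extend(heap[index]["refs"])

    # Phase two: give each live object its new address, preserving order.
    new_addr = {}
    for old in sorted(marked):
        new_addr[old] = len(new_addr)

    # Slide the live objects down, redirecting their references as they move.
    # A new address is never above the old one, so nothing live is overwritten.
    for old in sorted(marked):
        obj = heap[old]
        obj["refs"] = [new_addr[r] for r in obj["refs"]]
        heap[new_addr[old]] = obj
    del heap[len(marked):]
    return [new_addr[r] for r in roots]


heap = [
    {"name": "g1", "refs": []},       # garbage
    {"name": "root", "refs": [3]},
    {"name": "g2", "refs": [1]},      # garbage, even though it refers to root
    {"name": "child", "refs": [1]},
]
roots = mark_and_compact(heap, [1])
```

No second region is needed here, at the price of extra passes over the heap to compute addresses and move objects.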
Example Garbage Collection

An example of garbage collection is a course being cancelled at
Napier. All students on this course can either be relocated to other
courses or leave (in this scenario we will ignore the students that
leave). The course attribute of each student object is changed to
the student's new course. Once all objects representing the members of the
course have been updated, there are no references left from the student
objects to the old course object, so it can be removed from the object
holding the set of courses. As there are then no remaining references
to the object, it becomes unusable data (garbage). Figure 2 shows an object
graph with garbage and non-garbage aspects:
Figure 2
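
The scenario can be sketched in a few lines. The classes, the course titles and the helper `live_titles` are made up for this example, and the set of courses is modelled as a plain list.

```python
class Course:
    def __init__(self, title):
        self.title = title


class Student:
    def __init__(self, name, course):
        self.name = name
        self.course = course


def live_titles(courses, students):
    # A course is still referenced if the set of courses holds it
    # or if any student's course attribute points at it.
    titles = {course.title for course in courses}
    titles.update(student.course.title for student in students)
    return titles


cancelled = Course("Cancelled course")
running = Course("Running course")
courses = [cancelled, running]
students = [Student("Ann", cancelled), Student("Bob", cancelled)]

before = live_titles(courses, students)

for student in students:
    student.course = running      # relocate every student
courses.remove(cancelled)         # drop it from the set of courses
# (The local name `cancelled` still refers to the object in this script;
# a persistent store would hold no such extra reference.)

after = live_titles(courses, students)
```

Once the students have moved and the course has left the set, nothing in the object graph refers to it, so a collector is free to reclaim it.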
References
ODMG 2.0: A Standard for Object Storage by Doug Barry
Component Strategies (July 1998)
http://www.odmg.org/library/readingroom/Article%20%20Components%20Strategies%20-%20July98.html

Garbage Collection In Java, Vern Martin, December 2, 1997:
http://trident.mcs.kent.edu/~vmartin/proj/proj.html

A presentation on mark-and-compact:
http://cselab.snu.ac.kr/project/ipctv/javagc/gcoverview/sld010.htm

Databases From Relational to Object Oriented Systems by Claude
Delobel (1991)

