16.Persistence - ics-software


Approaches to Persistence in Java
Philip Johnson
Collaborative Software Development Laboratory
Information and Computer Sciences
University of Hawaii
Honolulu HI 96822
(1)
Part 1: Small Scale Persistence
Small scale persistence:
• Dozens to hundreds of users
• Single machine
• All persistent objects can fit in memory at once
• No transaction, rollback, fail-over
• Cheap and fast to implement
Large scale persistence:
• Thousands to millions of users
• Clusters of machines, shared caching
• Lazy/incremental loading of objects
• Transaction, rollback, fail-over required
• Costly and time-consuming to implement
(2)
Common motivations
Persistence across application restarts/failure
• Avoid data loss
Checkpoints
• Allow rollback to previous state
Transfer of data between multiple applications
• Synchronous vs. asynchronous
-Example: network vs. file system
• Application-specific vs. portable
-Example: Serialized object vs. XML
Caching of intermediate results
• Avoid computation loss
(3)
Some flavors of persistence
Simple, key-value:
• java.util.Properties
• java.util.prefs package
• JNDI
Object-based:
• java.io.Serializable
• JavaBean persistence
• Java Data Objects (JDO)
XML-based:
• JDOM
• JAXB
Database:
• Relational
• Object-oriented
Enterprise JavaBeans:
• Entity & Session beans
• CMP & BMP (container-managed and bean-managed persistence)
(4)
Persistence creates development issues
Persistence tends to slow down development.
• Adds cost & risk to major design changes.
• Tends to “lock in” early (bad) design decisions.
Why is persistence a problem?
• The Object-Relational Impedance Mismatch
• Multiple design issues and constraints
How can we maintain development velocity in the face of the need for persistence?
• A “Late-binding persistence” development strategy
(5)
Object-Relational Impedance Mismatch
Object paradigm:
•Networks of objects with state and behavior
•Processing via: traversal
•Classes, inheritance, polymorphism, etc.
Relational paradigm:
•Tables of entities with only data.
•Processing via: selection/joining of rows
•Tables, columns, keys, indices, etc.
The intrinsic differences between the paradigms create design problems.
(6)
Example
Consider a family tree.
Consider the query “Return all of the grandchildren of Family Member X”.
Which representation would make this query easiest to implement?
•A network (tree) of family members
•A set of database tables and SQL statements
(7)
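To make the comparison concrete, here is a minimal sketch of the object-network side of the family tree question. The `Person` class and its `children` field are hypothetical names chosen for illustration; the SQL in the comment assumes an equally hypothetical `person(id, name, parent_id)` table.

```java
import java.util.ArrayList;
import java.util.List;

public class FamilyTree {
    /** A family member that holds direct references to its children (hypothetical class). */
    static final class Person {
        final String name;
        final List<Person> children = new ArrayList<>();

        Person(String name) {
            this.name = name;
        }

        Person addChild(Person child) {
            children.add(child);
            return this; // allows chained construction of the tree
        }
    }

    /**
     * Object paradigm: the query is a two-level traversal.
     * Relational paradigm (assumed table person(id, name, parent_id)) needs a self-join:
     *   SELECT gc.name FROM person c
     *   JOIN person gc ON gc.parent_id = c.id
     *   WHERE c.parent_id = ?
     */
    static List<String> grandchildrenOf(Person x) {
        List<String> names = new ArrayList<>();
        for (Person child : x.children) {
            for (Person grandchild : child.children) {
                names.add(grandchild.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Person x = new Person("X")
            .addChild(new Person("A").addChild(new Person("A1")).addChild(new Person("A2")))
            .addChild(new Person("B").addChild(new Person("B1")));
        System.out.println(grandchildrenOf(x)); // prints [A1, A2, B1]
    }
}
```

Neither form is hard here, but the traversal version needs no keys or joins, while the SQL version works on data that does not fit in memory. That trade-off is the mismatch in miniature.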
Addressing the OO-DB Impedance Mismatch
1. Eliminate the OO:
•User interface manipulates SQL.
•Pros: single paradigm, simplicity
•Cons: complexity of “advanced” processing (stored procedures, etc.)
2. Eliminate the relational DB:
•OODBs, serialized objects, etc.
•Pros: single paradigm, simplicity
•Cons: potential loss of relational data integrity (normalization)
(8)
Addressing the OO-DB Impedance Mismatch
3. Hide the DB:
•Object-to-relational mappings, JDO, EJB...
•Pros: Allows use of back-end RDBMSs
•Cons: Complexity, lock-in, overhead
4. Stop whining and just deal with it:
•Manual mapping between objects and tables
•Pros: Flexibility
•Cons: Maintenance and complexity
(9)
Choice of persistence depends upon many design issues
Simplicity:
• How complicated to set up for me? For my users?
Financial cost:
• Do I have to pay for it? Do my users?
Data specificity:
• What kinds of data am I saving? Can I use a “special purpose” persistence mechanism?
Design lock-in:
• How much code will I have to change if I need to change my mechanism?
Longevity:
• How long must the persistent data exist?
Scalability:
• What usage level do I expect over the next six months?
Integrity:
• Do I require transactions? Rollback? Fail-over?
OO-Relational impedance mismatch:
• Do I mind the cost?
Optimization:
• Do I need something faster than a relational database?
(10)
One development approach: Late-binding persistence
Initial development: No persistence.
• Deploy initial versions to users on a “trial basis” with no persistence guarantees.
Early “live” releases: simple, “non-scalable”.
• Enable data migration.
• Determine true bottlenecks/integrity issues.
• Maintain application evolvability.
Ongoing development:
• Think about multiple persistence approaches.
• Example: Preferences + XML + RDBMS
• Each approach optimized to its persistence requirements.
Applicability of this approach depends upon the nature of the system/requirements!
(11)
A bird's-eye view of selected persistence mechanisms
(12)
Preference and configuration data
java.util.Properties:
• Well known, easy to use
• No standards as to where data should reside
• Problems for backup, or transfer to other machines.
JNDI (Java Naming and Directory Interface):
• Back-end neutrality
• Large, complicated to set up
java.util.prefs (JDK 1.4):
• Back-end neutrality of JNDI
• Simplicity of java.util.Properties
• Can be invoked by multiple threads safely
(13)
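As an illustration of the java.util.Properties bullets above, the following is a small sketch of a save/load round trip. The key names are made up for the example, and an in-memory `StringWriter` stands in for a file so the sketch stays self-contained. A real application would have to pick a file location itself, which is exactly the "no standards as to where data should reside" problem.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesDemo {
    /** Writes the settings in the standard key=value text format. */
    static String save(Properties props) {
        try {
            StringWriter out = new StringWriter();
            props.store(out, "application settings"); // second argument is a header comment
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads settings back from the key=value text format. */
    static Properties load(String text) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(text));
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("window.width", "800"); // hypothetical keys
        props.setProperty("user.name", "guest");

        Properties restored = load(save(props));
        System.out.println(restored.getProperty("window.width"));         // prints 800
        System.out.println(restored.getProperty("missing.key", "none"));  // prints none
    }
}
```

All values are strings, so anything richer than flat configuration data needs its own encoding on top.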
Object-based persistence: java.io.Serializable
Pros:
• Converts an object (and all internal objects) into a stream of bytes that can be later deserialized into a copy of the original object (and all internal objects).
• Fast, simple, compact representation of an object graph.
• May be great choice for *temporary* storage of data.
Cons:
• Creates long-term maintenance issues
• Harder to evolve objects and maintain backward compatibility with serialized representation.
• See “Effective Java”, Chapter 10, for a good description of issues with Serialization
(14)
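The round trip described under "Pros" can be sketched as follows. The `Team` class is a hypothetical example type, and byte arrays are used instead of files to keep the sketch self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SerializationDemo {
    /** Hypothetical object graph: a team that owns a list of member names. */
    static final class Team implements Serializable {
        // Declaring the version explicitly is one small defense against the
        // compatibility problems listed under "Cons".
        private static final long serialVersionUID = 1L;
        final String name;
        final List<String> members = new ArrayList<>();

        Team(String name) {
            this.name = name;
        }
    }

    /** Converts the object and everything it references into a byte stream. */
    static byte[] toBytes(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds a copy of the original object graph from the bytes. */
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Team team = new Team("CSDL");
        team.members.add("alice");
        team.members.add("bob");

        Team copy = (Team) fromBytes(toBytes(team));
        System.out.println(copy.name + " " + copy.members); // prints CSDL [alice, bob]
        System.out.println(copy != team);                    // prints true: a copy, not the same object
    }
}
```

The stored bytes are tied to the class definition, so renaming a field or changing its type later can make old data unreadable.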
XML file-based persistence: JDOM and JAXB
Pros:
• Very high level of data portability.
• Simple
Cons:
• Space-inefficient
• Complex graph structures problematic.
For data structures:
• JDOM
For bi-directional object mappings:
• JAXB
(15)
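JDOM and JAXB are separate libraries, so the sketch below swaps in the DOM and transformer APIs that ship with the JDK (`javax.xml.parsers`, `javax.xml.transform`) to show the general idea of writing a data structure to XML and reading it back. It is not JDOM or JAXB code, and the `user`/`email` element names are assumptions made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class XmlDemo {
    /** Builds a small XML document for one user and returns it as text. */
    static String toXml(String name, String email) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element user = doc.createElement("user");
            user.setAttribute("name", name);
            Element mail = doc.createElement("email");
            mail.setTextContent(email);
            user.appendChild(mail);
            doc.appendChild(user);

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the XML text and returns {name, email}. */
    static String[] fromXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element user = doc.getDocumentElement();
            String email = user.getElementsByTagName("email").item(0).getTextContent();
            return new String[] { user.getAttribute("name"), email };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = toXml("ann", "ann@example.org");
        System.out.println(xml);
        String[] restored = fromXml(xml);
        System.out.println(restored[0] + " / " + restored[1]);
    }
}
```

The result is plain text that any language can read, which is the portability advantage. A tree of elements also shows why graphs with shared or cyclic references are awkward to represent.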
Java-based RDBMS
Most important one is Derby.
•http://db.apache.org/derby
•Bundled with JDK 6 as “Java DB” (removed again in JDK 9).
Can run as either ‘embedded’ in your application JVM or as a stand-alone network server.
•To embed, just add derby.jar to your classpath!
(16)
Open Source Java Persistence Frameworks
Hibernate (www.hibernate.org):
•Object to RDBMS binding
•“Hibernate Query Language”
•Claims to be very fast, very scalable, very efficient.
•Most popular open source framework for object/relational mapping in Java.
(17)
Others
Enterprise JavaBeans
•Public standard framework
•Simple reference implementation
•Support for clustering, fail-over, etc. in distributed applications.
Firestorm/DAO
•Automatically generates Java source code for accessing relational databases.
(18)
Things to think about
Sometimes simple is better
•Try the least complicated persistence mechanism first.
Sometimes you can mix and match
•Not all data must necessarily be persisted the same way
You can evolve your solution over time
•Especially if you design your system to encapsulate your persistence mechanism.
(19)
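One way to encapsulate the persistence mechanism, in the spirit of the late-binding strategy, is to hide it behind a small interface. This is only a sketch: `NoteStore`, `InMemoryNoteStore`, and `NoteService` are hypothetical names, and the in-memory implementation stands for the "no persistence" first release.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class LateBindingDemo {
    /** The only persistence API the rest of the application sees (hypothetical). */
    interface NoteStore {
        void save(String id, String text);
        Optional<String> find(String id);
    }

    /** Initial release: data lives in memory only, with no persistence guarantees. */
    static final class InMemoryNoteStore implements NoteStore {
        private final Map<String, String> notes = new LinkedHashMap<>();

        @Override
        public void save(String id, String text) {
            notes.put(id, text);
        }

        @Override
        public Optional<String> find(String id) {
            return Optional.ofNullable(notes.get(id));
        }
    }

    /** Application logic depends on the interface, never on a concrete mechanism. */
    static final class NoteService {
        private final NoteStore store;

        NoteService(NoteStore store) {
            this.store = store;
        }

        void write(String id, String text) {
            store.save(id, text.trim());
        }

        String read(String id) {
            return store.find(id).orElse("(no such note)");
        }
    }

    public static void main(String[] args) {
        // Swapping in an XML- or database-backed NoteStore later changes only this line.
        NoteService service = new NoteService(new InMemoryNoteStore());
        service.write("todo", "  pick a persistence mechanism  ");
        System.out.println(service.read("todo"));    // prints pick a persistence mechanism
        System.out.println(service.read("missing")); // prints (no such note)
    }
}
```

A later release could add a file-backed or database-backed implementation of the same interface without touching `NoteService`, which keeps design lock-in low.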
Things to think about
Your persistence strategy might depend on context:
Java First:
• You’ve developed/inherited some Java code and need someplace to store the objects.
-Hibernate
Database First:
• You’ve developed/inherited a database and want to access it in Java.
-Firestorm/DAO
Spaghetti Junction:
• You’ve inherited Java code and a DB and want to put the two together.
-Uh oh.
(20)
Things to think about
IF: Client-side, single thread, simple structure, installation simplicity
•DB optional, consider XML.
IF: Multiple clients need access to data
•DB highly recommended
IF: Transaction support, fail-over, etc:
•DB required
(21)