Object Serialization in Java

Or: The Persistence of Memory…
Originally from:
http://www.cs.unm.edu/~terran/
So you want to save your data…

Common problem:
  - You’ve built a large, complex object
Want to store on disk and retrieve later
  - Or: want to send over network to another Java process
In general: want your objects to be persistent, i.e. to outlive the current Java process


Spam/Normal statistics tables
Game state
Database of student records
Etc…
Answer I: customized file formats

Write a set of methods for saving/loading each
instance of a class that you care about
public class MyClass {
    public void saveYourself(Writer o)
            throws IOException { ... }
    public static MyClass loadYourself(Reader r)
            throws IOException { ... }
}
Coolnesses of Approach 1:




Can produce arbitrary file formats
Know exactly what you want to store and get back; don’t store extraneous stuff
Can build file formats to interface with other codes/programs
  - XML
  - Pure text file
  - Etc.
If your classes are nicely hierarchical, makes saving/loading simple
  - What will happen with inheritance?
Make Things Saveable/Loadable
public interface Saveable {
    public void saveYourself(Writer w)
            throws IOException;
    // should also have this
    //   public static Object loadYourself(Reader r)
    //           throws IOException;
    // but a Java interface can’t require implementing
    // classes to provide a static method
}
Saving, cont’d
public class MyClassA implements Saveable {
    public MyClassA(int arg) {
        // initialize private data members of A
    }
    public void saveYourself(Writer w)
            throws IOException {
        // write MyClassA identifier and private data on
        // stream w
    }
    public static MyClassA loadYourself(Reader r)
            throws IOException {
        // parse MyClassA from the data stream r
        int data = ...; // the value parsed from r
        MyClassA tmp=new MyClassA(data);
        return tmp;
    }
}
Saving, cont’d
public class MyClassB implements Saveable {
    private MyClassA _stuff;
    public MyClassB(MyClassA stuff) { _stuff=stuff; }
    public void saveYourself(Writer w)
            throws IOException {
        // write ID for MyClassB
        _stuff.saveYourself(w);
        // write other private data for MyClassB
        w.flush();
    }
    public static MyClassB loadYourself(Reader r)
            throws IOException {
        // parse MyClassB ID from r
        MyClassA tmp=MyClassA.loadYourself(r);
        // parse other private data for MyClassB
        return new MyClassB(tmp);
    }
}
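A minimal runnable sketch of Approach 1, assuming a whitespace-separated text format. The classes Point, Segment and TextFormat are hypothetical and are not from the original slides. The outer class writes its own ID and then delegates to its members, and loading mirrors that order.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper for a whitespace-separated text format.
class TextFormat {
    // Reads one token and consumes only the single delimiter after it,
    // so nested loadYourself calls can safely share one Reader.
    static String nextToken(Reader r) throws IOException {
        int c = r.read();
        while (c != -1 && Character.isWhitespace(c)) c = r.read();
        StringBuilder sb = new StringBuilder();
        while (c != -1 && !Character.isWhitespace(c)) {
            sb.append((char) c);
            c = r.read();
        }
        if (sb.length() == 0) throw new EOFException("unexpected end of input");
        return sb.toString();
    }

    static void expect(Reader r, String id) throws IOException {
        String token = nextToken(r);
        if (!token.equals(id)) throw new IOException("expected " + id + ", got " + token);
    }
}

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    void saveYourself(Writer w) throws IOException {
        w.write("Point " + x + " " + y + "\n");       // class ID, then private data
    }

    static Point loadYourself(Reader r) throws IOException {
        TextFormat.expect(r, "Point");
        int x = Integer.parseInt(TextFormat.nextToken(r));
        int y = Integer.parseInt(TextFormat.nextToken(r));
        return new Point(x, y);
    }
}

class Segment {
    final Point from, to;
    Segment(Point from, Point to) { this.from = from; this.to = to; }

    void saveYourself(Writer w) throws IOException {
        w.write("Segment\n");                         // class ID
        from.saveYourself(w);                         // members save themselves
        to.saveYourself(w);
        w.flush();
    }

    static Segment loadYourself(Reader r) throws IOException {
        TextFormat.expect(r, "Segment");
        Point from = Point.loadYourself(r);           // same order as saving
        Point to = Point.loadYourself(r);
        return new Segment(from, to);
    }
}

class SaveLoadDemo {
    public static void main(String[] args) throws IOException {
        StringWriter w = new StringWriter();
        new Segment(new Point(1, -2), new Point(3, 4)).saveYourself(w);
        Segment back = Segment.loadYourself(new StringReader(w.toString()));
        System.out.println(back.from.x + "," + back.from.y + " -> " + back.to.x + "," + back.to.y);
    }
}
```

Every class needs its own pair of methods, and the save and load orders must be kept in sync by hand.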
Painfulnesses of Approach 1:




This is called recursive descent parsing
Actually, there are plenty of places in the real world where it’s terribly useful.
But... it’s also a pain (why?)
  - If all you want to do is store/retrieve data, do you really need to go to all of that effort?
Fortunately, no. Java provides a shortcut that takes a lot of the work out.
Approach 2: Enter Serialization...



Java provides the serialization mechanism for
object persistence
It essentially automates the grunt work for you
Short form:
public class MyClassA implements Serializable { ... }
// in some other code elsewhere...
MyClassA tmp=new MyClassA(arg);
FileOutputStream fos=new FileOutputStream("some.obj");
ObjectOutputStream out=new ObjectOutputStream(fos);
out.writeObject(tmp);
out.flush();
out.close();
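The short form above shows only the writing half. A sketch of the full round trip follows, assuming a hypothetical Note class and a temporary file; reading uses ObjectInputStream.readObject(), which returns Object, so the caller casts.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical serializable class used only for this sketch.
class Note implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;
    final int priority;
    Note(String text, int priority) { this.text = text; this.priority = priority; }
}

class FileRoundTrip {
    static void save(Object obj, Path file) throws IOException {
        // try-with-resources closes (and so flushes) the stream, even on failure
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(obj);
        }
    }

    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("some", ".obj");
        try {
            save(new Note("buy milk", 2), file);
            Note back = (Note) load(file);
            System.out.println(back.text + " / " + back.priority);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```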
In a bit more detail...


To be (de-)serialized, an object’s class must implement Serializable
  - All of its data members must also be serializable
  - And so on, recursively...
  - Primitive types (int, char, etc.) are all serializable automatically
  - So are Strings, most classes in java.util, etc.
This saves/retrieves the entire object graph, and preserves the uniqueness of objects: an object referenced from several places is stored once and restored as a single object
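A small sketch of what uniqueness means in practice. The Entry class here is a hypothetical stand-in: one Entry is placed in a list twice, and after a round trip through a byte array the two slots still refer to one and the same (new) object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class SharingDemo {
    // Hypothetical entry type, used only for this sketch.
    static class Entry implements Serializable {
        private static final long serialVersionUID = 1L;
        final String key;
        Entry(String key) { this.key = key; }
    }

    // Serializes to memory and reads the object straight back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    static boolean sharingSurvives() throws IOException, ClassNotFoundException {
        Entry shared = new Entry("tyromancy");
        ArrayList<Entry> list = new ArrayList<>();
        list.add(shared);
        list.add(shared);                       // same object, referenced twice
        @SuppressWarnings("unchecked")
        ArrayList<Entry> copy = (ArrayList<Entry>) roundTrip(list);
        // Both slots of the copy point at one Entry, not at two equal ones.
        return copy.get(0) == copy.get(1) && copy.get(0) != shared;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sharingSurvives());  // prints "true"
    }
}
```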
The object graph and uniqueness
[Figure: object graph with a MondoHashTable, a Vector, and Entry objects holding the keys "tyromancy" and "zygopleural"]
Now some problems…

static fields are not automatically serialized
Not possible to automatically serialize them because they’re owned by an entire class, not an object
Options:
  - final static fields are automatically initialized (once) the first time a class is loaded
  - static fields initialized in the static {} block will be initialized the first time a class is loaded
  - But what about other static fields?


When default serialization isn’t enough


Java lets a class define private writeObject() and readObject() methods to customize what is written and read
If a class provides these methods, the serialization/deserialization mechanism calls them instead of doing the default thing
writeObject() in action
public class DemoClass implements Serializable {
    private int _dat=3;
    private static int _sdat=2;
    private void writeObject(ObjectOutputStream o)
            throws IOException {
        o.writeInt(_dat);
        o.writeInt(_sdat);
    }
    private void readObject(ObjectInputStream i)
            throws IOException, ClassNotFoundException {
        _dat=i.readInt();
        _sdat=i.readInt();
    }
}
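A sketch that contrasts the two behaviours, using hypothetical classes Plain and Custom. Changing the static field between saving and loading stands in for "a different process later on". This variant calls defaultWriteObject()/defaultReadObject() first and then handles the static field by hand, which is one common way to arrange it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StaticFieldDemo {
    static class Plain implements Serializable {
        static int shared = 2;          // skipped by default serialization
        int own = 3;
    }

    static class Custom implements Serializable {
        static int shared = 2;
        int own = 3;
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();   // instance fields
            out.writeInt(shared);       // static field, written by hand
        }
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            shared = in.readInt();      // overwrites the class-wide value
        }
    }

    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns {Plain.shared, Custom.shared} after a save, a change, and a load.
    static int[] demo() throws IOException, ClassNotFoundException {
        Plain.shared = 10;
        Custom.shared = 10;
        byte[] plain = toBytes(new Plain());
        byte[] custom = toBytes(new Custom());
        Plain.shared = 99;
        Custom.shared = 99;
        fromBytes(plain);               // leaves Plain.shared untouched
        fromBytes(custom);              // restores Custom.shared from the stream
        return new int[] { Plain.shared, Custom.shared };
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1]);   // prints "99 10"
    }
}
```

Note that restoring a static field this way changes it for every instance of the class, so deserializing a single object can silently affect all the others.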
Things that you don’t want to save


Sometimes, you want to explicitly not store some non-static data
  - Computed vals that are cached simply for convenience/speed
  - Passwords or other “secret” data that shouldn’t be written to disk
Java provides the “transient” keyword.
transient foo means don’t save foo
public class MyClass implements Serializable {
    private int _primaryVal=3; // is serialized
    private transient int _cachedVal=_primaryVal*2;
    // _cachedVal is not serialized
}
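One consequence worth seeing in code: field initializers of a Serializable class are not run during deserialization, so a transient field comes back with its default value (0 for an int) unless the class rebuilds it. A sketch with two hypothetical variants of the class above:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Variant 1: nothing rebuilds the cache after loading.
    static class Plain implements Serializable {
        private int primaryVal = 3;
        private transient int cachedVal = primaryVal * 2;
        int cached() { return cachedVal; }
    }

    // Variant 2: readObject recomputes the transient field.
    static class Recomputing implements Serializable {
        private int primaryVal = 3;
        private transient int cachedVal = primaryVal * 2;
        int cached() { return cachedVal; }
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // restores primaryVal
            cachedVal = primaryVal * 2;      // rebuild the cache
        }
    }

    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(((Plain) roundTrip(new Plain())).cached());             // prints "0"
        System.out.println(((Recomputing) roundTrip(new Recomputing())).cached()); // prints "6"
    }
}
```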
Issue: #0 -- non Serializable fields


What happens if class Foo has a field of type Bar, but Bar isn’t serializable?
If you just do this:
Foo tmp=new Foo();
ObjectOutputStream out=new ObjectOutputStream(new FileOutputStream("some.obj"));
out.writeObject(tmp);
You get a NotSerializableException
Answer: mark the field transient and use read/writeObject to explicitly serialize the parts that can’t be handled otherwise
Need some way to get/set the necessary state of Bar
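A sketch of both outcomes, assuming a hypothetical Bar whose whole state is a single name string. BrokenFoo fails with NotSerializableException; Foo marks the field transient and saves Bar's state through its getter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class FieldDemo {
    static class Bar {                               // deliberately not Serializable
        private final String name;
        Bar(String name) { this.name = name; }
        String getName() { return name; }
    }

    static class BrokenFoo implements Serializable {
        private final Bar bar = new Bar("x");        // non-transient: default serialization fails
    }

    static class Foo implements Serializable {
        private transient Bar bar;                   // skipped by default serialization
        Foo(Bar bar) { this.bar = bar; }
        Bar getBar() { return bar; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeUTF(bar.getName());             // save Bar's state by hand
        }
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            bar = new Bar(in.readUTF());             // rebuild Bar from the saved state
        }
    }

    static byte[] toBytes(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    static boolean failsByDefault() throws IOException {
        try {
            toBytes(new BrokenFoo());
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(failsByDefault());                                 // prints "true"
        Foo copy = (Foo) fromBytes(toBytes(new Foo(new Bar("widget"))));
        System.out.println(copy.getBar().getName());                          // prints "widget"
    }
}
```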
Issue: #0.5 -- non-Ser. superclasses

Suppose
  - class Foo extends Bar implements Serializable
  - But Bar itself isn’t serializable
What happens?
[Figure: class hierarchy with Bar (not serializable) as the superclass of Foo (serializable)]
Non-Serializable superclasses, cont’d




Bar must provide a no-arg constructor that Foo can access
Foo must use readObject/writeObject to take care of Bar’s private data
Java helps a bit with defaultReadObject and defaultWriteObject
Order of operations (for deserialization)
  - Java creates a new Foo object
  - Java calls Bar’s no-arg constructor
  - Java calls Foo’s readObject
  - Foo’s readObject explicitly reads Bar’s state data
  - Foo reads its own data
  - Foo reads its children’s data
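A sketch that makes the order of operations visible. The LOG list and the label/value fields are assumptions added for illustration. Foo's own constructor is never called during deserialization; only Bar's no-arg constructor and Foo's readObject run. This sketch calls defaultWriteObject()/defaultReadObject() first and handles Bar's state afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class SuperclassDemo {
    static final List<String> LOG = new ArrayList<>();   // records which code ran

    static class Bar {                                   // not Serializable
        private String label;
        Bar() { LOG.add("Bar()"); }                      // the no-arg constructor Java needs
        Bar(String label) { this.label = label; }
        String getLabel() { return label; }
        void setLabel(String label) { this.label = label; }
    }

    static class Foo extends Bar implements Serializable {
        private static final long serialVersionUID = 1L;
        private int value;
        Foo(String label, int value) {
            super(label);
            this.value = value;
            LOG.add("Foo(String,int)");
        }
        int getValue() { return value; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();                    // Foo's own fields
            out.writeUTF(getLabel());                    // Bar's state, via its accessor
        }
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            LOG.add("Foo.readObject");
            in.defaultReadObject();
            setLabel(in.readUTF());
        }
    }

    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Foo original = new Foo("hello", 7);
        LOG.clear();                                     // only log the deserialization
        Foo copy = (Foo) roundTrip(original);
        System.out.println(LOG);                         // prints "[Bar(), Foo.readObject]"
        System.out.println(copy.getLabel() + " " + copy.getValue());  // prints "hello 7"
    }
}
```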
In O’Reilly Java I/O




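The superclass does not implement the Serializable interface and does not provide a no-argument constructor
java.lang.Object does not implement Serializable
  - Every class has at least one superclass that cannot be serialized
During deserialization, the no-argument constructor of the nearest ancestor class that does not implement Serializable is called (really hard to follow!) to establish the state of the object’s non-serializable superclass part (super complicated!)
PS: the above is copied verbatim from the original text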
When having a non-serializable parent


Class ZipFile does not implement Serializable, and it does not have a no-arg constructor
public class ZipFile implements java.util.zip.ZipConstants
  - public ZipFile(String filename) throws IOException
  - public ZipFile(File file) throws ZipException, IOException
What can we do?
  - Can anyone answer me?


Issue: #1 -- Efficiency





For your MondoHashTable, you can just serialize/deserialize it with the default methods
But that’s not necessarily efficient, and may even be wrong
By default, Java will store the entire internal _table, including all of its null entries!
Now you’re wasting space/time to load/save all those empty cells
Plus, the hashCode()s of the keys may not be the same after deserialization -- should explicitly rehash them to check.
  - hashCode() is defined in java.lang.Object
  - The default implementation usually uses the object’s address
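A sketch of the idea, not the actual MondoHashTable: CompactTable is a hypothetical fixed-capacity, linear-probing table with String keys and int values. Its arrays are transient, writeObject stores only the occupied cells, and readObject re-inserts every entry so each key is rehashed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CompactTable implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int CAPACITY = 64;          // fixed size keeps the sketch short

    // transient: the mostly-empty arrays are never written as they are
    private transient String[] keys = new String[CAPACITY];
    private transient int[] values = new int[CAPACITY];
    private transient int size;

    void put(String key, int value) {
        int i = slotFor(key);
        if (keys[i] == null) {
            if (size == CAPACITY - 1) throw new IllegalStateException("table full");
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    Integer get(String key) {
        int i = slotFor(key);
        if (keys[i] == null) return null;
        return values[i];
    }

    // Linear probing; at least one cell is always empty, so the loop terminates.
    private int slotFor(String key) {
        int i = Math.floorMod(key.hashCode(), CAPACITY);
        while (keys[i] != null && !keys[i].equals(key)) i = (i + 1) % CAPACITY;
        return i;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);                          // number of entries that follow
        for (int i = 0; i < CAPACITY; i++) {
            if (keys[i] != null) {                   // skip the empty cells
                out.writeUTF(keys[i]);
                out.writeInt(values[i]);
            }
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        keys = new String[CAPACITY];                 // initializers do not run on deserialization
        values = new int[CAPACITY];
        int n = in.readInt();
        for (int k = 0; k < n; k++) {
            put(in.readUTF(), in.readInt());         // rehash each entry on the way in
        }
    }
}

class CompactTableDemo {
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CompactTable t = new CompactTable();
        t.put("tyromancy", 1);
        t.put("zygopleural", 2);
        CompactTable copy = (CompactTable) roundTrip(t);
        System.out.println(copy.get("tyromancy") + " " + copy.get("zygopleural"));  // prints "1 2"
    }
}
```

Because entries are re-inserted through put(), the loaded table stays consistent even for key types whose hashCode() differs between JVM runs.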
Issue: #2 -- Backward compatibility





Suppose that you have two versions of class
Foo: Foo v. 1.0 and Foo v. 1.1
The public and protected members of 1.0 and
1.1 are the same; the semantics of both are the
same
So Foo 1.0 and 1.1 should behave the same and be interchangeable
BUT... The private fields and implementation of
1.0 and 1.1 are different
What happens if you serialize with a 1.0 object
and deserialize with a 1.1? Or vice versa?
Backward compat, cont’d.



Issue is that in ordinary code, only changes to the public or protected members matter
With serialization, all of a sudden, the private data members (and methods) count too
  - Serialization is done by the JVM, not by code in ObjectInputStream/ObjectOutputStream
  - This is a kind of privilege
Have to be very careful to not muck up internals in a way that’s inconsistent with previous versions
  - E.g., changing the meaning, but not the name, of some data field
Backward compat, cont’d

Example:
// version 1.0
public class MyClass {
    MyClass(int arg) { _dat=arg*2; }
    private int _dat;
}
// version 1.1
public class MyClass {
    MyClass(int arg) { _dat=arg*3; } // NO-NO!
    private int _dat;
}
Backward compat, cont’d:



Java helps as much as it can
Java tracks a “version number” of a class that changes when the class changes “substantially”
  - Fields changed to/from static or transient
  - Field or method names changed
  - Data types change
  - Class moves up or down in the class hierarchy
Trying to deserialize a class of a different version than the one currently in memory throws InvalidClassException
Yet more on backward compat





Java version number comes from names of all data and method members of a class
If they don’t change, the version number won’t change
If you want Java to detect that something about your class has changed, change a name
But, if all you’ve done is changed names (or refactored functionality), you want to be able to tell Java that nothing has changed
Can lie to Java about version number:
static final long serialVersionUID = 3530053329164698194L;
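To see which version number Java associates with a class, ObjectStreamClass.lookup() can be queried. A sketch with two hypothetical classes, one that pins the number and one that leaves it to be computed:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    static class Pinned implements Serializable {
        // Declared explicitly: stays the same however the class is edited.
        private static final long serialVersionUID = 3530053329164698194L;
        private int dat;
    }

    static class Computed implements Serializable {
        private int dat;   // no serialVersionUID: Java derives one from the class itself
    }

    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Pinned.class));     // prints "3530053329164698194"
        System.out.println(uidOf(Computed.class));   // a computed value
    }
}
```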
The detail list of compatibility


You have to check the following rules
  - http://java.sun.com/javase/6/docs/platform/serialization/spec/version.html
One of the key ideas is that
  - When restoring an object, new things are allowed, and old things should be kept
Issue: #3 -- The Singleton pattern

When you are restoring a Singleton object, you need to check whether there is an existing singleton object in the system
  - This is a matter of logical correctness, and you need to check and guarantee it yourself!
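One way to guarantee this, not mentioned in the slides, is the readResolve() hook of the serialization mechanism: whatever it returns replaces the freshly deserialized object. A sketch with a hypothetical Registry singleton:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();

    private Registry() { }

    // Called after the object has been read; the returned object is handed
    // to the caller instead of the newly created copy.
    private Object readResolve() {
        return INSTANCE;
    }
}

class SingletonDemo {
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(Registry.INSTANCE) == Registry.INSTANCE);  // prints "true"
    }
}
```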
Default Write/Read Object


Sometimes, we want to add some additional
information
For example
public class NetworkWindow implements Serializable {
    private Socket theSocket;
    //and many other fields and methods
}
Recover the states
public class NetworkWindow implements Serializable {
    private transient Socket theSocket;
    //and many other fields and methods
    private void writeObject(ObjectOutputStream out)
            throws IOException {
        out.defaultWriteObject();
        out.writeObject(theSocket.getInetAddress());
        out.writeInt(theSocket.getPort());
    }
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        InetAddress ia = (InetAddress) in.readObject();
        int thePort = in.readInt();
        this.theSocket = new Socket(ia, thePort);
    }
}
Preventing Serialization

Sometimes you don’t want objects of your class to be serialized, but your parent implements Serializable…
  - You can define writeObject and readObject, and throw exceptions

throw new NotSerializableException();
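A sketch of the whole pattern, with hypothetical Parent and Child classes. Child inherits Serializable from Parent but refuses to be written or read.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class PreventDemo {
    static class Parent implements Serializable {
        int id = 1;
    }

    static class Child extends Parent {              // Serializable by inheritance
        private void writeObject(ObjectOutputStream out) throws IOException {
            throw new NotSerializableException(Child.class.getName());
        }
        private void readObject(ObjectInputStream in) throws IOException {
            throw new NotSerializableException(Child.class.getName());
        }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;                            // the class refused
        } catch (IOException e) {
            throw new UncheckedIOException(e);       // unrelated I/O failure
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Parent()));  // prints "true"
        System.out.println(canSerialize(new Child()));   // prints "false"
    }
}
```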
Summary





Serialization: make the thing sequential, and so writable
Serialization is difficult and technical; you need to be aware of the whole class hierarchy that you are going to serialize
You can define your own serialization process
You can add additional information when serializing
You can prevent an instance from being serialized