Serialization
Flatten your object for automated storage or
network transfer
Spring/2002
Distributed Software Engineering
C:\unocourses\4350\slides\DefiningThreads
Software object persistence
• Persistence: saving information about an object so that
it can be recreated at a different time, in a different place, or both.
• Object serialization is a means of implementing
persistence: it converts an object’s state into a byte stream
that can later be used to reconstruct (deserialize)
a virtually identical copy of the original object.
• Default serialization for an object writes:
– the class of the object,
– the class signature,
– values of all non-transient and non-static fields.
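The default form can be observed with an in-memory round trip. This is a minimal sketch, not a definitive implementation: the Account class, its fields, and the roundTrip helper are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    static int instances = 0;       // static: belongs to the class, not written to the stream
    String owner;                   // non-transient, non-static: written by default
    transient String sessionToken;  // transient: skipped by default serialization

    Account(String owner, String sessionToken) {
        this.owner = owner;
        this.sessionToken = sessionToken;
        instances++;
    }
}

class DefaultFormDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = (Account) roundTrip(new Account("alice", "secret"));
        System.out.println("owner = " + copy.owner);
        System.out.println("sessionToken = " + copy.sessionToken);
    }
}
```

The copy is a distinct object whose owner field matches the original, while the transient sessionToken comes back as null.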
Serialization protocol
• For serialization:
– java.io.ObjectOutputStream via
writeObject, which by default does the work of defaultWriteObject.
• For deserialization:
– java.io.ObjectInputStream via
readObject, which by default does the work of defaultReadObject.
• Any object instance that belongs to the
graph of the object being serialized must be
serializable as well.
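• A superclass need not be Serializable, but a non-serializable
superclass must have an accessible no-argument constructor,
and its fields are not written to the stream.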
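The requirement on the object graph can be seen by trying to write an object that references a non-serializable one. A minimal sketch; Engine, Car, and tryWrite are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class that does NOT implement Serializable.
class Engine { }

// Serializable itself, but its graph reaches a non-serializable Engine.
class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    Engine engine = new Engine();
}

class GraphCheckDemo {
    // Returns the exception message (the offending class name),
    // or null if the object was written successfully.
    static String tryWrite(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Car failed on: " + tryWrite(new Car()));
        System.out.println("String failed on: " + tryWrite("hello"));
    }
}
```

Marking the engine field transient, or making Engine implement Serializable, would let the write succeed.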
Serialization protocol
• Customize the default: implement extended
versions of the default methods in:
– writeObject
– readObject
– But final fields cannot be assigned in a custom
readObject; defaultReadObject must restore them.
• Create own complete serialization by
implementing the interface Externalizable.
Specifying persistent objects
• The class of an object to be serialized must
implement the interface:
java.io.Serializable
• This is an empty (marker) interface; it is
used to mark the objects of the class as
persistent.
Deserialization
• It reads values written during serialization
• Static fields in the class are left untouched.
– If the class needs to be loaded, then normal initialization of
the class takes place, giving static fields their initial
values.
• Transient fields will be initialized to default values
• Recreation of the object graph will occur in
reverse order from its serialization.
Example
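import java.io.Serializable;
import java.util.Calendar;
import java.util.Date;

// A serializable class that records the moment it was created.
public class PersistentTime implements Serializable {
    // Non-transient, non-static field: saved by default serialization.
    private Date time;

    public PersistentTime() {
        time = Calendar.getInstance().getTime();
    }

    public Date getTime() {
        return time;
    }
}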
Class java.io.ObjectOutputStream
• An ObjectOutputStream instance writes primitive
data types and graphs of Java objects to an
OutputStream. The objects can be read (reconstituted)
using an ObjectInputStream. Persistent storage of
objects can be accomplished by using a file for the stream.
If the stream is a network socket stream, the objects can be
reconstituted on another host or in another process.
• Only objects that support the java.io.Serializable
interface can be written to streams. The class of each
serializable object is encoded including the class name and
signature of the class, the values of the object's fields and
arrays, and the closure of any other objects referenced
from the initial objects.
Class java.io.ObjectOutputStream
• The method writeObject is used to write an object to
the stream. Any object, including Strings and arrays, is
written with writeObject. Multiple objects or
primitives can be written to the stream. The objects must
be read back from the corresponding
ObjectInputStream with the same types and in the
same order as they were written.
• Primitive data types can also be written to the stream using
the appropriate methods from DataOutput. Strings can
also be written using the writeUTF method.
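The ordering rule above can be sketched as follows: whatever is written must be read back with the matching method, in the same sequence. The class and method names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class OrderedStreamDemo {
    // Writes a mix of primitives and objects to an in-memory stream.
    static byte[] write() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(42);                      // primitive, via DataOutput
            out.writeUTF("hello");                 // String, via writeUTF
            out.writeObject(new int[] {1, 2, 3});  // array, via writeObject
            out.writeObject("world");              // String, via writeObject
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back with the same types and in the same order.
    static String read(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            int number = in.readInt();
            String utf = in.readUTF();
            int[] array = (int[]) in.readObject();
            String text = (String) in.readObject();
            return number + " " + utf + " " + array.length + " " + text;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(write()));
    }
}
```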
Example
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Serializes ("flattens") a PersistentTime instance to a file.
public class FlattenTime {
    public static void main(String[] args) {
        // Default file name; can be overridden on the command line.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time = new PersistentTime();
        // try-with-resources closes the streams even if writeObject fails.
        try (FileOutputStream fos = new FileOutputStream(filename);
             ObjectOutputStream out = new ObjectOutputStream(fos)) {
            out.writeObject(time);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.Calendar;

// Deserializes ("inflates") the PersistentTime written by FlattenTime.
public class InflateTime {
    public static void main(String[] args) {
        // Default file name; can be overridden on the command line.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time = null;
        // try-with-resources closes the streams even if readObject fails.
        try (FileInputStream fis = new FileInputStream(filename);
             ObjectInputStream in = new ObjectInputStream(fis)) {
            time = (PersistentTime) in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            ex.printStackTrace();
        }
        // time is still null if deserialization failed.
        if (time != null) {
            System.out.println("Flattened time: " + time.getTime());
        }
        System.out.println("Current time: " + Calendar.getInstance().getTime());
    }
}
Serializable vs. Non-Serializable objects
• java.lang.Object does not implement Serializable,
so you must decide which of your classes need to
implement it.
• AWT and Swing components, strings, and arrays are
serializable.
• Certain classes and their subclasses are not serializable:
Thread, OutputStream, Socket
• When a serializable class contains instance
variables that are not, or should not be,
serializable, they should be marked with the
keyword transient.
Transient fields
• These fields will not be serialized.
• When deserialized, these fields will be initialized
to default values
– Null for object references
– Zero for numeric primitives
– False for boolean fields
• If these values are unacceptable
– Provide a readObject() that invokes
defaultReadObject() and then restores transient fields to
their acceptable values.
– Or, the fields can be initialized when used for the first
time. (Lazy initialization.)
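The first option above can be sketched as follows. This is an illustration under assumed names: CachedSquare and its fields are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class with a derived value that is not worth serializing.
class CachedSquare implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int side;
    private transient int area;  // would be 0 after plain default deserialization

    CachedSquare(int side) {
        this.side = side;
        this.area = side * side;
    }

    int area() {
        return area;
    }

    // Invoked by ObjectInputStream during deserialization.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();  // restores the non-transient field 'side'
        area = side * side;      // restores the transient field to a valid value
    }
}

class TransientDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CachedSquare copy = (CachedSquare) roundTrip(new CachedSquare(5));
        System.out.println("area = " + copy.area());
    }
}
```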
Serial version UID
• You should explicitly declare a serial version UID
in every serializable class.
– Eliminates serial version UID as a potential source of
incompatibility.
– Small performance benefit, as Java does not have to
come up with this unique number.
– private static final long serialVersionUID = rlv;
– rlv can be any long value out of thin air; it is compared
only between versions of the same class, so it need not be
unique across your classes.
– If you want to make a new version of the class
incompatible with the existing version, choose a different
UID. Deserializing instances of the previous version will
then fail with InvalidClassException.
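A declared UID can be inspected through java.io.ObjectStreamClass. A minimal sketch; the Invoice class and the value 20020101L are arbitrary choices for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with an explicitly declared serial version UID.
class Invoice implements Serializable {
    private static final long serialVersionUID = 20020101L;
    String number;
}

class SerialVersionDemo {
    // Returns the UID the serialization runtime uses for a serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Invoice UID = " + uidOf(Invoice.class));
    }
}
```

When the field is declared, the runtime reports that value as is; when it is missing, the runtime computes one from the class's structure, which is why otherwise compatible edits to a class can break deserialization of old data.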
Customizing ObjectOutputStream,
ObjectInputStream
• To provide special behavior in the writing
or reading of stream object bytes, implement:
private void writeObject(ObjectOutputStream out)
throws IOException;
private void readObject(ObjectInputStream in) throws
IOException, ClassNotFoundException;
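A pair of these methods might look like the sketch below. The Credentials class is hypothetical, and the Base64 step is only a stand-in for whatever special encoding a class needs; it is not encryption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical class that writes one field in a custom encoding.
class Credentials implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String user;
    private transient String password;  // excluded from the default form

    Credentials(String user, String password) {
        this.user = user;
        this.password = password;
    }

    String user() { return user; }
    String password() { return password; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();  // writes 'user'
        byte[] raw = password.getBytes(StandardCharsets.UTF_8);
        out.writeUTF(Base64.getEncoder().encodeToString(raw));  // custom part
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();    // reads 'user'
        byte[] raw = Base64.getDecoder().decode(in.readUTF());  // custom part
        password = new String(raw, StandardCharsets.UTF_8);
    }
}

class CustomFormDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The read side must consume exactly what the write side produced, in the same order.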
Creating your own protocol: Externalizable
• Instead of implementing the Serializable
interface, implement Externalizable:
public interface Externalizable extends Serializable {
public void writeExternal(ObjectOutput out)
throws IOException;
public void readExternal(ObjectInput in) throws
IOException, ClassNotFoundException;
}
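A class taking full control of its format might look like this sketch. Temperature is a hypothetical class; note the public no-argument constructor, which the runtime calls before readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class that defines its own complete serialized form.
class Temperature implements Externalizable {
    private double celsius;

    // Externalizable classes need a public no-argument constructor.
    public Temperature() { }

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    double celsius() {
        return celsius;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeDouble(celsius);  // the class decides exactly what is written
    }

    @Override
    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        celsius = in.readDouble(); // and reads it back in the same order
    }
}

class ExternalizableDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```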
Performance
• Serialization is a very expensive process.
You must have clear reasons to serialize
instead of directly writing what you
need to save about the state of an object.
Default or Customized serialization? Or
Implementing Serializable judiciously
• Making a class’s instances serializable can look
as simple as adding the words “implements
Serializable” to the class declaration.
• That it takes so little effort is a common
misconception; the truth is far more complex.
• While efficiency is one cost associated with it,
there are other long-term costs that are much more
substantial.
• Using default serialization is very easy, but this
ease is specious.
Serialization Costs
• Your object’s private structure is out for the
viewing!!!! It’s become part of the API.
• A major cost is that it decreases the flexibility to
change a class’s implementation once the class has
been released.
• Increases the likelihood of bugs and security
holes.
• Increases the testing associated with releasing a
new version of the class.
Serialization caveats
• Implementing Serializable is not a decision to be
undertaken lightly.
• Classes designed for inheritance should rarely
implement Serializable, and interfaces should rarely
extend it.
– You should provide a parameterless constructor on
non-serializable classes designed for inheritance, in case
the class is subclassed and the subclass wants to provide
serialization.
• Inner classes should rarely, if ever, implement
Serializable.
• A static member class can be serializable.
Consider using a custom serialized form
• The default serialized form of an object is an
encoding of the physical representation of the
object graph rooted at the object
– Data contained in the object
– Data contained in every object reachable from it.
– Topology by which all of these objects are interlinked.
• The ideal serialized form contains only the logical
data represented by the object. It is independent of
its physical representation.
Consider using a custom serialized form
• Default serialization is likely to be
appropriate if an object’s physical
representation is identical to its logical
content.
– Appropriate: A Name class.
– Not appropriate: A doubly linked List class.
Consider using a custom serialized form
• Disadvantages of default serialization when
physical and logical representation differ:
– Permanently ties the exported API to the
internal representation.
– Can consume excessive space.
– Can consume excessive time.
– Can cause stack overflow.
Consider using a custom serialized form
• A reasonable serialized form for a List is the
number of entries followed by each of the entries.
• Although the default serialized form is still correct in the
List case, it may not be correct for an object
whose invariants are tied to implementation-specific details.
– Example: a hash table using buckets. The bucket is chosen from
the hash code of the key, which may change from JVM
to JVM, or from run to run on the same
JVM. Thus the default serialized form can violate the
invariants of the hash table in this case.
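The "number of entries followed by each of the entries" form for a list can be sketched as below. StringList and its members are hypothetical names; the linked structure is marked transient so that only the logical content reaches the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singly linked list with a custom serialized form.
final class StringList implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient int size;    // physical representation is not serialized
    private transient Entry head;
    private transient Entry tail;

    private static final class Entry {  // deliberately not Serializable
        final String data;
        Entry next;
        Entry(String data) { this.data = data; }
    }

    void add(String s) {
        Entry e = new Entry(s);
        if (head == null) { head = e; } else { tail.next = e; }
        tail = e;
        size++;
    }

    String joined() {
        StringBuilder sb = new StringBuilder();
        for (Entry e = head; e != null; e = e.next) {
            if (e != head) sb.append(',');
            sb.append(e.data);
        }
        return sb.toString();
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();   // no default fields, kept for future versions
        out.writeInt(size);         // number of entries ...
        for (Entry e = head; e != null; e = e.next) {
            out.writeObject(e.data);  // ... followed by each entry
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            add((String) in.readObject());  // rebuilds the links locally
        }
    }
}

class ListFormDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because entries are written in a loop rather than by following next references recursively, a long list does not risk the stack overflow mentioned earlier.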
readObject() and security attacks
• Deserialization uses defaultReadObject()
and readObject() to create a new instance of a
class.
• Thus readObject is a constructor!!!!!
• So, readObject must behave like any other
constructor:
– Check the validity of its arguments where needed
– Make defensive copies of parameters where needed
• Otherwise, it is a very simple job for an attacker to
violate the object’s invariants.
– The attacker provides a hand-made serialized stream for the attack object.
Guide for writing a bulletproof readObject
• Private reference fields should be initialized with
defensive copies of their values.
• Check invariants and throw an
InvalidObjectException if they fail.
• As with constructors, do not invoke any
overridable methods.
• If an entire object graph must be checked for validity
after deserialization, the
ObjectInputValidation interface should
be used.
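The first two guidelines can be sketched with a hypothetical Period class whose invariant is that the start does not come after the end. The class and field names are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical immutable time period; invariant: start is not after end.
final class Period implements Serializable {
    private static final long serialVersionUID = 1L;
    private Date start;  // not final, so readObject can assign copies
    private Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());  // defensive copies
        this.end = new Date(end.getTime());
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException("start after end");
        }
    }

    Date start() { return new Date(start.getTime()); }
    Date end() { return new Date(end.getTime()); }

    // Treated like a constructor: copy first, then validate.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        start = new Date(start.getTime());  // drop references the stream may share
        end = new Date(end.getTime());
        if (start.after(end)) {
            throw new InvalidObjectException("start after end");
        }
    }
}

class PeriodDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The class is final and readObject calls no overridable methods, in line with the third guideline.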
writeReplace()
• Sometimes it may not be appropriate to serialize
the actual object, but rather a designated
replacement object.
<access> Object writeReplace()
throws ObjectStreamException;
Returns an object that will replace
the current object during
serialization. Any object may be
returned including the current one.
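A sketch of writeReplace in use, under assumed names: LiveConnection stands for an object holding a live resource, and ConnectionInfo is the plain descriptor written in its place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical descriptor that is safe to serialize.
class ConnectionInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String host;
    final int port;

    ConnectionInfo(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

// Hypothetical object representing a live connection.
class LiveConnection implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String host;
    private final int port;

    LiveConnection(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Called by ObjectOutputStream: the returned object is written instead.
    private Object writeReplace() throws ObjectStreamException {
        return new ConnectionInfo(host, port);
    }
}

class WriteReplaceDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Reading the stream back yields a ConnectionInfo, not a LiveConnection, because the replacement is what was actually written.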
A comment about access qualifier
• These methods (writeReplace, readResolve) can be of any accessibility.
• They will be used if they are accessible to the
type of the object being serialized:
– A private readResolve affects only the
deserialization of objects that are exactly of its class.
– A package-accessible readResolve affects only
subclasses within the same package.
– A public or protected readResolve affects objects of all
subclasses.
readResolve()
• Recall that deserialization produces a new instance of
a class.
• If a given class should have only one instance
(singleton pattern), then deserialization can
produce a second, different instance!!!
• In general you need to be concerned about what is
being created for instance-controlled classes.
• Enter readResolve(): this method
returns the appropriate instance of the class, replacing
the one created by the readObject() or defaultReadObject()
methods.
<access> Object readResolve() throws ObjectStreamException;
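For the singleton case, a minimal sketch could look like this. Registry is a hypothetical class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical singleton that stays a singleton across deserialization.
final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();

    private Registry() { }

    // Called after the stream has created a new instance; that instance
    // is discarded and the one true instance is returned instead.
    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }
}

class ReadResolveDemo {
    // Serializes an object to a byte array and deserializes a copy from it.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object restored = roundTrip(Registry.INSTANCE);
        System.out.println("same instance: " + (restored == Registry.INSTANCE));
    }
}
```

Without readResolve, the round trip would return a second Registry object and the identity comparison would be false.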