How to use object serialization for bean persistence
Up to this point in the JavaWorld column on JavaBeans, we’ve discussed beans from the vantage point of how they behave within a single running Java program. The JavaBeans we’ve discussed so far only survive as long as there’s an active reference to them, and as long as the program in which they run is executing. It would, however, be very useful for a software component to be able to survive the death of the program in which it runs, to be “resurrected” and run again when that program is revitalized; or maybe we’d want the component to be able to move from machine to machine, gathering information, or performing remote services. In either case, persistence is the key.
When the last reference to a bean goes out of scope, or when the program exits, all of a bean’s “state” (the values of the bean’s fields) is lost forever, unless we’ve saved enough information about what was inside the bean to reconstruct it later. Software object persistence is nothing more than saving information about an object so that it can be recreated at a different time and/or place. Object serialization is a means of implementing persistence by converting the object’s state into a stream of bytes that can later be used to reconstruct a virtually identical copy of the original object.
In this article, we’re going to take a look at some of the benefits that a persistence mechanism provides to a software component framework. We’ll discuss the goals of the JavaBeans persistence approach, and then go over some introductory code examples of persistent JavaBeans.
A matter of simple storage
Though object persistence may seem like a new idea to you, you’re probably more familiar with it than you know. Every file on your hard drive or floppy disk can be thought of as a persistent software object of one sort or another. For example, let’s say you use a text editor to create a text file. At one point in time, there was a data structure in memory on some computer that contained the characters of your document. When you gave the editor the Save As command, you were really telling the program to persist the contents of its memory to a disk file. (Thank goodness the people who design user interfaces know better than to present information to users this way — teaching my mom to use a word processor is hard enough without having to deal with menu items like Persist Memory State!)
The next time you run the text editor and you Load a file, the program reads the information in the text file and creates a structure in memory that is identical, more or less, to what was in memory when the file was last saved. The phrase “more or less” is significant because, typically, not all of the information about a software object is saved. For example, in the text editor there may be Undo information hanging around that goes away when the editor dies. The next time you start the editor, the file contents are there, but the Undo information is off in the Big Bit Bucket in the sky.
How object persistence and serialization work
Software object persistence works in precisely the same way as our file-saving example above: A software object in an object-oriented system can be serialized, or converted into a stream of bytes, which can be used to resurrect the object at some other place and/or time. If you’re the developer who wrote the text editor discussed above, you may well have organized your program so that a document is a single object, which might be a single “monolithic” object that does everything, or an aggregation of many smaller objects that perform specialized tasks. If you ask the Document
object for its serialized state, it simply returns a (possibly very long) string, which you then squirrel away on disk. When the user asks that a file be opened, your program opens the file, reads the string, and hands it to the Document
class (or some class that knows how to create Document
objects), and, voilà! The Document
object is risen from the dead.
Now, when the program told the Document
object to serialize, the Document
might have returned a long ASCII string with embedded newlines, which, when sent directly to a printer, would be readable. (Printing could actually be considered “serializing to paper,” but only a geek would say it that way.) The Document
also might have returned a string of compressed, illegible gibberish that you’d never be able to figure out in a thousand years, but which the Document
object “understands” and can use to reconstruct that instance of the Document
. This is an important point: Objects that serialize themselves into strings also know how to read those strings to restore themselves to their original state. An object of a certain class wrote that string, and given the same string later, that class had better know how to reconstruct the instance. Otherwise, persistence simply doesn’t work.
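To make the "same class writes it, same class reads it" idea concrete, here is a minimal sketch in Java. The TextDocument class and its length-prefixed string format are invented purely for this illustration; they are not part of any JavaBeans API.

```java
// Hypothetical class: the string format below is made up for this sketch.
final class TextDocument {
    final String title;
    final String body;

    TextDocument(String title, String body) {
        this.title = title;
        this.body = body;
    }

    // Serialize as "<title length>:<title><body>".
    String serialize() {
        return title.length() + ":" + title + body;
    }

    // The class that wrote the string is the one that knows how to read it.
    static TextDocument deserialize(String data) {
        int colon = data.indexOf(':');
        int titleLength = Integer.parseInt(data.substring(0, colon));
        int titleEnd = colon + 1 + titleLength;
        return new TextDocument(data.substring(colon + 1, titleEnd),
                                data.substring(titleEnd));
    }

    public static void main(String[] args) {
        TextDocument original = new TextDocument("Notes", "line one\nline two");
        TextDocument copy = deserialize(original.serialize());
        System.out.println(copy.title + " / " + copy.body.length() + " chars");
    }
}
```

Nobody outside TextDocument needs to understand the format; any other code just stores or ships the string.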
Precisely how an object is serialized is arbitrary, at least in the general case. (Specific component technology specifications spell out the format of a serialized object, and that is anything but arbitrary.) That’s why any reasonable word processor gives you the choice of one of a dozen or more file formats. Every word processor serializes its document objects in a different way, but since they’re all documents, and documents share general traits like characters, fonts, paragraphs, leading, and so on, it’s possible for software objects (and, therefore, word processors) to read each others’ formats and interoperate. Likewise, software objects can be written to read and write each others’ serialization formats and “pretend” to be instances of one another. We’ll discuss this more in the section below entitled “Interoperation: works and plays well with others.”
Objects that have been “freeze-dried” into strings can then be transmitted, stored, and otherwise manipulated as strings. The ability to store objects as strings gives system designers a lot of flexibility. Imagine you’re designing the graphical user interface for a database application, and you’ve created a hot-looking component that manipulates the contents of a particular table or query result. You could serialize that customized component and save it in the database itself, say in a table called EDITORS, along with the name of the table. You could then organize your database application’s user interface around combining these editing components, each of which specializes in manipulating a particular set of data. In order to change, for example, the screens used to edit particular tables, you need only change the associated serialized editing component in the EDITORS table, and the end-user application would automatically use the new component for future access to that table. (This is just an example. Obviously, there’s a lot to be said for organizing your database applications around workflow instead of around the underlying data model.)
Interoperation: Works and plays well with others
So far, we’ve placed no restrictions or expectations on how objects serialize themselves. We’ve simply said that objects should be able to turn themselves into strings (serialization), and then turn strings into instances of themselves (deserialization). Component technologies, however, much like word processors, have specific formats (and rules) that make it easier to automate a lot of the details of how to perform the serialization and deserialization. These rules and formats are spelled out in the component technology specification document. (The serialization specification for JavaBeans can be found in the Resources section.) When serialization formats are standardized, software can manipulate the serialized data strings in more detail, since developers know what to expect from a well-formed string.
Standardizing serialization can also make programming easier. It’s certainly possible for a programmer to go through every class that needs persistence in an application, writing functions that say “write this, write that” and “read this, read that,” but this makes for awfully tedious work. A component technology specification provides detailed guidelines for how to perform the serialization, so much of the serialization code comes free of charge. (This frees programmers up for more interesting work, such as battling irreproducible bugs and narcoleptic operating systems.) We’ll discuss how JavaBeans handles serialization in the section on JavaBeans serialization below.
One of the coolest things about standardized object serialization is that it allows different component technologies and even different languages to share and process each other’s objects. If, for example, WordPerfect can read and write WordStar files (leaving aside the question of why it would want to), why couldn’t a JavaBean read and write a file containing a serialized OLE or OpenDoc object? (The answer is: It can, at least in theory.) If the OpenDoc serialization specification is openly available, a JavaBean could be written to read OpenDoc objects. The user could then manipulate the data in the object, and the JavaBean could write its internal data back to the file in OpenDoc object format. Later, when a “real” OpenDoc application opens the file, it would find the new state in its native format, never suspecting that a JavaBean had anything to do with it. As long as the standards remain open (and closed standards are arguably worse than no standards at all), components should be able to interoperate.
Beam me up, Scotty: Distributed systems
Years ago I spent a rainy week in Amsterdam feeling sorry for myself because I was stuck there, waiting for an insurance form to arrive in the mail. For reasons I still can’t fathom, it hadn’t occurred to me to have the document faxed to me for my signature. (Fax machines were less common then, but still…) It didn’t really matter where I was when I signed the document; it just needed to be signed and returned to its point of origin for further processing. If the fax option had occurred to me, I could have asked dear old Dad to serialize the insurance form into electronic pulses (via a fax machine), after which it would have been deserialized into a duplicate document (with another fax) in Amsterdam, signed by myself, and sent or faxed back to the U.S.
Object serialization makes something analogous to the above insurance form example possible for software objects. Sometimes the resources necessary to perform a particular task aren’t available locally. (In the case of my insurance form, the resource was my hand, which was, along with the rest of me, in The Netherlands.) Other times, it’s computationally cheaper to pack up objects and ship them out for processing on other machines, a process called load balancing. In still other situations, existing (“legacy”) systems can be wrapped in new layers of software, meaning that effectively they are repackaged as network services. Objects can be serialized and sent to the legacy-system “wrapper” code, which reconstitutes the objects, operates on them with the old system, and sends them back to the system from which they originated (or sends them on to other systems).
A distributed system can be loosely defined as a system in which data may be operated on by any of several processors. All of these applications are examples of distributed processing, in which software objects are freed from having to run on the individual computers on which they were, so to speak, born.
One form of distributed processing, called an object request broker, or ORB, involves serializing objects and method call arguments and shipping them around for processing on remote systems. One common example of an object request broker is CORBA. CORBA specifies object formats and operations so completely that objects can be created and processed by programs running on different computers on a network, even if the programs were originally written in different languages. We’ll fool around with CORBA in a later column.
JavaBeans serialization
The JavaBeans API Specification spells out specifically the goals of the bean serialization mechanism and how those goals are achieved in the API. JavaBeans persistence is constructed on top of two Java 1.1 features: object serialization (primarily) and introspection (which is built on top of reflection).
Several classes and interfaces were added to the java.io
package to support object serialization. These new classes and interfaces know how to read and write all of Java’s built-in data types like byte, int, double, and so on, so that’s taken care of for you. (Strings are written in a modified form of UTF-8, one of the Unicode Transformation Formats.) The interfaces that describe how to read and write these data types are specified in java.io.DataOutput
and <a href="https://java.sun.com/products/jdk/1.1/docs/api/java.io.DataInput.html">java.io.DataInput</a>
. These interfaces are implemented in various places, including java.io.ObjectOutputStream
and <a href="https://java.sun.com/products/jdk/1.1/docs/api/java.io.ObjectInputStream.html">java.io.ObjectInputStream</a>
. The object input and output streams are used (as we’ll see in the coding examples below) to read and write object contents in accordance with rules specified by the Java Object Serialization Specification document (see Resources below).
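Here is a small sketch of the primitive-level reading and writing those interfaces describe. It uses java.io.DataOutputStream and java.io.DataInputStream, two other standard implementations of DataOutput and DataInput, and it assumes a current JDK (try-with-resources and UncheckedIOException did not exist in Java 1.1). The class name PrimitiveRoundTrip is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
final class PrimitiveRoundTrip {
    // Write an int, a double and a String with the DataOutput methods.
    static byte[] write(int i, double d, String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(i);     // 4 bytes
            out.writeDouble(d);  // 8 bytes
            out.writeUTF(s);     // 2-byte length, then modified UTF-8
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Read the values back with the DataInput methods, in the same order.
    static String read(byte[] data) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(data))) {
            int i = in.readInt();
            double d = in.readDouble();
            String s = in.readUTF();
            return i + " " + d + " " + s;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(write(10, 2.5, "Hallelujah!")));
    }
}
```

The object streams offer these same methods and add writeObject() and readObject() on top of them.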
In order for these new classes in java.io
to work as specified, they often need to know things about the class files that only the Java 1.1 Reflection mechanism can provide. Reflection makes it possible to programmatically figure out a class’ fields and methods. This means it’s possible to write a generic class that figures out how objects are connected (by reference) and which fields each object contains, and then automatically writes the whole mess to a stream in one (recursive) operation. For many Java classes (most notably <a href="https://java.sun.com/products/jdk/1.1/docs/api/java.awt.Component.html">java.awt.Component</a>
), the JavaSoft team at Sun has already done this for you. In order to put your own classes into an ObjectOutputStream
(or create them from an ObjectInputStream
), your class need implement only one of two interfaces: java.io.Serializable
or <a href="https://java.sun.com/products/jdk/1.1/docs/api/java.io.Externalizable.html">java.io.Externalizable</a>
. The latter is the interface that makes operating with other component technologies possible, to name just one example, but going into depth on this is beyond the scope of this column (see Resources). For now, we’ll concentrate on the Serializable
interface.
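To give a feel for what reflection makes possible, here is a rough sketch of a generic routine that discovers an object's fields at run time. It is not how the object streams are actually implemented; FieldLister and Point are hypothetical names, and skipping static and transient fields is included only because default serialization ignores such fields too.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class FieldLister {
    // A made-up bean with two "real" fields and one transient one.
    static class Point {
        int x = 3;
        int y = 4;
        transient String cache = "not part of the saved state";
    }

    // Describe every non-static, non-transient declared field as "name=value".
    static List<String> describe(Object target) {
        List<String> result = new ArrayList<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)
                    || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true);
            try {
                result.add(field.getName() + "=" + field.get(target));
            } catch (IllegalAccessException ex) {
                throw new IllegalStateException(ex);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The order of the reported fields is not guaranteed.
        System.out.println(describe(new Point()));
    }
}
```

A real serializer would go on to follow references to other objects and walk up the superclass chain, which is the recursive part mentioned above.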
Creating a serializable bean: An example
Let’s look at an example of how to make a bean serializable. The interface java.io.Serializable
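declares no methods of its own (it is a marker interface), but the object streams look for two private methods with the following signatures in any class that implements it, and call them if they are present: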
private void writeObject(java.io.ObjectOutputStream out)
    throws IOException;
private void readObject(java.io.ObjectInputStream in)
    throws IOException, ClassNotFoundException;
You see that the java.io.Serializable
methods take a java.io.ObjectInputStream
or java.io.ObjectOutputStream
as an argument. These input/output streams contain the know-how for reading and writing basic data types to and from an underlying stream. If you write fields a, b, c, in that order, to an ObjectOutputStream
, then later, when you read the stream in the order a, b, c, you’ll get back the same objects you wrote (assuming your code’s correct). The ObjectInputStream
and ObjectOutputStream
classes extend java.io.InputStream
and java.io.OutputStream
(respectively), the standard input/output streams we use for files and sockets and so forth. This is great because we can save objects to files and transmit them through networks using the same interface. The object streams enforce such hair-raising details as block alignment and buffering. (Serialized object streaming is buffered to improve performance, and serialized objects’ fields are aligned to block boundaries so that the objects can more easily be manipulated as text.)
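The "write a, b, c, then read a, b, c" rule can be sketched as follows. The example writes to an in-memory byte array instead of a file or socket, which works because the object streams wrap any OutputStream or InputStream. OrderedStreamDemo is a made-up name, and the code assumes a current JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
final class OrderedStreamDemo {
    // Write three values in the order a, b, c.
    static byte[] write() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(10);          // a
            out.writeObject("title");  // b
            out.writeDouble(2.5);      // c
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Read them back in exactly the same order.
    static String read(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            int a = in.readInt();
            String b = (String) in.readObject();
            double c = in.readDouble();
            return a + "," + b + "," + c;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } catch (ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(write()));
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream or a socket's output stream changes where the bytes go, but not the code that writes them.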
When your class claims that it implements java.io.Serializable
, it’s telling Java it can handle writing itself to a stream. Fortunately, Java serialization is designed to do so easily: Your code need only write its fields to the stream as objects in the writeObject()
method, and read them from the stream in the readObject()
method, making sure throughout that the fields are read in the same order in which they were written. Here’s an example of a serializable object, an uninspired little class called SampleObject1
. This is a simple class with just a couple of fields, and we’re going to implement the functions necessary to allow the object to serialize itself.
import java.io.*;

// this is boring, but it gets the point across.
public class SampleObject1 extends java.lang.Object
    implements java.io.Serializable {

    protected int a_;
    protected int b_;
    protected String sTitle_ = "";

    // Necessary to be a well-behaved Bean.
    public SampleObject1()
    {
    }

    // Create manually.
    public SampleObject1(int a, int b, String s) {
        a_ = a;
        b_ = b;
        sTitle_ = s;
    }

    public void print()
    {
        System.out.println("a=" + a_ + "\nb=" + b_ + "\ns=" + sTitle_);
    }

    // How to write myself to a stream
    private void writeObject(java.io.ObjectOutputStream out)
        throws IOException
    {
        out.writeInt(a_);
        out.writeInt(b_);
        // readObject() always calls readUTF(), so a null title is
        // written as an empty string to keep the two methods in step.
        out.writeUTF(sTitle_ == null ? "" : sTitle_);
    }

    // How to load myself from a stream
    private void readObject(java.io.ObjectInputStream in)
        throws IOException, ClassNotFoundException
    {
        a_ = in.readInt();
        b_ = in.readInt();
        sTitle_ = in.readUTF();
    }

    // Properties
    public void setTitle(String sTitle) { sTitle_ = sTitle; }
    public String getTitle() { return sTitle_; }
    public void setA(int aa) { a_ = aa; }
    public int getA() { return a_; }
    public void setB(int bb) { b_ = bb; }
    public int getB() { return b_; }
}
You see above that the class explicitly extends java.lang.Object
. This is because the function that creates beans from a stream must be able to get the object’s class. The java.lang.Object
class provides that ability for free. We also announce to the compiler that our bean implements java.io.Serializable
. A bit further down in the code example, you see the internal string being initialized to an empty string. This keeps the string from being null. If we were to leave it null, the BeanBox would silently and mysteriously ignore the Title property, and only display properties for a and b. After the string is a public zero-argument constructor, required by all JavaBeans so that empty objects may be created. Finally come the implementations of writeObject()
and readObject()
. They simply write or read their state to and from the stream and return.
You’ll find that, if your bean already subclasses existing classes, such as java.awt.Component
, you need only specify that your class implements java.io.Serializable
. The object’s readObject()
and writeObject()
methods are already written for you in the class you’re subclassing. These two methods keep track of all the nasty details such as initializing superclasses and chasing references to other objects. We’re writing our own readObject()
and writeObject()
purely to show how it works.
In fact, you can actually comment out the readObject()
and writeObject()
functions, and this class will still serialize just fine. (Try it!) Why don’t we have to define these functions, when implements Serializable
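seems to say we have to? Because Serializable is only a marker interface: it declares no methods, so there is nothing for the compiler to insist on. Beyond that, every class is a subclass of java.lang.Object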
, and java.io.ObjectOutputStream
knows implicitly how to serialize any object. The class still has to say that it implements Serializable
, though, or the ObjectOutputStream
will throw a NotSerializableException
. In a real (and hopefully useful) class, the only reason you’d want to write your own readObject()
and writeObject()
functions is if you wanted to somehow extend or replace what they do.
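If you’d rather not comment code in and out to see this for yourself, the following sketch shows both halves of the claim: a class with no readObject() or writeObject() survives a round trip as long as it is marked Serializable, and an unmarked class is rejected. The names DefaultSerializationDemo, Marked, and Unmarked are invented for the example, and the code assumes a current JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
final class DefaultSerializationDemo {
    // No writeObject()/readObject(): the default mechanism handles the fields.
    static class Marked implements Serializable {
        private static final long serialVersionUID = 1L;
        int a = 10;
        String title = "Hallelujah!";
    }

    // Same idea, but the class never says "implements Serializable".
    static class Unmarked {
        int a = 10;
    }

    // Returns a deserialized copy, or null if the class is not Serializable.
    static Object roundTrip(Object original) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        } catch (NotSerializableException ex) {
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Marked()) != null);    // copy made
        System.out.println(roundTrip(new Unmarked()) == null);  // rejected
    }
}
```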
Here’s a very simple “main” program that lets us create a file with a serialized bean inside. Serialized bean files (sometimes called “pickle” files, because they contain “pickled” beans) have the extension .ser
by convention. This main program, if its first argument is “w,” reads its next two arguments as integers and the following argument as a string, creates a SampleObject1
from these values, and serializes the object to a .ser
file. If the first argument is “r,” it reads the next argument as a filename, opens that file as an ObjectInputStream
, reads the object from the file by calling ObjectInputStream.readObject()
and casting the result to SampleObject1
, and tells the resulting object to print itself.
import java.io.*;

public class Demo1 {

    // Print the usage message, then signal failure to the caller.
    private static void Usage() throws java.io.IOException
    {
        System.out.println("Usage:\n\tDemo1 w file a b title\n\tDemo1 r file");
        throw new IOException("ERROR");
    }

    public static void main(String[] args)
    {
        try {
            if (args.length == 0)
            {
                Usage();
            }
            String cmd = args[0];
            if (cmd.compareTo("w") == 0)
            {
                if (args.length != 5)
                {
                    Usage(); // UNIX anyone?
                }
                int aa = Integer.parseInt(args[2]);
                int bb = Integer.parseInt(args[3]);
                String ss = args[4];
                SampleObject1 bar = new SampleObject1(aa, bb, ss);
                // Serialize the bean to the named file.
                FileOutputStream f = new FileOutputStream(args[1]);
                ObjectOutputStream s = new ObjectOutputStream(f);
                System.out.println("Write SampleObject1 a=" + aa +
                                   ", b=" + bb +
                                   ", s=\"" + ss + "\"");
                s.writeObject(bar);
                s.flush();
                s.close();
            }
            else if (cmd.compareTo("r") == 0)
            {
                if (args.length != 2)
                {
                    Usage();
                }
                // Read the bean back from the named file.
                FileInputStream f = new FileInputStream(args[1]);
                ObjectInputStream s = new ObjectInputStream(f);
                System.out.println("Read SampleObject1:");
                SampleObject1 bar = (SampleObject1) s.readObject();
                s.close();
                bar.print();
            }
            else {
                System.err.println("Unknown command " + cmd);
                Usage();
            }
        }
        catch (IOException ex) {
            System.out.println("IO Exception:");
            System.out.println(ex.getMessage());
            ex.printStackTrace();
        }
        catch (ClassNotFoundException ex) {
            System.out.println("ClassNotFound Exception:");
            System.out.println(ex.getMessage());
            ex.printStackTrace();
        }
    }
}
You can use this program to create a .ser
file containing a serialized bean, like so:
C:\> java Demo1 w demo1.ser 10 22 Hallelujah!
Write SampleObject1 a=10, b=22, s="Hallelujah!"
(I’m listening to Handel’s Messiah as I write this.) Creating a serialized bean may not be so exciting in itself perhaps, but if you put the SampleObject1
class file and the serialized bean together in a JAR file and use LoadJar to load it into the BeanBox, you’ll see that the serialized “Demo1” bean appears in the BeanBox’s tool window, already initialized and ready to use. (You can get the JAR file from a link in the Resources section below.)
Load the JAR file (using LoadJar), then put a demo1
bean (which is a customized, serialized SampleObject1
bean) on the BeanBox. This is how the BeanBox should look if demo1
is the SampleObject1
with a=10
, b=22
, and Title=Hallelujah!
created above.
“So what?” you may be thinking. Well, look what we’ve done here. With just a few lines of code, we’ve written a program that writes an object out as a file that another program (the BeanBox) can use to recreate the object in its own memory space. Our Demo1 program and the BeanBox are interoperating, albeit at a fairly stupid level. You could also place a SampleObject1
in the BeanBox, customize it, write it to a .ser
file with the File… Serialize Component function, and Demo1
(with the argument “r” and the name of the file) could read the object and print it out. (Try this as an exercise.) This illustrates the principle of how programs can use the serialized object format to interoperate. This interoperation becomes even more interesting when the programs are running on different machines in a network.
Instantiating beans with java.beans.Beans.instantiate()
The code in the Demo1
class above reads an object from a file when it receives the “r” argument. It has to open a file, create a stream, typecast, and so on. There’s an easier way to do this: use the static function java.beans.Beans.instantiate()
. This function instantiates a JavaBean, given a ClassLoader
and a String
, which is the name of the class. (What’s a ClassLoader
, you ask? Don’t ask. I’m not even going to go there in this article. If you pass null
for the first argument, you get the default class loader, and that’s all you need to know for the moment. We’ll look into class loaders in a future column.)
The instantiate()
function does all sorts of cool stuff for you: it figures out if you’re looking for a class or a serialized object, it searches the CLASSPATH
for a .ser
file (when appropriate), and it even handles the object specially if it’s an Applet
. Check out the documentation to find out more. Meanwhile, let’s replace the object reading code from the Demo1 example above with a single call to instantiate()
:
import java.beans.*;

class Demo2 ... {
    // ... code from previous listing...
    else if (cmd.compareTo("r") == 0)
    {
        if (args.length != 2)
        {
            Usage();
        }
        System.out.println("Read SampleObject1:");
        SampleObject1 bar =
            (SampleObject1)Beans.instantiate(null, args[1]);
        bar.print();
    }
}
(See the Resources section below for links to the full source code “Demo2.java”.) The instantiate()
function throws a java.lang.ClassNotFoundException
if it can’t instantiate the object, or a java.io.IOException
if there’s an I/O problem (for example, if the file doesn’t exist).
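Here is one way the two exceptions might be handled, as a self-contained sketch. InstantiateDemo and its load() helper are hypothetical, and java.util.Date stands in for a real bean simply because it is public and has a zero-argument constructor. Note that instantiate() takes a dotted bean name without the .ser extension and does the searching itself.

```java
import java.beans.Beans;
import java.io.IOException;

// Hypothetical demo class, for illustration only.
final class InstantiateDemo {
    // Returns the bean, or null if it could not be found or loaded.
    static Object load(String beanName) {
        try {
            // null selects the default (system) class loader.
            return Beans.instantiate(null, beanName);
        } catch (ClassNotFoundException ex) {
            System.err.println("No class or .ser resource for " + beanName);
            return null;
        } catch (IOException ex) {
            System.err.println("I/O problem loading " + beanName
                               + ": " + ex.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        Object bean = load("java.util.Date");
        System.out.println(bean == null ? "not found" : bean.getClass().getName());
        System.out.println(load("no.such.Bean") == null);
    }
}
```

With demo1.ser somewhere on the class path, load("demo1") should hand back the pickled SampleObject1 in the same way.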
Conclusion
This month, we’ve looked at why and how to freeze-dry your JavaBeans. We discussed the benefits of software component serialization standards, and went over some examples of how to serialize objects in Java. Next month, we’ll look at serializing structures of objects, and discuss how to gain more control over your bean serialization format.