Learn how to use ClassLoaders and exploit the zip library
Some developers consider application maintenance an afterthought. Naturally, you tend to focus on the task at hand, which is, first and foremost, to make your program work! Sure, there might be a few minor bugs here and there, but you can always fix those in the next maintenance release, right? And no, you didn’t have time to include every bell and whistle your users requested, but you can include those features in a subsequent release. When you start thinking along these lines, your project needs a maintenance plan, and the sooner the better.
For a recent Java project at IBM, my team was forced to consider application maintenance from the start. The situation had an interesting set of characteristics: First, the users were mobile and needed disconnected access to the application. Second, the users were dispersed worldwide. Third, the application was expected to evolve quickly. And fourth, the department had limited support resources to help users install upgrades.
Because the users were mobile, we needed to install the application locally on their laptops, rather than having them retrieve the latest bytecodes from a server. Because the users were worldwide and frequent updates were expected (not to mention that we had a small support infrastructure with a limited travel budget), we needed a reliable means of remotely updating the application. We wanted something more seamless than an email with an attached jar file and installation instructions, or an executable packaged with InstallShield. The solution we implemented, which solved our problem nicely, was to store the application bytecodes in the same database the users replicate for the application data. On everyone’s workstation we installed a single jar that we termed the bootstrapper. This unchanging piece of code locates the local database, loads the application-specific bytecodes, and launches the application. When the developers add a new report or fix a bug, we recompile the bytecodes and replace them in the database on the server. The next time users replicate their local database with the one on the server, they receive the new bytecodes along with the other new application data.
What makes this dynamic approach possible is the Java ClassLoader. A ClassLoader is the part of the JVM responsible for finding classes as your application instantiates them. Thanks to the foresight of the Java creators, you can replace the default class loader with one of your own, which loads classes in just about any way you can imagine. Any array of bytes can be interpreted as a class that your program may instantiate at runtime. The bytes may come from files in the local machine or from a server halfway around the world, delivered via a network connection. In our case, we put the bytes into a Lotus Notes database. For convenience, and to minimize the amount of replicated data, we packaged all application-specific files into a compressed zip file. The java.util.zip package lets users read and write files in the zip format.
Meet the players
In this article, I show you how we designed and developed the infrastructure for our application deployment. The components should be reusable for your own applications. Here is a brief preview of each component, which I then describe in more detail:
- The BootStrapper: The class com.paulitech.bootstrap.BootStrapper is the main class we installed on the users’ workstations to let them receive the dynamic application updates. Its command-line arguments include the location of the database and the class to instantiate.
- The Bridge: Our application at IBM happens to use Lotus Notes as the database, but yours may use something else. To decouple the choice of database from the rest of the application, I’ve used a common design pattern known as a bridge (as documented in Design Patterns, by Erich Gamma, et al., see Resources). All access to the database is via the abstract interfaces defined in com.paulitech.bridge. The Lotus Notes-specific implementation of the bridge interface is in com.paulitech.bridge.notes. If your database is something other than Lotus Notes, you will need to create your own bridge implementation. The interface is rather small and simple, and this task should not take long for an experienced JDBC programmer.
- The ZipClassLoader: This is our custom ClassLoader, located in com.paulitech.classloader.ZipClassLoader, which extends java.lang.ClassLoader, the base class of every class loader. Its constructor is passed an array of bytes, retrieved from the database, that represent the zip file. When you need a class, you ask the ZipClassLoader for the class by name. The ZipClassLoader handles all the ugly details of retrieving the proper array of bytes out of its zip file. To the outside world, the ZipClassLoader acts just like the ClassLoader that it extends. This essential component is delivered along with the BootStrapper to the users.
- The FileInstaller: Of course, you need a tool to actually put the zip file into the database so that it may be replicated. com.paulitech.classloader.BridgeFileInstaller is an abstract class that talks to the Bridge database interface described above and writes records to it. The com.paulitech.classloader.notes.NotesFileInstaller is a concrete implementation of the BridgeFileInstaller, which, naturally, uses a NotesBridge. If you are using something other than Lotus Notes for the database, you should create your own installer tool implementation, subclassing BridgeFileInstaller, as I have done for the Notes case.
Grab on to your bootstraps
Below is a class diagram depicting what happens on the user’s workstation. There’s quite a bit of code, so rather than go through it line by line I will describe the algorithms with prose. The full source is available in the sample file (see Resources for a download). At this point, it would be a good idea to follow along with the source code.
A batch file (bootstrap.bat, in the example) kicks off the main() routine in the BootStrapper class, passing it four arguments: the database location, the key that identifies the particular record that contains the bytes of interest, the name of the class you wish to instantiate as the main class, and the prefixes of any application-specific classes (separated by commas). For example, if you work for Widgets USA, all your application-specific classes start with com.widgetsusa, and you’ve included some custom libraries written by a business partner named Gunkle Media, whose classes are all under com.gunklemedia, this fourth parameter to the BootStrapper should be com.widgetsusa,com.gunklemedia. The reason this last parameter is necessary is somewhat obscure, and it took me a great deal of hair-pulling to figure out. I’ll explain more when I discuss custom class loaders.
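To make the four arguments concrete, here is a minimal sketch of how they might be parsed. It is not the code from the sample file: the class name BootArgs, its field names, and the isApplicationClass() helper are all hypothetical, and the sketch assumes a prefix should only match on a package boundary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical holder for the four bootstrap arguments (not the article's class). */
final class BootArgs {
    final String databaseLocation;
    final String recordKey;
    final String mainClassName;
    final List<String> appPrefixes;

    BootArgs(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "usage: <database> <key> <mainClass> <prefix[,prefix...]>");
        }
        databaseLocation = args[0];
        recordKey = args[1];
        mainClassName = args[2];
        // Split the comma-separated prefixes; trim so "a, b" also works.
        appPrefixes = Arrays.stream(args[3].split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** True when the class name sits under one of the application prefixes. */
    boolean isApplicationClass(String className) {
        for (String prefix : appPrefixes) {
            // Assumption: match whole package segments only, so
            // "com.widgetsusa" does not match "com.widgetsusaextra.Foo".
            if (className.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }
}
```

With the Widgets USA example, isApplicationClass("com.gunklemedia.Player") would be true and isApplicationClass("java.lang.String") would be false.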
Bridge over troubled data
Armed with the parameters, the BootStrapper can go to work. First, it creates a NotesBridgeFactory, which is a concrete implementation of an abstract BridgeFactory class. The BridgeFactory, as you might guess from the name, is responsible for generating a Unified Field Theorem. Just kidding! As the name suggests, its sole purpose is to create Bridge objects. What is a Bridge object? A Bridge provides an abstract mechanism for getting data in and out of your system. The core system does not actually care how the data is transferred, just that it does. The data could be going in and out of a database, could be sent and retrieved via a message queuing system, or could be transcribed by human operators onto scraps of pigeon-delivered paper. The point is, your core application talks to something that adheres to the Bridge interface and doesn’t worry about the details. In this particular case, the data is ultimately stored in a local Lotus Notes database that the users replicate periodically with a server. The vagaries and peculiarities of the Notes API (and there are many, trust me!) are hidden behind the abstract interface. This allows you to change the data transport at a later time with minimal impact on your existing code.
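As an illustration of that decoupling, the sketch below shows one shape the abstraction could take. The method names getPayload(), setPayload(), complete(), and dispose() come from this article, but their exact signatures are assumptions, getBridge() is a name I made up for the lookup, and InMemoryBridgeFactory is a hypothetical stand-in for the Notes implementation. The real interfaces in the sample source may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Assumed shape of the bridge; signatures are guesses, not the sample source. */
interface Bridge {
    String getPayload();
    void setPayload(String payload);
    void complete();   // commit whatever was written
}

/** Assumed factory contract: one Bridge per record key. */
abstract class BridgeFactory {
    abstract Bridge getBridge(String key);   // hypothetical method name
    abstract void dispose();
}

/** Hypothetical in-memory transport standing in for the Notes database. */
final class InMemoryBridgeFactory extends BridgeFactory {
    private final Map<String, String> records = new HashMap<>();

    @Override
    Bridge getBridge(final String key) {
        return new Bridge() {
            // Start from the committed value; writes stay local until complete().
            private String pending = records.get(key);

            public String getPayload() { return pending; }
            public void setPayload(String payload) { pending = payload; }
            public void complete() { records.put(key, pending); }
        };
    }

    @Override
    void dispose() {
        records.clear();
    }
}
```

Code written against Bridge and BridgeFactory never mentions the storage behind them, so swapping the in-memory map for Notes, JDBC, or a message queue should not touch the callers.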
The BootStrapper can ask the BridgeFactory for an instance of a Bridge that corresponds to the key passed in. In this example, the file you’re interested in is exampleapp.zip. The BridgeFactory (actually implemented as a NotesBridgeFactory) goes off to the Notes database and attempts to find a document (NotesSpeak for record) that matches that key. When it finds that record, it creates a Bridge (actually implemented as a NotesBridge at runtime) corresponding to that record and returns it. The BootStrapper, using the getPayload() method of the Bridge interface, grabs the string representing the contents of the file. Notice I said “string,” not “array of bytes.” In a perfect world, Notes would handle raw streams of bytes better, but since Notes was designed as an unstructured document database, it does not support pure binary data. You can turn arbitrary arrays of bytes into strings and back again via old-fashioned base-64 encoding. I was able to steal (ahem, reuse) some nice base-64 encoding classes from org.w3c.tools.codec.
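The round trip between bytes and text looks roughly like this. Note the substitution: this sketch uses java.util.Base64, which ships with Java 8 and later, rather than the org.w3c.tools.codec classes the project actually used. PayloadCodec is a hypothetical name.

```java
import java.util.Base64;

/** Hypothetical helper; uses java.util.Base64 instead of org.w3c.tools.codec. */
final class PayloadCodec {
    private PayloadCodec() {}

    /** Turns raw zip bytes into text that a string-only store can hold. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Recovers the original bytes from the stored text. */
    static byte[] decode(String payload) {
        return Base64.getDecoder().decode(payload);
    }
}
```

Base-64 output is about a third larger than its input, which is one more reason to compress the class files before storing them.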
Zipping right along … with a load of class!
So, after the BootStrapper gets the string from the Bridge and decodes it, you have an array of bytes that represent a zip file. Now what? How do you retrieve the classes and instantiate them? The Java runtime environment (JRE) loads new classes and instantiates them via a ClassLoader. When you start the JRE and load your initial class with the main() method, it creates a ClassLoader known as the primordial class loader, which loads your class. Any subsequent classes referenced by that class will use the same ClassLoader that loaded it. This primordial ClassLoader searches for class files in directories and archives that are specified in your classpath. If you want to load classes in another way, you must subclass ClassLoader and do it yourself. This is not as hard as it sounds, as you’ll see. The only requirement is that your subclass implement the findClass() method. This method must acquire the bytes that represent the class, then call defineClass() in the superclass (ClassLoader), which will parse the bytes to make sure they represent a valid class and return the newly created class.
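Here is about the smallest class loader that follows that recipe. It assumes a Java 2 or later runtime, since findClass() was added in Java 1.2 (on a 1.1 JVM you would override loadClass(String, boolean) instead). ByteMapClassLoader and its map of class bytes are hypothetical and exist only to keep the example short.

```java
import java.util.Map;

/** Hypothetical loader that serves classes from an in-memory map of bytes. */
final class ByteMapClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes;

    ByteMapClassLoader(Map<String, byte[]> classBytes, ClassLoader parent) {
        super(parent);
        this.classBytes = classBytes;
    }

    /**
     * Called by the inherited loadClass() only after the parent loader
     * has failed to find the class.
     */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // defineClass() verifies the bytes and throws ClassFormatError
        // if they are not a valid class file.
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Because the inherited loadClass() asks the parent first, a loader written this way never overrides a class the parent can already see. That default matters for the caching problem described next.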
In this case, you don’t want to get the class files from the file system, but rather from a zip file that happens to exist in memory as an array of bytes. Therefore, your custom class loader will be called ZipClassLoader.
The custom ZipClassLoader I have written maintains a cache of the classes it has loaded, so it can quickly return classes that have been used previously. If the class you want is not in the cache, you start to look for it, as I’ll explain. First, you must make sure the class is an application-specific class (com.paulitech.examples.ExampleApp, for instance) and not a system class (such as java.lang.String). If the class is application-specific, you need to get it directly from the zip file, and not the primordial class loader. You must get it from the zip file because the JRE can cache, on disk, certain nonsystem classes indefinitely (presumably for performance reasons). This means, even after a reboot, the primordial class loader, if asked, will return the old version of the class, as if it were a system class! I discovered this the hard way (luckily, during testing), when I attempted to deploy changes and found that sometimes the application continued to exhibit its old behavior no matter what I did (including a reboot). Then, suddenly, the cache would clear out (to this day I’m not sure what triggers it) and the changes would appear! I suspect it has something to do with hotspot compiling classes and caching them somewhere on disk. By always using my custom ClassLoader and never relying on the primordial one, I can ensure that the latest bytecodes from the database are always used.
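One way to express "application classes come from the archive, everything else from the normal loader" is to override loadClass() and check the prefix before delegating. The sketch below is an assumption about how that could be done, not the ZipClassLoader from the sample file. PrefixFirstClassLoader and findBytes() are hypothetical names, and the JVM's own findLoadedClass() stands in for the hand-written cache.

```java
/** Hypothetical loader that bypasses its parent for application classes. */
abstract class PrefixFirstClassLoader extends ClassLoader {
    private final String[] appPrefixes;

    PrefixFirstClassLoader(String[] appPrefixes) {
        super(PrefixFirstClassLoader.class.getClassLoader());
        this.appPrefixes = appPrefixes.clone();
    }

    /** Supplies the raw class bytes, for example from a zip held in memory. */
    protected abstract byte[] findBytes(String className) throws ClassNotFoundException;

    private boolean isApplicationClass(String name) {
        for (String prefix : appPrefixes) {
            if (name.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!isApplicationClass(name)) {
            // System and library classes take the usual parent-first route.
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            // findLoadedClass() returns classes this loader already defined.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                byte[] bytes = findBytes(name);
                loaded = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

The prefix list is what keeps java.lang.String and friends out of this path; without it the loader would try, and fail, to find system classes in the archive.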
Don’t byte off more than you can chew
Java includes utility classes for dealing with zip files and, in Java 1.2, jar files. Despite the built-in library support, working with zip files is very tricky. You’ll find yourself manipulating the contents byte by byte. The Java creators could have spent a little more time making zips easier to use, but fortunately, I’ve done the dirty work for you. The algorithm is basically this: First, wrap a ZipInputStream around a ByteArrayInputStream created with the original array of bytes. Next, loop through the entries in the ZipInputStream until you find an entry that matches the class you are looking for. If you don’t find a match, throw a ClassNotFoundException back to the BootStrapper. If you do find a matching entry, start reading bytes from the stream into a temporary buffer and keep reading until no more exist. Create a new array of bytes of the proper length and copy the bytes from the temporary buffer into it. Once you have the array of bytes representing the class, you can call defineClass() on the superclass ClassLoader. The superclass parses the bytes to make sure they really represent a valid Java class, and returns an instance of Class. This Class is what you return to the BootStrapper. Whew!
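The lookup half of that algorithm can be sketched as follows. This is a simplified illustration and not the code from the sample file: ZipBytes and readClassEntry() are hypothetical names, and a ByteArrayOutputStream replaces the manual buffer copying described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper that pulls one class file out of an in-memory zip. */
final class ZipBytes {
    private ZipBytes() {}

    static byte[] readClassEntry(byte[] zipFile, String className)
            throws IOException, ClassNotFoundException {
        // com.widgetsusa.Foo is stored as com/widgetsusa/Foo.class
        String wanted = className.replace('.', '/') + ".class";
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(wanted)) {
                    continue;
                }
                // read() returns -1 at the end of the current entry,
                // not at the end of the whole archive.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int count;
                while ((count = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, count);
                }
                return out.toByteArray();
            }
        }
        throw new ClassNotFoundException(className);
    }
}
```

The returned array is what you would hand to defineClass(). Scanning the archive from the start on every lookup is simple but linear in the number of entries, so a larger application might index the entries once in the constructor.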
Once the BootStrapper has the reference to the Class, it can call the newInstance() method on it to instantiate it, and the BootStrapper’s work is complete. Any classes referenced by the new instance will be loaded by the same class loader that loaded the new instance, which is, of course, the ZipClassLoader.
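In code, that final step is only a couple of lines. The sketch below is an illustration with a hypothetical Launcher class. It calls getDeclaredConstructor().newInstance() because Class.newInstance() has been deprecated since Java 9; on the Java 1.1 runtime this project targeted, Class.newInstance() was the way to do it.

```java
/** Hypothetical helper showing the last step of the bootstrap sequence. */
final class Launcher {
    private Launcher() {}

    /**
     * Loads the named class through the given loader and calls its
     * no-argument constructor, which is expected to start the application.
     */
    static Object launch(ClassLoader loader, String mainClassName)
            throws ReflectiveOperationException {
        Class<?> mainClass = loader.loadClass(mainClassName);
        return mainClass.getDeclaredConstructor().newInstance();
    }
}
```

The main class needs an accessible no-argument constructor for this to work.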
All the BootStrapper can do is start a new instance of the main class; the constructor of the instantiated class must do the rest. The best way to do this is to make your class implement Runnable and have the constructor create a new thread to start it. Then put your main application logic in run(). For example:
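public class ExampleApp implements Runnable
{
    // The BootStrapper only instantiates this class, so the constructor
    // has to start the application on its own thread.
    public ExampleApp()
    {
        Thread thread = new Thread(this);
        thread.start();
    }

    // Main application logic goes here.
    public void run()
    {
        System.out.println("Mahir kisses you!");
    }
}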
This class only needs to be instantiated in order to perform its task.
Install a file, any file
Now that you’ve seen how the user can load the classes on the fly, you may wonder how the classes got into the database in the first place. Figure 2 shows a diagram depicting the various components of the file installer.
Since the installer is just another program that needs to access the database, it will reuse the Bridge interface designed for the BootStrapper. Only this time, instead of reading from the Bridge, you’ll write to it (using the setPayload() method). The code that actually talks to the Bridge and installs the file is placed in an abstract base class called BridgeFileInstaller, in a method called install(), which takes the path of the file as a parameter. The install() method reads the bytes from the file, encodes the bytes into a base-64 string, asks the BridgeFactory for a Bridge corresponding to the file name, calls setPayload() to send the string to the bridge, and finally calls complete() on the bridge to commit the transaction. This is a generic implementation that uses the Bridge interface, not the more specific NotesBridge interface.
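The sequence inside install() can be sketched like this. Everything here is an assumption made for illustration: the two-method Bridge interface is a cut-down guess at the real one, a Function from key to Bridge stands in for the BridgeFactory, java.util.Base64 replaces the org.w3c.tools.codec encoder, and FileInstallSketch is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.function.Function;

/** Assumed write-side subset of the bridge interface. */
interface Bridge {
    void setPayload(String payload);
    void complete();
}

/** Hypothetical stand-in for the install() logic described above. */
final class FileInstallSketch {
    private FileInstallSketch() {}

    static void install(Path file, Function<String, Bridge> bridgeForKey) throws IOException {
        // 1. Read the whole archive into memory.
        byte[] raw = Files.readAllBytes(file);
        // 2. Encode it as text, since the store only holds strings.
        String encoded = Base64.getEncoder().encodeToString(raw);
        // 3. Look up the record by file name, which doubles as the key.
        Bridge bridge = bridgeForKey.apply(file.getFileName().toString());
        // 4. Write the payload and commit.
        bridge.setPayload(encoded);
        bridge.complete();
    }
}
```

Nothing in the method depends on which database sits behind the bridge, which is what lets a Notes-specific subclass, or any other, reuse it unchanged.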
You can then subclass BridgeFileInstaller to provide your own specific implementation; I call this one NotesFileInstaller. The main() method of NotesFileInstaller takes two arguments: the location of the Notes database and the name of the file to install. NotesFileInstaller creates an instance of a NotesBridgeFactory to initialize it as the BridgeFactory singleton. Then, NotesFileInstaller calls the install() method of its superclass (described above), passing it the name of the file. Finally, you call dispose() on the BridgeFactory to tell it you are finished, allowing it to free up any native resources, close database connections, and so forth.
It is a good idea to create a batch file to automate the file installation, especially if you must update your application frequently. I have provided one for the sample application, installapp.bat.
Future possibilities
What about command-line arguments for your main class? In this case, you were able to pass runtime information to your main application via a properties file contained in the zip file, along with all the class files, so you had no need for command-line arguments. If a properties file is not good enough and you absolutely must have command-line arguments, here is an approach you might take:
- Modify the BootStrapper to take the command-line arguments.
- Place all the arguments destined for the main application into a global Singleton instance of a class designed for holding arguments, such as a Vector (see Design Patterns for a complete description of the Singleton pattern).
- Have your main application reference the singleton to retrieve the arguments.
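A minimal sketch of such an argument holder follows. AppArguments is a hypothetical class, and it uses an unmodifiable List where the steps above suggest a Vector.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical global holder for arguments passed through the bootstrapper. */
final class AppArguments {
    private static volatile List<String> arguments = Collections.emptyList();

    private AppArguments() {}

    /** Called once by the bootstrapper before it instantiates the main class. */
    static void set(String[] args) {
        // Copy the array so later changes by the caller are not visible.
        arguments = Collections.unmodifiableList(Arrays.asList(args.clone()));
    }

    /** Called by the application to read its arguments. */
    static List<String> get() {
        return arguments;
    }
}
```

One caveat with this approach: the holder class should ship in the bootstrapper jar under a package that is not among the application prefixes. A class is identified by its name together with the loader that defined it, so if the custom class loader defined its own copy of the holder, the application would see an empty instance instead of the one the bootstrapper filled in.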
One thing that might be nice is a JDBCBridge implementation, in case you’re using a database that supplies a JDBC driver. Or, you can create a JMSBridge that could use the Java Message Service to retrieve the classes, assuming your users are always connected.
My team chose zip instead of jar as the archive format because support for manipulating jars was not included until Java 1.2. As of this writing, IBM supports Java 1.1.8 only for its internal users, so we went with the zip format. The classes for manipulating zips and jars are virtually identical, so you could create a JarClassLoader with fairly little effort.