Java Tip 122: Beware of Java typesafe enumerations
Think twice before relying on instance identity
Departing from traditional practice for JavaWorld’s Tips ‘N Tricks column, I will talk about when not to use a previously suggested trick. Specifically, the typesafe enum construct, covered in JDC Tech Tips and other publications, can sometimes be hazardous to your code.
Because Java lacks a proper C/C++ enumeration (enum) feature, Java programmers have opted to define simple sets of primitive values:
public class Colors
{
    public static final int GREEN = 0;
    public static final int RED = 1;
    ...
}
This is not particularly typesafe, but it works. You can easily copy and serialize these constants, and then use them for fast switch lookups and so on. In fact, this is how Java language designers originally advised Java programmers to handle Java’s lack of an enumeration feature (see “The Java Language Environment” whitepaper).
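The type-safety gap is easy to see in a small sketch. The Colors constants follow the listing above; the describe() helper and the ColorsDemo class are hypothetical names introduced only for illustration. Because the parameter is a plain int, the compiler accepts any value, including ones that name no color:

```java
// Constants as in the listing above (only the two values the article shows).
final class Colors {
    static final int GREEN = 0;
    static final int RED = 1;

    private Colors() {}
}

// Hypothetical demo class, not part of the original article.
class ColorsDemo {
    // Compile-time int constants can be used directly as switch labels.
    static String describe(int color) {
        switch (color) {
            case Colors.GREEN: return "green";
            case Colors.RED:   return "red";
            default:           return "unknown (" + color + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Colors.RED));
        // Nothing stops a caller from passing an int that is not a color.
        System.out.println(describe(42));
    }
}
```

The switch is fast and the values serialize trivially, but the second call compiles without complaint, which is exactly the weakness the typesafe pattern tries to remove.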
How it’s supposed to work
The typesafe Java enum concept basically replaces the set of primitive constants above with a set of static final object references encapsulated in a class that (possibly) restricts further instantiation. A basic example would be:
public final class Enum
{
    public static final Enum TRUE = new Enum ();
    public static final Enum FALSE = new Enum ();
    private Enum () {}
} // end of class
Because the set of instances is restricted by the private constructor and by the Enum class being final, we can assume that Enum.TRUE and Enum.FALSE are the only instances of the Enum class. Thus, we can use the identity comparison (==) operator instead of the equals() method when comparing enum values. The cost of using the == operator equates to directly comparing pointer values in C/C++. Great, right? We have both type and range safety for enum values while keeping value comparisons efficient.
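A short usage sketch shows both properties. The Answer class mirrors the Enum listing above under a hypothetical name, and isYes() and AnswerDemo are illustrative helpers that do not appear in the original:

```java
// Mirrors the article's Enum class under a hypothetical name.
final class Answer {
    static final Answer YES = new Answer();
    static final Answer NO = new Answer();

    private Answer() {}
}

// Hypothetical demo class.
class AnswerDemo {
    // Only Answer.YES, Answer.NO, or null can be passed here; an int or a
    // String argument is rejected by the compiler.
    static boolean isYes(Answer a) {
        // Identity comparison suffices while each value has exactly one instance.
        return a == Answer.YES;
    }

    public static void main(String[] args) {
        System.out.println(isYes(Answer.YES));
        System.out.println(isYes(Answer.NO));
    }
}
```

The correctness of isYes() rests entirely on the one-instance-per-value assumption, which is the assumption the rest of this tip puts under pressure.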
Is that enough?
Alas, the simple Enum class above lacks a few features. One missing feature is that we cannot pass instances of our Enum class as an argument to an RMI (remote method invocation) or EJB (Enterprise JavaBeans) method. To do that, we must mark the class Serializable:
public final class Enum implements java.io.Serializable
{
    public static final Enum TRUE = new Enum ();
    public static final Enum FALSE = new Enum ();
    private Enum () {}
} // end of class
Ok, so what is wrong with that?
The above has a subtle trap, as shown here:
// uses java.io.*; run where IOException and ClassNotFoundException are handled
ByteArrayOutputStream bout = new ByteArrayOutputStream ();
ObjectOutputStream out = new ObjectOutputStream (bout);
Enum e1 = Enum.TRUE;
out.writeObject (e1);
out.flush ();
ByteArrayInputStream bin = new ByteArrayInputStream (bout.toByteArray ());
ObjectInputStream in = new ObjectInputStream (bin);
Enum e2 = (Enum) in.readObject ();
// is the deserialized value one of the two known instances?
System.out.println ((e2 == Enum.TRUE || e2 == Enum.FALSE));
This code will print out false, indicating that e2 is neither Enum.TRUE nor Enum.FALSE. This happens because deserializing an object creates a new object without regard to the class’s constructors; the instantiation protection that we thought we got from making the Enum constructor private doesn’t affect deserialization.
This could lead to unexpected results in runtime environments like an EJB container, especially since most EJB containers support optimization options to disable the serialization marshalling of method arguments for beans deployed in the same JVM. Your code’s runtime behavior will then depend on this option’s setting — not a very comforting thought, is it?
As pointed out by Joshua Bloch, we must do more to ensure that serialization doesn’t result in illegal Enum instances unexpectedly springing up at runtime. At a minimum, we have to add a readResolve() method and an instance field to use as the real instance ID:
public final class Enum implements java.io.Serializable
{
    public static final Enum TRUE = new Enum (true);
    public static final Enum FALSE = new Enum (false);
    public String toString ()
    {
        return String.valueOf (m_value).toUpperCase ();
    }
    private Enum (boolean value)
    {
        m_value = value;
    }
    private Object readResolve () throws java.io.ObjectStreamException
    {
        return (m_value ? TRUE : FALSE);
    }
    private boolean m_value;
} // end of class
Here, in the readResolve() method, I check the value ID of the instance just created and replace the deserialized instance with one of the static objects.
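The effect of readResolve() can be checked with an in-memory round trip. The sketch below is self-contained; NaiveFlag, ResolvedFlag, ReadResolveDemo, and roundTrip() are hypothetical names chosen for this illustration, and the two flag classes differ only in whether readResolve() is present:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Serializable enum WITHOUT readResolve(): deserialization yields a new instance.
final class NaiveFlag implements Serializable {
    private static final long serialVersionUID = 1L;
    static final NaiveFlag TRUE = new NaiveFlag(true);
    static final NaiveFlag FALSE = new NaiveFlag(false);
    private final boolean value;
    private NaiveFlag(boolean value) { this.value = value; }
}

// Same class WITH readResolve(): the copy is swapped for the canonical instance.
final class ResolvedFlag implements Serializable {
    private static final long serialVersionUID = 1L;
    static final ResolvedFlag TRUE = new ResolvedFlag(true);
    static final ResolvedFlag FALSE = new ResolvedFlag(false);
    private final boolean value;
    private ResolvedFlag(boolean value) { this.value = value; }
    private Object readResolve() { return value ? TRUE : FALSE; }
}

// Hypothetical demo class.
class ReadResolveDemo {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(NaiveFlag.TRUE) == NaiveFlag.TRUE);       // false
        System.out.println(roundTrip(ResolvedFlag.TRUE) == ResolvedFlag.TRUE); // true
    }
}
```

Both classes compile and run without any warning; only the identity check reveals which one is safe to compare with ==.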
Unfortunately, many programmers today are unaware they must implement readResolve() to perform instance substitution during deserialization (this feature was not available before Java 2 either). If we don’t do this, however, we won’t get any compiler or runtime errors; the reference comparison will simply fail each time we compare an Enum value against a deserialized Enum instance. Depending on the enumeration’s size, the amount of work necessary to have a correct and serializable typesafe class may be too much compared to the good old “typeunsafe” pattern (the standard practice of defining simple-minded sets of constants referred to earlier), which lacks this issue.
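For an enumeration with more than two values, a boolean field no longer works as the instance ID. One common approach, sketched here under assumed names (Suit, VALUES, ordinal, and SuitDemo are illustrative and not from the listings above), is to serialize an integer ordinal and use it in readResolve() to index an array of the canonical instances:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical four-valued enum; names are illustrative only.
final class Suit implements Serializable {
    private static final long serialVersionUID = 1L;

    // Must be declared before the constants so it is initialized first.
    private static int nextOrdinal = 0;

    static final Suit CLUBS = new Suit("clubs");
    static final Suit DIAMONDS = new Suit("diamonds");
    static final Suit HEARTS = new Suit("hearts");
    static final Suit SPADES = new Suit("spades");

    // Canonical instances, indexed by ordinal (declaration order).
    private static final Suit[] VALUES = {CLUBS, DIAMONDS, HEARTS, SPADES};

    private final transient String name; // not part of the serialized form
    private final int ordinal;           // the serialized instance ID

    private Suit(String name) {
        this.name = name;
        this.ordinal = nextOrdinal++;
    }

    @Override
    public String toString() { return name; }

    // Swap the freshly deserialized copy for the canonical instance.
    private Object readResolve() { return VALUES[ordinal]; }
}

// Hypothetical demo class.
class SuitDemo {
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Suit.HEARTS);
        System.out.println(copy == Suit.HEARTS);
    }
}
```

Even this version carries maintenance rules that the compiler does not enforce: VALUES must list the constants in declaration order, and reordering constants changes the meaning of data that was serialized earlier. That bookkeeping is part of the extra work the paragraph above refers to.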
Interestingly enough, Sun’s JDK uses the typesafe enum pattern but is not consistent about making all such types Serializable: several Swing typesafe enum classes are not Serializable (for example, javax.swing.text.html.HTML.Tag), while other JDK classes are (for example, java.util.logging.Level in JDK 1.4+).
Dealing with classloaders
Another scenario in which the typesafe enum pattern breaks completely is when the Java runtime loads the Enum class multiple times. Although this sounds obscure, it can happen more easily than you might think.
Consider an EJB invoking a method on another EJB. If the EJBs come from different deployment JAR units, different classloaders may load them. Both deployment JARs could package the Enum class, and the particular details of the container classloader hierarchy can conspire to have the Enum class loaded twice, once by each classloader. If the two EJBs then exchange data that includes the Enum type and the data is not marshalled by means of serialization, relying on object reference identity for comparison will most certainly fail.
Consider another possibility: a JavaServer Page (JSP) or a servlet placing data that includes Enum instances in an HTTP session. If the servlet later reloads (for example, because the JSP updates) and then attempts to compare anything against Enum values left in the session, this will create the same effect of a class in one classloader namespace acquiring data from a different classloader namespace.
The typesafe enum pattern fails in the above cases for a reason different from serialization intricacies: the same class loaded by different classloaders is, strictly speaking, actually a different class each time. The Enum class’s static data will be created anew by each classloader loading the class. Instances of such classes can coexist in the VM, but they will be instances of incompatible types; they cannot be cast to each other, and thus a comparison between them using the == operator can never succeed.
The following code simulates this runtime scenario. This new class, EnumConsumer, will act as something that uses the Enum type:
import java.util.Vector;

public class EnumConsumer implements IEnumConsumer
{
    public Vector getObjects ()
    {
        Vector result = new Vector ();
        result.add (Enum.FALSE);
        result.add (Enum.TRUE);
        return result;
    }
    public void validate (Vector objects)
    {
        // identity checks against this classloader's own Enum instances
        if (objects.get (0) != Enum.FALSE)
            System.out.println ("element 0 [" + objects.get (0) + "] != Enum.FALSE");
        else
            System.out.println ("element 0 Ok");
        if (objects.get (1) != Enum.TRUE)
            System.out.println ("element 1 [" + objects.get (1) + "] != Enum.TRUE");
        else
            System.out.println ("element 1 Ok");
    }
} // end of class
EnumConsumer implements a simple test interface, IEnumConsumer, that we will use to drive EnumConsumer instances across multiple classloader namespaces:
import java.util.Vector;

public interface IEnumConsumer
{
    Vector getObjects ();
    void validate (Vector objects);
} // end of interface
The idea here is simple enough: getObjects() returns a Vector of two possible Enum values in known order. If that Vector is passed into validate(), it will check the expected state of data and complain if something is wrong. Naively, I expect that if I execute getObjects() and send the result of that execution into validate(), then it should never fail. But it can fail, as shown below. The key to making this interesting is to drive this class from the following main() method:
// uses java.io.File, java.net.URL, java.net.URLClassLoader, java.util.Vector
public static void main (String [] args) throws Exception
{
    File loaderClasspathDir = new File ("data");
    loaderClasspathDir.mkdir ();
    // move Enum.class and EnumConsumer.class from "./out/" to "./data/":
    String [] classNames = new String [] {"Enum.class", "EnumConsumer.class"};
    for (int c = 0; c < classNames.length; c ++)
    {
        File source = new File ("out", classNames [c]);
        File target = new File (loaderClasspathDir, classNames [c]);
        if (! target.exists () || (source.lastModified () > target.lastModified ()))
        {
            if (target.exists ()) target.delete ();
            source.renameTo (target);
        }
    }
    // toURI().toURL() replaces the deprecated File.toURL():
    URL [] URLlist = new URL [] {loaderClasspathDir.toURI ().toURL ()};
    // simulate 2 different classloader namespaces: this is namespace #1
    URLClassLoader l1 = new URLClassLoader (URLlist);
    Class c1 = l1.loadClass ("EnumConsumer");
    // getDeclaredConstructor().newInstance() replaces the deprecated Class.newInstance():
    IEnumConsumer obj1 = (IEnumConsumer) c1.getDeclaredConstructor ().newInstance ();
    // ... and this is namespace #2:
    URLClassLoader l2 = new URLClassLoader (URLlist);
    Class c2 = l2.loadClass ("EnumConsumer");
    IEnumConsumer obj2 = (IEnumConsumer) c2.getDeclaredConstructor ().newInstance ();
    // get data to pass between obj1 and obj2:
    Vector objects = obj1.getObjects ();
    // this works as expected:
    obj1.validate (objects);
    // this fails:
    obj2.validate (objects);
}
In the above code, I assume that all classes in the project compile into the out directory, which will be in the system classloader’s classpath when the program runs. I first execute a loop that moves the Enum and EnumConsumer classes into a separate data directory, which classloaders l1 and l2 will use. This simulates Enum and EnumConsumer being packaged together in a deployment unit that l1 and l2 each load independently. The move is necessary so that l1 and l2 cannot obtain these classes by delegating to their common parent, the system classloader. IEnumConsumer is left alone so that instances of the classes returned by ClassLoader.loadClass() can be cast to it. l1 and l2 are then each asked to create an EnumConsumer instance, obj1 and obj2 respectively, and each instance validates the result of obj1.getObjects():
>java -cp out Main
element 0 Ok
element 1 Ok
element 0 [FALSE] != Enum.FALSE
element 1 [TRUE] != Enum.TRUE
The last two lines of output show that obj2 will never be able to use Enum values that obj1 creates. Effectively, obj1 and obj2 have different views of what the Enum class and its values are. Neither the reference comparison (==) nor the equals() method will work.
In fact, fixing the Enum class so it works in this case would require a nontrivial amount of effort. You can override Object.equals() for Enum to use reflection in order to compare class names and m_value values. This will of course eliminate the speed advantage of a typesafe enum type. Besides, it would definitely be too much work for something that has no issues with the old typeunsafe construct in the first place.
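To give a sense of how much work such an equals() involves, here is a minimal sketch. Every name in it (LooseEnum, CopyingLoader, LooseEqualsDemo, foreignTrue()) is hypothetical. It also assumes the compiled LooseEnum class file can be read as a classpath resource, which is how CopyingLoader defines a second copy of the class without moving files around:

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Field;

// Hypothetical stand-in for the article's Enum, with a loader-tolerant equals().
final class LooseEnum {
    static final LooseEnum TRUE = new LooseEnum(true);
    static final LooseEnum FALSE = new LooseEnum(false);

    private final boolean m_value;

    private LooseEnum(boolean value) { m_value = value; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (other == null) return false;
        // Same class name, possibly defined by a different classloader.
        if (!getClass().getName().equals(other.getClass().getName())) return false;
        try {
            // 'other' cannot be cast to this loader's LooseEnum, so the field
            // has to be read reflectively.
            Field f = other.getClass().getDeclaredField("m_value");
            f.setAccessible(true);
            return f.getBoolean(other) == m_value;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }

    @Override
    public int hashCode() { return Boolean.hashCode(m_value); }
}

// Defines its own copy of one named class from the parent's class-file bytes.
class CopyingLoader extends ClassLoader {
    private final String target;

    CopyingLoader(ClassLoader parent, String target) {
        super(parent);
        this.target = target;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(target)) return super.loadClass(name, resolve);
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) throw new ClassNotFoundException(name);
                    byte[] bytes = in.readAllBytes();
                    c = defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }
}

// Hypothetical demo class.
class LooseEqualsDemo {
    // Returns the TRUE constant of a LooseEnum copy defined by a fresh loader.
    static Object foreignTrue() {
        try {
            String name = LooseEnum.class.getName();
            Class<?> copy = new CopyingLoader(LooseEnum.class.getClassLoader(), name)
                    .loadClass(name);
            Field f = copy.getDeclaredField("TRUE");
            f.setAccessible(true);
            return f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object foreign = foreignTrue();
        System.out.println(foreign == LooseEnum.TRUE);       // identity fails
        System.out.println(LooseEnum.TRUE.equals(foreign));  // reflective equals matches
        System.out.println(LooseEnum.FALSE.equals(foreign)); // different value, no match
    }
}
```

Each equals() call now pays for a name comparison and a reflective field read, and callers must remember to use equals() rather than ==, which gives up the main convenience of the pattern.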
Ironically, core JDK classes are in little danger of this happening because they always load precisely once from the same bootstrap classloader. However, we, the developers, can run into this issue with custom-loaded code.
Proceed with caution
The typesafe enum pattern may require too much work to be truly safe in all situations, especially if your runtime involves serialization or a complex classloader structure — typical elements of Java 2 Platform, Enterprise Edition (J2EE) applications. In certain cases, it won’t work at all. As such, the pattern is an unreliable substitute for a basic feature that the Java language lacks: a compiler-supported enum feature. In many cases, you’re better off using the good old set-of-static-primitive-values enumeration.