Extend Javadoc by creating custom doclets
Automatic code generation is becoming increasingly common in software development, a result of the need to hide complexity from the software developer and the acceptance of various standard and de facto standard application programming interfaces. Hiding complexity from the developer can be demonstrated by creating stub and skeleton classes in CORBA from their interface definition language descriptions and by some object-oriented databases that create the necessary adapter code to persist and retrieve objects from the database.
Java contains many APIs that Java developers regard as de facto standards. The complexity of these APIs ranges from those that constitute the “core” of the Java language to those found in the Java 2 Platform, Enterprise Edition. For example, the Java Database Connectivity API presents a unifying interface for interacting with databases from various companies. Suppose that you want a Java object to be able to persist itself to a database by implementing a simple save() method that maps the object’s attributes to a database table. That method would extract the attributes from the object and use the JDBC API to build up a JDBC statement that is executed against the database. After implementing the save() method for a few classes, you begin to see the similarities in the code structure and the repetitive nature of implementing that method. Often the basic attributes of an object need to be transliterated and “plugged in” to the appropriate Java API. That is when a code generator can be a useful tool to have in your programming toolbox.
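To make the repetition concrete, here is a minimal sketch of what such a hand-written save() method might look like. The StockOrder class, the stock_order table, and its column names are hypothetical and exist only for illustration; the JDBC calls themselves are the standard java.sql ones.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical business object; the table and column names are assumptions. */
class StockOrder {
    static final String INSERT_SQL =
        "INSERT INTO stock_order (symbol, quantity, price) VALUES (?, ?, ?)";

    private final String symbol;
    private final int quantity;
    private final float price;

    StockOrder(String symbol, int quantity, float price) {
        this.symbol = symbol;
        this.quantity = quantity;
        this.price = price;
    }

    /** Maps each attribute to one statement parameter, in column order. */
    void save(Connection connection) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setString(1, symbol);
            statement.setInt(2, quantity);
            statement.setFloat(3, price);
            statement.executeUpdate();
        }
    }
}
```

Every persistent class would repeat this pattern with a different column list and a different sequence of setter calls, which is exactly the kind of mechanical mapping a generator can take over.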
By using a code generator you can automate the process of some tedious, repetitive, and error-prone coding tasks. The fact that you are plugging in to well-known APIs increases the utility of such a tool, since it is applicable to a wide audience of developers. Furthermore, some typically “in-house” domain-specific frameworks can also be considered as fixed API targets for code generators.
A code generator can be a timesaving tool that increases code quality and introduces a more formal and automated approach to part of the development cycle. Another advantage of automated code generation is the synchronization of object definitions across various programming languages. In many tightly bound applications, the same business object (for example, an order to purchase a stock) must be represented consistently in C++, Java, and SQL. The ability to output different representations from a common model is available in various modeling tools; however, I have found it awkward to use those tools to achieve the level of customization required. A dedicated custom code generator is simple enough to create and does not tie you into a specific modeling tool.
The path to Javadoc
The path my team took to choosing Javadoc for code-generation purposes was somewhat long, and probably common. In early implementations, we used Perl scripts to parse a custom metadata grammar in a text file. This was an ad hoc solution, and adding output formats was difficult. Our second, short-lived attempt was to modify an existing Java-based IDL compiler. We soon realized that additional IDL keywords would have to be introduced to send hints to the code generator. Neither extending IDL nor starting from scratch with tools such as lex and yacc (lex splits a source file into tokens, and yacc defines the code that is invoked for each recognized grammar rule) was palatable to me. (See Resources for more information.)
A third, more promising solution was to describe the class metadata using XML. Defining an XML DTD and creating XML documents to describe classes seemed like a natural fit. The documents could then be validated and easily parsed. To avoid starting from scratch, I figured that someone must have tried to create a similar XML DTD, and I soon came across XMI. XMI is a full-blown description of UML using XML, and it is now used as an exchange format between UML tools. (See Resources for more information.)
However, the XML documents that described classes were extremely verbose and difficult to edit manually. There are simply too many seemingly superfluous tags and descriptions to weed through in order for you to change one class attribute. Also, manipulating XML files at the application-domain level can be quite tedious. IBM alphaWorks produces an XMI toolkit that makes the processing of XMI-based XML documents much easier, but the XMI toolkit API for manipulating class descriptions is extremely similar to the Java Reflection or Doclet API. With that in mind, my organization decided to use the doclet approach, which has been successful.
Introducing Javadoc
Javadoc is the program used to create the HTML-format Java API documentation. It is distributed as part of the Java SDK and its output stage is designed to be extensible through doclet creation. The Doclet API provides the infrastructure to access all aspects of a Java source-code file that has been parsed by Javadoc. By using the Doclet API, which is similar to the Reflection API, you can walk through a Java class description, access custom Javadoc tags, and write output to a file. The standard doclet used to produce the HTML documentation does just that; it writes out HTML files as it traverses all the Java source code. More detailed information on Javadoc can be found in Resources.
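For readers who know the Reflection API better, the resemblance mentioned above can be seen in a small sketch that walks a class with java.lang.reflect. This is only an analogy, not Doclet API code: reflection works on compiled classes and cannot see Javadoc comments or tags, which is precisely what the Doclet API adds. The Order class and the output format are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ReflectionWalk {
    /** Hypothetical class to inspect. */
    static class Order {
        private String symbol;
        private int quantity;

        public String getSymbol() { return symbol; }
        public int getQuantity() { return quantity; }
    }

    /** Collects one description line per declared field and method. */
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated members
            }
            lines.add("Field: name = " + field.getName()
                    + ", type = " + field.getType().getName());
        }
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue;
            }
            lines.add("Method: name = " + method.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(Order.class).forEach(System.out::println);
    }
}
```

The doclet shown later in this article has the same overall shape, with ClassDoc, FieldDoc, and MethodDoc taking the place of Class, Field, and Method.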
By creating simple Java classes that contain attributes and some custom Javadoc tags, you allow those classes to serve as a simple metadata description for code generation. Javadoc parses those metadata classes, and custom doclets access the metadata class information to create concrete implementations of the metadata class in specific programming languages such as Java, C++, or SQL. You can also create variations of the standard doclet that produce simple HTML tables describing the metadata class, which would be appropriate to include in a word processing document. Those metadata Java classes serve the same purpose as an IDL description, whose syntax is similar to C++.
Using Javadoc as a code generation tool has several benefits:
- You don’t need to write any parsing code; the parsing of the metadata classes is performed by Javadoc, and presented in an easy-to-use API.
- By using custom Javadoc tags, you add just enough flexibility to define special hooks during code generation.
- Java types are well defined (an int is always 32 bits, for example), so you don’t have to introduce additional primitive type keywords to achieve that level of clarity.
- You can check the Java metadata classes for syntax and other errors by compilation.
Introducing doclets
Before jumping into the doclet used for code generation, I’ll present a simple “Hello World” example that exposes the relevant parts of how to create, run, and play with the Doclet API. The sample code for SimpleDoclet is given below. (You can obtain the source code for this article in Resources.) If you consider this code somewhat lengthy for a true “Hello World” program, the Sun Website presents an even simpler doclet to help you get started. (See Resources.)
package codegen.samples;

import com.sun.javadoc.*;
import java.text.*;

public class SimpleDoclet {

    public static boolean start(RootDoc root) {
        //iterate over all classes.
        ClassDoc[] classes = root.classes();
        for (int i=0; i< classes.length; i++) {
            //iterate over all methods and print their names.
            MethodDoc[] methods = classes[i].methods();
            out("Methods");
            out("-------");
            for (int j=0; j<methods.length; j++) {
                out("Method: name = " + methods[j].name());
            }
            out("Fields");
            out("------");
            //iterate over all fields, printing name, comment text, and type.
            FieldDoc[] fields = classes[i].fields();
            for (int j=0; j<fields.length; j++) {
                Object[] field_info = {fields[j].name(), fields[j].commentText(),
                                       fields[j].type()};
                out(FIELDINFO.format(field_info));
                //iterate over all field tags and print their values.
                Tag[] tags = fields[j].tags();
                for (int k=0; k<tags.length; k++) {
                    out("\tField Tag Name= " + tags[k].name());
                    out("\tField Tag Value = " + tags[k].text());
                }
            }
        }
        //No error processing done, simply return true.
        return true;
    }

    private static void out(String msg) {
        System.out.println(msg);
    }

    private static MessageFormat METHODINFO =
        new MessageFormat("Method: return type {0}, name = {1};");
    private static MessageFormat FIELDINFO =
        new MessageFormat("Field: name = {0}, comment = {1}, type = {2};");
}
The above doclet prints out descriptive information about the classes, methods, fields, and some Javadoc tags of the class SimpleOrder.java, listed below:
public class SimpleOrder {
    public SimpleOrder() { }
    public String getSymbol() {
        return Symbol;
    }
    public int getQuantity() {
        return Quantity;
    }
    /**
     * A valid stock symbol.
     *
     * @see A big book of valid symbols for more information.
     */
    private String Symbol;
    /**
     * The total order volume.
     *
     * @mytag My custom tag.
     */
    private int Quantity;
    private String OrderType;
    private float Price;
    private String Duration;
    private int AccountType;
    private int TransactionType;
}
After compiling these files, you invoke the Javadoc tool using this command:
javadoc -private -doclet codegen.samples.SimpleDoclet SimpleOrder.java
The -private option tells Javadoc to expose private field and method information, and the -doclet option tells Javadoc which doclet to invoke. The last parameter is the file to be parsed. The output of the program is the following:
Loading source file SimpleOrder.java...
Constructing Javadoc information...
Methods
-------
Method: name = getSymbol
Method: name = getQuantity
Fields
------
Field: name = Symbol, comment = A valid stock symbol., type = java.lang.String;
Field Tag Name= @see
Field Tag Value = A big book of valid symbols for more information.
Field: name = Quantity, comment = The total order volume., type = int;
Field Tag Name= @mytag
Field Tag Value = My custom tag.
Field: name = OrderType, comment = , type = java.lang.String;
Field: name = Price, comment = , type = float;
Field: name = Duration, comment = , type = java.lang.String;
Field: name = AccountType, comment = , type = int;
Field: name = TransactionType, comment = , type = int;
The sample code shows that the Doclet API is contained in the package com.sun.javadoc. Since you are plugging in to the Javadoc tool and are not creating a standalone application, Javadoc calls your doclet through the method public static boolean start(RootDoc root).
When Javadoc invokes the start method, RootDoc holds all the information parsed by Javadoc. You can then start to walk through all the parsed classes by invoking the method classes() on RootDoc. That method returns a ClassDoc array describing all the parsed classes. ClassDoc in turn contains methods such as fields() and methods(). These methods return FieldDoc and MethodDoc arrays that describe all the fields and methods of the parsed class. All the “Doc” classes contain the method tags(), which returns a Tag array describing both custom and standard Javadoc tags. The standard tag used in this example is @see.
The out() method simply wraps the standard output, and the MessageFormat class helps format the output according to a fixed template.
Reusable classes for code generation
In light of the above example, I hope you agree that creating your own doclets and extracting the class information using the Doclet API is easy. The next step to parsing the Java classes and generating code to a file is relatively straightforward. To make creating code-generation doclets easier, I developed a small set of interfaces and abstract base classes. The class diagram of these utility classes is shown below.
The interface Maker defines the method signature public void make(ClassDoc classdoc) that you will use to interact with your code generators. The abstract class CodeMaker provides default implementations for manipulating files and indentation, which are common to all code generators. Specific code generators inherit from the abstract base class and provide an implementation of the make method. The make method has the class ClassDoc as an argument, not RootDoc. That causes the Maker to enter the code generation logic at the class level.
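The utility classes themselves are not listed in this article, so the following is only a guess at their general shape. To keep the sketch compilable without the Doclet API on the classpath, the type parameter T stands in for ClassDoc; that substitution, the indent() and outdent() helpers, the four-space indentation unit, and the EmptyClassMaker and CodeMakerDemo classes are all assumptions. The names setFile, out, and endFile are borrowed from the SimpleCodeMaker listing later in this article.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class CodeMakerDemo {
    /** Generates Order.java in a temporary directory and returns its lines. */
    static List<String> generate() {
        try {
            Path dir = Files.createTempDirectory("genclasses");
            new EmptyClassMaker(dir.toString()).make("Order");
            return Files.readAllLines(dir.resolve("Order.java"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        generate().forEach(System.out::println);
    }
}

/** Stand-in for the article's Maker; T plays the role of ClassDoc (assumption). */
interface Maker<T> {
    void make(T classdoc);
}

/** Guess at a CodeMaker base class: shared file and indentation handling. */
abstract class CodeMaker<T> implements Maker<T> {
    private static final String INDENT = "    "; // assumed indentation unit
    private final String name;
    private PrintWriter writer;
    private int level;

    protected CodeMaker(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    /** Opens the output file, creating parent directories if needed. */
    protected void setFile(String fileName) {
        try {
            Path path = Paths.get(fileName);
            if (path.getParent() != null) {
                Files.createDirectories(path.getParent());
            }
            writer = new PrintWriter(Files.newBufferedWriter(path));
            level = 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    protected void indent() { level++; }                  // hypothetical helper

    protected void outdent() { if (level > 0) level--; }  // hypothetical helper

    /** Writes one line at the current indentation level. */
    protected void out(String line) {
        for (int i = 0; i < level; i++) {
            writer.print(INDENT);
        }
        writer.println(line);
    }

    /** Flushes and closes the current output file. */
    protected void endFile() {
        writer.close();
        writer = null;
    }
}

/** Hypothetical concrete maker that writes an empty class for a given name. */
class EmptyClassMaker extends CodeMaker<String> {
    private final String directory;

    EmptyClassMaker(String directory) {
        super("Empty class maker");
        this.directory = directory;
    }

    @Override
    public void make(String className) {
        setFile(directory + "/" + className + ".java");
        out("public class " + className + " {");
        indent();
        out("// generated members go here");
        outdent();
        out("}");
        endFile();
    }
}
```

Keeping file and indentation handling in the base class means a concrete generator only has to decide what to write, not how to write it.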
All classes parsed by Javadoc are looped over in the doclet’s plug-in method start. An example of how that is done (described in the file SimpleMakerDoclet.java) is shown below:
public static boolean start(RootDoc root) {
    ClassDoc[] classes = root.classes();
    //Set up CodeMakers to run
    Maker simplemaker = new SimpleCodeMaker("Description Maker");
    //Iterate through all classes and execute the "make" method of the Maker
    for (int i=0; i < classes.length; i++ ) {
        ClassDoc classdoc = classes[i];
        simplemaker.make(classdoc);
    }
    return true;
}
Following are parts of the code from a simple code generator called SimpleCodeMaker, which performs the same task as the SimpleDoclet previously listed. Instead of sending the output to the screen, SimpleCodeMaker saves it to a file in the subdirectory genclasses. The implementation of the make method is also becoming more structured, with separate methods to process fields and methods. Only the methods make and processMethods are listed here for brevity.
public class SimpleCodeMaker extends CodeMaker {
    public void make(ClassDoc classdoc) {
        Log.log("creating description file",Log.INFO,this);
        setFile("genclasses/" + classdoc.name() + ".txt");
        processMethods(classdoc);
        processFields(classdoc);
        endFile();
    }
    private void processMethods(ClassDoc classdoc) {
        MethodDoc[] methods = classdoc.methods();
        out("Methods");
        out("-------");
        for (int j=0; j<methods.length; j++) {
            out("Method: name = " + methods[j].name());
        }
    }
    ...
}
Organizing multiple Makers
Commonly you need to have many different output formats generated at the same time. So instead of creating an individual doclet for each code generator, you can allow a collection of code generators to run at the same time. You can easily achieve that goal by using the Composite design pattern, which lets you create a CompositeMaker that allows the invoker of a Maker to treat a collection of makers the same way it treats an individual CodeMaker. The CompositeMaker class implements the Maker interface and maintains a collection of code generators. You add a Maker to the composite using the method addMaker:
public boolean addMaker(Maker cm) {
    return m_makers.add(cm);
}
The composite’s make method then simply executes the make method on each Maker in its collection:
public void make(ClassDoc classdoc) {
    for (Iterator i = m_makers.iterator(); i.hasNext();) {
        Maker maker = (Maker)i.next();
        maker.make(classdoc);
    }
}
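Put together, the two fragments above might form a class like the one in the following sketch. It is not the article's source code: the type parameter T again stands in for ClassDoc so that the example runs without the Doclet API, generics replace the Iterator cast, and the lambda-based makers and the CompositeDemo class are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CompositeDemo {
    /** Runs two stand-in makers through one composite and records the calls. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        CompositeMaker<String> makers = new CompositeMaker<>("Demo makers");
        makers.addMaker(className -> log.add("Java: " + className));
        makers.addMaker(className -> log.add("SQL: " + className));
        makers.make("SimpleOrder"); // one call fans out to every registered maker
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}

/** Stand-in for the article's Maker; T plays the role of ClassDoc (assumption). */
interface Maker<T> {
    void make(T classdoc);
}

/** Composite: a Maker that forwards make() to each Maker it holds. */
class CompositeMaker<T> implements Maker<T> {
    private final String name;
    private final List<Maker<T>> makers = new ArrayList<>();

    CompositeMaker(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public boolean addMaker(Maker<T> maker) {
        return makers.add(maker);
    }

    @Override
    public void make(T classdoc) {
        for (Maker<T> maker : makers) {
            maker.make(classdoc);
        }
    }
}
```

Because CompositeMaker is itself a Maker, a composite can also be added to another composite, which lets you group generators without changing the calling doclet.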
Sample CodeMakers
The sample code generators provided, JavaCodeMaker and CppCodeMaker, create simple getters and setters for the data fields defined in the metadata Java class. The SqlCodeMaker creates a simple table definition for use in a database.
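The core of a getter and setter generator is little more than string assembly. The sketch below is an assumption about how such a step could look, not the article's JavaCodeMaker: the AccessorSketch class and its accessors method are hypothetical, and the m_ member prefix and val parameter name follow the generated output shown later in this article.

```java
class AccessorSketch {
    /** Returns Java source for an m_-prefixed field with a getter and a setter. */
    static String accessors(String type, String name) {
        String member = "m_" + name;
        return String.join("\n",
            "public " + type + " get" + name + "()",
            "{",
            "    return " + member + ";",
            "}",
            "public void set" + name + "(" + type + " val)",
            "{",
            "    " + member + " = val;",
            "}",
            "private " + type + " " + member + ";");
    }

    public static void main(String[] args) {
        System.out.println(accessors("int", "OrderType"));
    }
}
```

In a doclet, the two arguments would presumably come from the type and name of each FieldDoc.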
Using the sample code generators for Java, C++, and SQL, the implementation of the doclet’s start method now looks as follows:
public static boolean start(RootDoc root) {
    ClassDoc[] classes = root.classes();
    //Set up CodeMakers to run
    CompositeMaker codemakers =
        new CompositeMaker("Simple Java/CPP/SQL Makers");
    codemakers.addMaker(new JavaCodeMaker("Java"));
    codemakers.addMaker(new CppCodeMaker("C++"));
    codemakers.addMaker(new SqlCodeMaker("SQL"));
    //Iterate through all classes and execute the "make" method of the
    //composite codemaker.
    for (int i=0; i < classes.length; i++ ) {
        ClassDoc classdoc = classes[i];
        codemakers.make(classdoc);
    }
    return true;
}
To demonstrate the use of custom Javadoc tags, I’ve used two custom tags, @enum and @primarykey. The @enum tag is used to create an enumerated type for a specific field in Java and C++. The SQL code generator uses the @primarykey tag to identify the field that will be used as the primary key in the SQL table.
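The article does not list SqlCodeMaker or its output, so the following only sketches what such a generator might do with the field information. The SqlTableSketch class, the Java-to-SQL type mapping, and the VARCHAR length are assumptions. In a real doclet, the name and type pairs would come from each FieldDoc, and the key field would be the one that carries the @primarykey tag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SqlTableSketch {
    /** Assumed mapping from Java type names to SQL column types. */
    static String sqlType(String javaType) {
        switch (javaType) {
            case "int": return "INTEGER";
            case "float": return "REAL";
            case "java.lang.String": return "VARCHAR(255)";
            default:
                throw new IllegalArgumentException("No SQL mapping for " + javaType);
        }
    }

    /** Builds a CREATE TABLE statement from field name to Java type pairs. */
    static String createTable(String table, Map<String, String> fields, String primaryKey) {
        List<String> columns = new ArrayList<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String column = "    " + field.getKey() + " " + sqlType(field.getValue());
            if (field.getKey().equals(primaryKey)) {
                column += " PRIMARY KEY"; // the field marked with @primarykey
            }
            columns.add(column);
        }
        return "CREATE TABLE " + table + " (\n" + String.join(",\n", columns) + "\n);";
    }

    /** Hypothetical input: two fields of SimpleOrder, keyed on Symbol. */
    static String demo() {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps declaration order
        fields.put("Symbol", "java.lang.String");
        fields.put("Quantity", "int");
        return createTable("SimpleOrder", fields, "Symbol");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```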
In the Doclet API, the method tags(String nameOfTag) on the FieldDoc class lets you generate code relating to your custom tag set. That method allows you to ask for a Javadoc tag by name, as in Tag[] tags = fielddoc.tags("enum"). The method makeEnums in the Java and C++ code generators shows the use of the tags method. In the implementation of JavaCodeMaker, enumerations are simply implemented as “static final int” constants. See Resources for more information on implementing enumerated types in Java in a type-safe manner.
In the case of JavaCodeMaker, the metadata description of the OrderType field is defined as:
/**
 * The order qualifier that determines such things as the amount of
 * time in which to leave an order in and at what price to execute an
 * order.
 *
 * @enum MARKET LIMIT STOP ALL_OR_NONE FILL_OR_KILL
 */
private int OrderType;
That description results in the following methods, fields, and enumerations in the generated Java source code file:
public int getOrderType()
{
    return m_OrderType;
}
public void setOrderType(int val)
{
    m_OrderType = val;
}
private int m_OrderType;
public static final int MARKET = 0;
public static final int LIMIT = 1;
public static final int STOP = 2;
public static final int ALL_OR_NONE = 3;
public static final int FILL_OR_KILL = 4;
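The makeEnums method itself is not listed in this article. The sketch below shows one way the text of an @enum tag could be turned into the constants above, assuming the tag text (as returned by Tag.text()) is a whitespace-separated list of names that are numbered from zero in order. The EnumSketch class and the enumConstants method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class EnumSketch {
    /** Turns the text of an @enum tag into "static final int" declarations. */
    static List<String> enumConstants(String tagText) {
        List<String> lines = new ArrayList<>();
        String trimmed = tagText.trim();
        if (trimmed.isEmpty()) {
            return lines; // a tag without names produces no constants
        }
        String[] names = trimmed.split("\\s+");
        for (int value = 0; value < names.length; value++) {
            lines.add("public static final int " + names[value] + " = " + value + ";");
        }
        return lines;
    }

    public static void main(String[] args) {
        enumConstants("MARKET LIMIT STOP ALL_OR_NONE FILL_OR_KILL")
            .forEach(System.out::println);
    }
}
```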
Conclusion
Using these examples as a starting point, you can develop more elaborate code generators for use within the J2EE architecture or within your own custom frameworks.