Crafting Metadata

Decouple applications and their details using properties, XML, and cryptography

When a story is retold, its central plot doesn’t change. Depending on the author and audience, the plot may be dressed in a different context (settings, scene sequences, characters, tone, and dialogue), but the story itself, the core of what happens, isn’t reinvented. New incarnations of ancient myths populate silver screens, books, even television sitcoms, and legal-thriller writers reiterate the same stories with new casts.

By using application metadata, wise developers can accomplish a similar feat. Metadata, or data about data, describes software in an external and fluid manner that replaces internal hardcoded constants. By stashing those details in a separate editable text file, you allow yourself and other users/developers to drastically alter application behavior without writing new code. The metadata deployment descriptors of Java Enterprise applications testify to that flexibility. Take a look at the following increasingly sophisticated uses of metadata and consider how you might add the technique to your own coding repertoire.

Traditional configuration files

Consider a basic application that connects to a database and issues a SQL query via JDBC. To connect to the database, you’ll need a number of parameters: driver, URL, user name, and password. An example of how those values are typically initialized is listed in the MetadataBasicSample.java source file in the Resources section.

Listing 1 contains a skeletal version of the code:

Listing 1. Hardcoded JDBC application

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class MetadataBasicSample {
    
    private String dbDriver;
    private String dbURL;
    private String dbUser; 
    private String dbPassword;
    public static void main(String args[]) {
        // ... CODE SNIPPED ... //    
    }
    
    /** Constructor simply calls init(). */
    public MetadataBasicSample() {
        init();
    }
    
    /** Load property values into data members. */
    private void init() {
       try {
            // set the properties
            this.dbDriver = "oracle.jdbc.driver.OracleDriver";
            this.dbURL = "jdbc:oracle:thin:@127.0.0.1:1521:DEV2";
            this.dbUser = "monkey";
            this.dbPassword = "password";
            
            // load the driver
            Class.forName(dbDriver);
        } catch (Exception e) {
            // keep the original exception as the cause so the failure can be diagnosed
            throw new RuntimeException("UNABLE TO INIT, EXITING...", e);
        }
    } 
    /** Retrieves a Connection using the hardcoded parameters. */
    private Connection getConnection() 
        throws SQLException {
        return DriverManager.getConnection(dbURL, dbUser, dbPassword);
    }
    
}

The init() method initializes the data members with hardcoded values, so if any parameter needs changing, the code must be edited and recompiled. The application and its details are tightly coupled, making for an unnecessarily complex system.

A better design calls for decoupling the data from the application — moving the data into a text-based configuration file that you can update separately from the code.

The database parameters, even in their hardcoded format, suggest a data structure in which properties are of name=value format. The Properties class possesses a nifty load() method for loading such data into true Properties objects. Using the ClassLoader, any application can discover and transform any properties text file into a Properties object. Listing 2 shows a text file named MetadataPropsSample.props designed for that purpose.

Listing 2. Plaintext metadata properties file

dbDriver=oracle.jdbc.driver.OracleDriver
dbURL=jdbc:oracle:thin:@127.0.0.1:1521:DEV2
dbUser=monkey
dbPassword=password

You can place the new text file in a directory listed in your system CLASSPATH. For instance, if your CLASSPATH includes the directory /home/code/lib, then try saving the props file in that directory.

The init() method in the above example needs only slight alteration to make use of the new props file. The full code is listed in MetadataPropsSample.java source (available in Resources). Below is the relevant snip of the changes.

Listing 3. The init() method now loads the properties file

    private void init() {
        try {
            // figure out the name of the props file
            String propsfilename = getClass().getName() + ".props";
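            // this approach will read from the top of any CLASSPATH entry
            Properties p = new Properties();
            try (InputStream is = getClass().getResourceAsStream("/" + propsfilename)) {
                // getResourceAsStream() returns null when the file is not found
                if (is == null)
                    throw new IllegalStateException(propsfilename + " not found in CLASSPATH");
                // load the file into the Properties object; the stream is closed automatically
                p.load(is);
            }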
            // set the properties
            this.dbDriver = p.getProperty("dbDriver");
            this.dbURL = p.getProperty("dbURL");
            this.dbUser = p.getProperty("dbUser");
            this.dbPassword = p.getProperty("dbPassword");
            
            // load the driver
            Class.forName(this.dbDriver);
        } catch (Exception e) {
            // keep the original exception as the cause so the failure can be diagnosed
            throw new RuntimeException("UNABLE TO INITIALIZE, EXITING...", e);
        }
    }

To avoid hardcoding the name of the props file, I’m assuming that the file will have the same name as the class accessing it. That may not be the case in a more complex application; the file name is often represented as a hardcoded constant such as:

public static final String PROPS_FILENAME = "MyProps.config";

The properties file delimiter, by the way, need not be an = character; a colon or whitespace (a space or a tab, for example) also separates a key from its value. Check the official javadocs for java.util.Properties for the full rules and further examples.
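As a quick illustration of those delimiters, the sketch below loads three lines that use =, a colon, and a space as separators. The class name PropsDelimiterSketch is hypothetical, and the text is read from a String rather than a file only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropsDelimiterSketch {

    /** Parses properties text held in a String rather than in a file. */
    public static Properties parse(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p;
    }

    public static void main(String[] args) throws IOException {
        String text =
              "dbDriver=oracle.jdbc.driver.OracleDriver\n" // '=' separator
            + "dbUser: monkey\n"                            // ':' separator
            + "dbPassword password\n";                      // whitespace separator
        Properties p = parse(text);
        System.out.println(p.getProperty("dbDriver"));
        System.out.println(p.getProperty("dbUser"));
        System.out.println(p.getProperty("dbPassword"));
    }
}
```

Only the first unescaped separator on a line ends the key, so the colons inside a JDBC URL value such as the one in Listing 2 are read as part of the value.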

At any rate, you get the idea. The ClassLoader finds the file because it is in a directory in the CLASSPATH, the application accesses the file as a raw InputStream, and the Properties object is created via the Properties.load() method. That is the format adopted by most application configuration files.

XML configuration files

The property file created above is fine for simple configurations, but it doesn’t lend itself to representing more complex data relationships. The database properties I’m using are all related to the notion of a database, but the file lists them loosely and without any suggested organization or encapsulation. If configuration properties fit into a natural hierarchy or collection, application metadata should cleanly represent such structures.

XML-based metadata encourages that type of object-oriented organization. The XML version of the properties text file shown in Listing 4 illustrates the concept.

Listing 4. XML-based metadata file

<?xml version="1.0"?>
<metadata>
<database>
  <dbDriver>oracle.jdbc.driver.OracleDriver</dbDriver>
  <dbURL>jdbc:oracle:thin:@127.0.0.1:1521:DEV2</dbURL>
  <dbUser>monkey</dbUser>
  <dbPassword>password</dbPassword>
</database>
</metadata>

Now for the code changes. MetadataXMLSample.java employs a standards-compliant Simple API for XML (SAX) parser (in that example, the open source Xerces parser available from Apache; see Resources) to parse the new XML file. More specifically, the code accesses the parser and offers three new methods that tell it how to handle the start tags, character data, and end tags the parser will find. Since the sample code is the SAX event handler, I adjust it so that it extends HandlerBase. Listing 5 contains the relevant changes.

Listing 5. Skeletal version of the XML-accessing sample application

public class MetadataXMLSample extends HandlerBase {
   public static final String 
        DEFAULT_PARSER_NAME = "org.apache.xerces.parsers.SAXParser";
        
   // ... CODE SNIPPED ... //
   
   /** Load property values into data members. */
   private void init() {
        try {
            
            // figure out the name of the props file
            String propsfilename = getClass().getName() + ".xml";
            
            // parse
            Parser parser = ParserFactory.makeParser(DEFAULT_PARSER_NAME);
            parser.setDocumentHandler(this);
            parser.setErrorHandler(this);
            parser.parse(propsfilename); 
            
            // load the driver
            Class.forName(this.dbDriver);
            
        } catch (Exception e) {
            // keep the original exception as the cause so the failure can be diagnosed
            throw new RuntimeException("UNABLE TO INITIALIZE, EXITING...", e);
        }
    }
    
    // ... CODE SNIPPED ... //
    
    /* ========================================================= */
    /*               XML SAX HANDLER METHODS                     */
    /* ========================================================= */
    
    /** This data member handles the incoming xml values. */
    private StringBuffer buffer = null;
    
    public void startDocument() {}
    
    public void startElement(String name, AttributeList attrs) {
        buffer = new StringBuffer();
    }
    
    public void characters(char ch[], int start, int length) {
        buffer.append(ch, start, length);
    }
    
    public void endElement(String name) {
        
        String value = buffer.toString().trim();
        
        if (name.equals("dbDriver")) 
            this.dbDriver = value;
        else if (name.equals("dbURL"))
            this.dbURL = value;
        else if (name.equals("dbUser"))
            this.dbUser = value;
        else if (name.equals("dbPassword"))
            this.dbPassword = value;
    }
}

If you’re acquainted with the SAX API, that class should look familiar. If you’re new to XML and need to learn the basics, browse through Resources.
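HandlerBase, Parser, and ParserFactory belong to the original SAX 1 API and were later deprecated in favor of SAX 2. For readers on a newer JDK, the same handler logic might be sketched with DefaultHandler and the JAXP SAXParserFactory as follows. The class name MetadataSax2Sketch and the choice to collect values in a Map are assumptions made for illustration, not part of the article's sample code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class MetadataSax2Sketch extends DefaultHandler {

    private final Map<String, String> values = new HashMap<>();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        buffer.setLength(0); // start collecting text for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = buffer.toString().trim();
        // container elements such as <database> hold only whitespace; skip them
        if (!value.isEmpty()) {
            values.put(qName, value);
        }
        buffer.setLength(0);
    }

    /** Parses the XML text and returns element names mapped to their text. */
    public static Map<String, String> parse(String xml) throws Exception {
        MetadataSax2Sketch handler = new MetadataSax2Sketch();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n<metadata>\n<database>\n"
            + "  <dbDriver>oracle.jdbc.driver.OracleDriver</dbDriver>\n"
            + "  <dbUser>monkey</dbUser>\n"
            + "</database>\n</metadata>";
        System.out.println(parse(xml));
    }
}
```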

Another benefit of using XML-based metadata is that it can be described and constrained by further XML documents. For instance, you may wish to impose logical rules on your application parameters. In the sample above, it makes little sense to require a validating parser: since the file is constructed by a trusted source (you), the validation overhead would be an unnecessary hindrance.

However, imagine that you develop a hatred for certain database drivers and decree that only certain drivers are permissible. You could use an XML Schema definition to enforce such a rule for that XML configuration file. XML-based metadata makes adding such logic a snap whereas the properties-based version of the configuration file would require additional internal application logic (precisely what you’re trying to avoid) to accomplish the same result.
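The driver rule described above might be sketched as follows, using an XML Schema enumeration and the javax.xml.validation API that ships with later JDKs. The schema text, the class name DriverSchemaSketch, and the single permitted driver are assumptions chosen to match the sample file in Listing 4; they are not part of the article's source code.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class DriverSchemaSketch {

    // Hypothetical schema: dbDriver must be one of the enumerated values.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='metadata'><xs:complexType><xs:sequence>"
      + "<xs:element name='database'><xs:complexType><xs:sequence>"
      + "<xs:element name='dbDriver'><xs:simpleType>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:enumeration value='oracle.jdbc.driver.OracleDriver'/>"
      + "</xs:restriction></xs:simpleType></xs:element>"
      + "<xs:element name='dbURL' type='xs:string'/>"
      + "<xs:element name='dbUser' type='xs:string'/>"
      + "<xs:element name='dbPassword' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the metadata document satisfies the schema. */
    public static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document broke a schema rule
        }
    }

    /** Builds a metadata document around the given driver name. */
    public static String sample(String driver) {
        return "<?xml version=\"1.0\"?><metadata><database>"
            + "<dbDriver>" + driver + "</dbDriver>"
            + "<dbURL>jdbc:oracle:thin:@127.0.0.1:1521:DEV2</dbURL>"
            + "<dbUser>monkey</dbUser><dbPassword>password</dbPassword>"
            + "</database></metadata>";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(sample("oracle.jdbc.driver.OracleDriver")));
        System.out.println(isValid(sample("some.other.Driver")));
    }
}
```

Permitting another driver then means adding one enumeration value to the schema, with no change to the application code.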

Checking for modifications

The visage of inflexible trouble glares from within those first few examples: the configuration files are checked only once, when the application starts. The init() method is never called again. That is a fairly common scenario. Recall the mantras: “We made the changes but need to restart the server” and “Reboot for the changes to take effect.”

Ideally, however, applications will notice changes in metadata and reinitialize or update themselves. A simple way of accomplishing that is illustrated in the next example, MetadataRecheckSample.java.

Here I have taken MetadataXMLSample.java and had it implement java.lang.Runnable. I’ve also added a Thread that will check the file for modifications, an interval in milliseconds at which the file will be checked, and a run() method in which to do the actual checking.

I may want to change the interval at which the file is checked, so I’ve made that interval a configurable property as well. Unlike the database properties, however, that interval element need not be required, so I’ve added a constant to the Java source file to supply a default value. Listing 6 contains the new XML file, MetadataRecheckSample.xml.

Listing 6. XML metadata file with the added recheck interval element

<?xml version="1.0"?>
<metadata>
<!-- 
    This value is in milliseconds. The application will check
    this file for modifications based on this interval.
-->
<configRecheckInterval>60000</configRecheckInterval>
<database>
  <dbDriver>oracle.jdbc.driver.OracleDriver</dbDriver>
  <dbURL>jdbc:oracle:thin:@127.0.0.1:1521:DEV2</dbURL>
  <dbUser>monkey</dbUser>
  <dbPassword>password</dbPassword>
</database>
</metadata>

The relevant additions to the source code are detailed in Listing 7.

Listing 7. Skeletal version of the sample application that periodically checks the XML metadata file for modifications

public class MetadataRecheckSample extends HandlerBase implements Runnable {
         
    private long lastMod;
    private long configRecheckInterval;
    private Thread updater;
    
    public static final long
        DEFAULT_RECHECK_INTERVAL = 60000; // every minute

    // ... CODE SNIPPED ... //

    /** Constructor now starts the re-checker Thread. */
    public MetadataRecheckSample() {
        configRecheckInterval = DEFAULT_RECHECK_INTERVAL;
        init();
        this.updater = new Thread(this);
        updater.start();
    }
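
    /** Periodically re-reads the metadata file if its timestamp has changed. */
    public synchronized void run() {
        for (;;) {
            try {
                wait(configRecheckInterval);
                File propsfile = new File(getClass().getName() + ".xml");
                if (propsfile.lastModified() != lastMod)
                    init();
            } catch (InterruptedException te) {
                // treat interruption as a request to stop checking
                Thread.currentThread().interrupt();
                return;
            }
        }
    }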
    
    // ... CODE SNIPPED ... //
    
}

As you can see, the Thread periodically checks the modification timestamp of the XML properties file. If it differs from the last-known modification time, the init() method is called again, the properties file is reparsed, and the application values are updated.

While offering proof of concept, that simple example would need reworking before being deployed in a true production application. You would need to locate the XML file by absolute path and consider the threading model carefully. You’ll also need to consider whether it’s worthwhile to check for modifications at all, as doing so incurs overhead that might not be necessary if updates are seldom or never expected to occur or if restarting the application is a trivial issue.
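One possible reworking along those lines is sketched below: the file is resolved to an absolute path once, and the periodic check runs on a scheduled daemon thread rather than inside the application object itself. The class name ConfigWatcherSketch, the onChange callback, and the one-second interval in main() are all assumptions for illustration; the article's own sample uses a plain Thread and wait().

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ConfigWatcherSketch {

    private final File file;
    private final Runnable onChange;
    private long lastMod;

    public ConfigWatcherSketch(File file, Runnable onChange) {
        this.file = file.getAbsoluteFile(); // pin the location once
        this.onChange = onChange;
        this.lastMod = this.file.lastModified();
    }

    /** Runs the callback if the timestamp changed; returns whether it did. */
    public synchronized boolean checkOnce() {
        long current = file.lastModified();
        if (current == lastMod) {
            return false;
        }
        lastMod = current;
        onChange.run();
        return true;
    }

    /** Schedules checkOnce() on a daemon thread at the given interval. */
    public ScheduledExecutorService start(long intervalMillis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "config-recheck");
            t.setDaemon(true); // do not keep the JVM alive just for rechecks
            return t;
        });
        ses.scheduleWithFixedDelay(() -> { checkOnce(); },
            intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        return ses;
    }

    public static void main(String[] args) throws Exception {
        File f = new File(args.length > 0 ? args[0] : "MetadataRecheckSample.xml");
        ConfigWatcherSketch watcher = new ConfigWatcherSketch(
            f, () -> System.out.println("metadata changed, re-initializing"));
        ScheduledExecutorService ses = watcher.start(1000);
        Thread.sleep(5000); // edit the file during this window to see the callback
        ses.shutdownNow();
    }
}
```

Keeping the timestamp comparison in checkOnce() separate from the scheduling makes it easy to exercise without waiting on a timer. Note that a scheduled task that throws an exception is not run again, so a production callback would want its own error handling.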

Security and metadata

Configuration files such as those are typically stored as plain text, which is convenient for updating. But deployed production applications may require more careful treatment.

I am paranoid. In my sample application, I don’t feel comfortable leaving my database user name and password in plain view of everyone with filesystem access. I turn to cryptography.

Cryptography may sound intimidating, sparking visions of complex algorithms and bit-level mathematics, but the Java security and cryptography packages make crypto tasks relatively simple to apply.

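You’ll need to install the Java Cryptography Extension (JCE) and a provider (the SunJCE provider included in the JCE will do fine) to execute the following example. Pointers to that can be found in Resources. On J2SE 1.4 and later, the JCE and the SunJCE provider ship with the JDK, so no separate installation is needed.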

I wrote a simple utility, CryptoUtil.java, to encrypt and decrypt files. I then wrote MetadataSecureSample.java, a new class based on Listing 7 above that uses CryptoUtil to decrypt the XML file before it is parsed. I also reencrypt the file in a finally block: I don’t want it sitting around in plain text after I’m done with it, and the finally block ensures it is reencrypted even if an Exception is thrown during parsing.

Though woefully inadequate for enterprise deployment, a simple secret key will serve to illustrate the concept. The first time it’s used, CryptoUtil will generate such a key. A sample key is also available in the article source code.

You’ll need to encrypt the MetadataSecureSample.xml file before using it in that example. Invoke CryptoUtil.encryptFile() from the command line, pass MetadataSecureSample.xml as an argument, and it creates MetadataSecureSample.xml.enc.
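The CryptoUtil source is not reproduced in this article, so the following is only a hypothetical sketch of the kind of secret-key encryption such a utility might perform. It assumes a JDK whose javax.crypto supports AES in GCM mode, and it works on byte arrays rather than files; the class name CryptoSketch and its method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class CryptoSketch {

    private static final int IV_BYTES = 12;  // nonce length used for GCM here
    private static final int TAG_BITS = 128; // authentication tag length

    /** Generates a fresh 128-bit AES key. */
    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    /** Returns a random IV followed by the ciphertext. */
    public static byte[] encrypt(SecretKey key, byte[] plain) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = c.doFinal(plain);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    /** Reads the IV from the front of the data and decrypts the rest. */
    public static byte[] decrypt(SecretKey key, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
        return c.doFinal(data, IV_BYTES, data.length - IV_BYTES);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        byte[] secret = "<dbPassword>password</dbPassword>".getBytes(StandardCharsets.UTF_8);
        byte[] enc = encrypt(key, secret);
        System.out.println(new String(decrypt(key, enc), StandardCharsets.UTF_8));
    }
}
```

Working on byte arrays lets the decrypted metadata be parsed from memory, so a plaintext copy never has to be written to disk. Where the key itself is stored remains the hard part and is outside the scope of this sketch.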

The init() method undergoes a few changes, too, as noted in Listing 8.

Listing 8. The init() method with decryption and encryption

    private void init() {
        File encryptedFile = null;
        File decryptedFile = null;
        try {
            
            // figure out the name of the props file
            String propsfilename = getClass().getName() + ".xml";
            // decrypt the file
            encryptedFile = new File(propsfilename + CryptoUtil.FILE_EXTENSION);
            decryptedFile = CryptoUtil.decryptFile(encryptedFile);
           // ... CODE SNIPPED ... //
                 
            // set last modified
            lastMod = encryptedFile.lastModified();
        } catch (Exception e) {
            // keep the original exception as the cause so the failure can be diagnosed
            throw new RuntimeException("UNABLE TO INITIALIZE, EXITING...", e);
        }
        finally {
            // re-encrypt the file after using it
            try {
                if (decryptedFile != null)
                    encryptedFile = CryptoUtil.encryptFile(decryptedFile);
            } catch (Exception cre) { 
                System.err.println("UNABLE TO RE-ENCRYPT FILE!");
            }
        }
    } 

Cryptographic operations carry significant overhead, and you wouldn’t want to incur that cost often. Since the code reads the properties at timed intervals rather than upon request, that should not cause problems, but it is something to consider when using any sort of cryptography. As in any design decision, you must weigh the costs and gains carefully with regard to your specific application.

Metadata best practices

How do you figure out which properties are best left outside the application and which are best left in the source code? While that decision is largely subjective, below are a few suggestions.

Choosing metadata

The following kinds of details should find a home in the application metadata:

  • File paths. Root locations for files (HTML, GIFs, MPEGs, and so on) often need to change according to the deployment environment.
  • Security settings. Whether the application uses simple security levels or full-blown Access Control Lists and certificates, such data should be configurable.
  • Names. Names used to bind resources to file systems, networks, or registries should be modifiable.
  • Server names, IP addresses, and ports. That is really part of the names category above, but it bears particular mention. Networked applications should treat those settings abstractly.
  • Data source information. Information about persistent storage — whether relational, file, object-oriented, or something else entirely — should ideally be decoupled from the application. That helps enforce architectural decoupling of business logic and data layers.
  • Presentation layer settings. Particularly in client-side applications, users will expect to be able to configure things such as colors and fonts according to their own preferences.
  • Field and value validation rules. Though those sorts of things require some intelligent coding to properly evaluate, the benefit of moving validation rules (for example, “this value has to contain those characters but only when this other field is longer than two characters”) into application metadata is that you can alter such frequently updated logic without touching code.
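To make the last item concrete, here is a minimal sketch of one simple kind of validation rule kept in metadata: a regular expression per field, stored in a properties file. The validate.<field> key convention and the class name ValidationRulesSketch are assumptions for illustration, and real rules that depend on other fields would need a richer format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Pattern;

public class ValidationRulesSketch {

    private final Properties rules;

    public ValidationRulesSketch(Properties rules) {
        this.rules = rules;
    }

    /** A field with no rule configured is accepted as-is. */
    public boolean isValid(String field, String value) {
        String regex = rules.getProperty("validate." + field);
        return regex == null || Pattern.matches(regex, value);
    }

    public static void main(String[] args) throws IOException {
        // Stands in for a properties file; note that backslashes in a
        // properties value must be doubled, so character classes such as
        // [0-9] are easier to maintain than \d.
        Properties p = new Properties();
        p.load(new StringReader(
              "validate.zip=[0-9]{5}\n"
            + "validate.user=[a-z]{3,12}\n"));

        ValidationRulesSketch v = new ValidationRulesSketch(p);
        System.out.println(v.isValid("zip", "02139"));
        System.out.println(v.isValid("user", "Monkey!"));
    }
}
```

Tightening a rule then means editing one line of the properties file rather than recompiling.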

Choosing code

You can consider leaving the following sorts of details in the application code:

  • Default values for any of the types of parameters mentioned above. If you supply default values such as those in Listing 7, you can overcome null value problems with the metadata file itself. Sometimes, of course, you want to insist that a parameter be present (example: specification of a database URL). When you do choose to provide default constants, specifying them as static and final makes it clear that they are treated as such.
  • Data members that are dangerous. If you have some obscure parameter that has a number of internal dependencies or one that tests reveal should really never be changed, you may not wish to risk application safety or liveness by exposing it to the possibility of alteration.
  • Data members you can’t seem to document or even justify. If you find that you can’t clearly explain the purpose of a configuration element, you may be causing more confusion than good by extracting it. That may also be a signal that you should review the code in question. Metadata isn’t a solution for spaghetti code; even if the spaghetti isn’t cleaned, it should not leak into config files.
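The first item above, defaults for optional parameters and insistence on required ones, might be sketched like this. The helper names and the class name DefaultsSketch are hypothetical; the property name and default value follow Listings 6 and 7.

```java
import java.util.Properties;

public class DefaultsSketch {

    public static final long DEFAULT_RECHECK_INTERVAL = 60000; // every minute

    /** Optional parameter: fall back to the constant when absent or malformed. */
    public static long recheckInterval(Properties p) {
        String raw = p.getProperty("configRecheckInterval");
        if (raw == null) {
            return DEFAULT_RECHECK_INTERVAL;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_RECHECK_INTERVAL;
        }
    }

    /** Required parameter: fail fast with a message naming the missing key. */
    public static String required(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("dbURL", "jdbc:oracle:thin:@127.0.0.1:1521:DEV2");
        System.out.println(recheckInterval(p));   // no interval set, so the default applies
        System.out.println(required(p, "dbURL"));
    }
}
```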

Conclusion

To explore examples of time-tested and well-edited metadata, take a look at the Apache Web server. The Apache httpd.conf file, while lengthy (approximately 950 lines), contains well-commented metadata properties of mixed name=value and XML format. Within the Java field, you should investigate the J2EE deployment descriptors for EJB, WAR, and EAR archives. Deployment descriptors are complex metadata files that can describe the transactional behavior, data management, naming schemes, access control, and other characteristics of Java Enterprise archives.

The history of engineering is cluttered with examples of complex systems in which interface, implementation, and contextual details are intertwined rather than decoupled. Object-oriented programming calls for a separation of interface and implementation, and metaprogramming calls for a further separation of implementation and contextual details. Use of application metadata configuration files is merely a small, simple example of how developers might exploit metaprogramming techniques. But use it well, and you may find that your applications, like stories recast for new audiences, transcend what even you had intended.

Patrick Sean Neville claims his first coding experience was helping to crack the WWII German Enigma machine, something that is not likely considering he was born in the early ’70s. What is certain is that he has led the development of applications for advertising agencies, television companies, new media companies, and financial services companies across the country. His technical writings and opinion are often published in various online sources and print magazines. He is the founder of the Code Studio, an active contributor to open source projects, and a senior software engineer on the Allaire JRun development team.

Source: www.infoworld.com