Robust event logging with Syslog

Syslog is a fast, flexible, and easily extendable logging system

An important part of any project, logging can be used as a debugging tool during development, and a troubleshooting tool once a system has been deployed in a production environment. Because most developers only implement logging if time permits, it is often implemented haphazardly. However, logging provides a way to see what is happening, good or bad, inside a running system. As such, it should be addressed with care and forethought rather than as a last minute burden.

Syslog is an open source logging system that allows programmers to easily add diagnostic and status logging to their Java programs. It separates the act of logging an event from the actual handling of that event. Simple configurations may instruct Syslog to write all log messages to the console. In more elaborate configurations, Syslog can separate messages by almost any criteria, including severity, and send them to different files. These files can rotate based on time or file size and send email across a network via Remote Method Invocation (RMI) or Java Message Service (JMS). API calls allow configuration modification at runtime, and Syslog can save running configuration to an XML file for further refinement at a later time. In the case of application servers hosting multiple applications running in the same VM, message channels allow Syslog to route log messages coming from each application and handle them accordingly.

A list of log message handlers essentially constitutes Syslog. Each handler has a log policy that determines what messages it should consider. If the policy determines that the handler should deal with a given message, the logger uses a message formatter to prepare the message, and the handler then writes the message to a file or sends an email. The architecture is based on interfaces so that custom modules can replace each step.

Syslog provides the following features:

Simple, all-static logging API
Message severities
Pluggable log message handlers
Pluggable log-formatting modules for each handler
Pluggable log policy modules for each handler
Arbitrary log channels
A configuration that can be modified on the fly while running
A running configuration that can be written to a file for reload later
Email integration
Remote logging capabilities — using RMI
JMS integration

Licensed under the GNU LGPL v2, Syslog is free for commercial and noncommercial use, with source code included. Though it was originally based loosely on the Unix Syslog facility, it now shares only its name. In this article, you’ll learn how to easily integrate the Syslog package into your system by adding simple instrumentation to your Java code. You’ll also uncover more elaborate ways to leverage Syslog in projects as an integral part of new system development. Syslog allows developers to concentrate on developing application code, not on how an operations department or hosting provider will handle logging.

Syslog for developers

Two groups of people will benefit from a logging system: developers and operations staffs. Developers need to be able to instrument their code with little effort; otherwise they’ll never perform event logging. As you’ll observe, you can instrument code with Syslog calls in a number of ways, ranging from the simple to the complicated.

Basic usage

Anytime developers want to issue a message to Syslog, they simply call a static method on the com.protomatter.syslog.Syslog class. There’s no need to carry around a reference to an instance of Syslog to make calls. That greatly simplifies usage, as you also don’t need to keep references to a logging object — simply call static methods and that’s it.

You can use 48 methods in the Syslog class to issue messages; however, it’s unlikely that you will use most of them. All of those methods are pass-throughs to a single method that takes six arguments; the other 47 are simply included for convenience. Don’t be intimidated by the large number: Syslog resembles a Swiss army knife in that respect — though it features tools that you rarely use, they will release you from a jam when the time comes.

Developers usually mark up their code with log messages by peppering it with calls to System.out.println(); they invariably have to go back and comment those calls out before the system goes into production. If some calls are missed, operations will ask why the system prints "I'm in here" continuously while running.

One way to become accustomed to working with Syslog is to replace code like this:

  System.out.println("I'm in here");

with code like this:

  Syslog.debug(this, "I'm in here");

That approach requires less work than calling System.out.println() — you actually need fewer keystrokes; and it’s also more useful. Instead of receiving output like this:

  I'm in here

if Test.java made the call, Syslog will produce the code fragment below:

  10:52:53 01/08 [DBUG] Test          I'm in here

If Syslog is set to log the channel and thread name for each message, the output will look like this:

  10:56:44 01/08 [DBUG] [DEFAULT_CHANNEL] [main] Test       I'm in here

Later, when the system goes into production, messages at the “debug” severity level disappear, and nobody will wonder why the system prints "I'm in here" constantly.

Java code is often littered with exception-handling code and, unfortunately, many developers simply catch an exception and call its printStackTrace() method, which sends the stack trace to the System.err stream. Syslog features a utility method for exceptions: calling Syslog.log(this, x) where x is a java.lang.Throwable will produce output like the code below:

  10:40:32 01/08 [EROR] Test          java.lang.Exception: Crap!
  java.lang.Exception: Crap!
          at Test.bar(Test.java:37)
          at Test.foo(Test.java:28)
          at Test.main(Test.java:12)

The toString() method’s output prints as the short description, a stack trace prints as the message detail, and the whole thing logs at the ERROR level. That proves a much better approach than spreading calls to x.printStackTrace() all over the code. As you’ll see later, Syslog can route those kinds of messages to a separate log or even email them to a pager.

Message severity levels

Syslog messages each have an associated severity level that show the importance of each message. The levels are:

DEBUG: Generally referred to as breadcrumbs — useful for letting the developer leave little hints here and there during the initial coding and debugging process.
INFO: Provides informational messages about what’s happening, with notable events including user login and logout.
WARNING: Signifies that something bad has happened, but the system will attempt to recover from it. This message usually indicates that something is starting to go wrong.
ERROR: Indicates that something bad has happened, but the error won’t cause the system to fail in general. A specific task may have failed, and some user may have received an error, but the system will keep running. Exceptions are generally logged at this level.
FATAL: Indicates that something horribly bad has happened, and most likely, the system is of little use to anyone. Someone should probably be paged, and sirens should sound.

It’s important to actually use the different severity levels where they are appropriate. If you log every message in the system at the DEBUG level, the operations team will likely ignore all messages, even those that warrant immediate attention; that defeats the purpose of using severities in the first place.

API methods

Before we go further, let’s look at the most commonly used logging methods that are available in the Syslog class. They fall into two groups:

xxx() and xxxToChannel(): where xxx is DEBUG, INFO, WARNING, ERROR, or FATAL. These are shortcut methods for each severity and have the following forms:

xxx(Object logger, Object msg)
xxx(Object logger, Object msg, Object detail)
xxxToChannel(Object logger, Object chan, Object msg)
xxxToChannel(Object logger, Object chan, Object msg, Object
detail)

log(), which has the following forms:

log(Object logger, Throwable t)
log(Object logger, Throwable t, int level)
log(Object logger, Object msg, Object detail, int level)
log(Object logger, Object chan, Object msg, Object detail, int level)

Each of the above methods is also overloaded to take an InetAddress object as a parameter — I have omitted them for clarity. It’s fairly safe to assume that you will never use them, as Syslog uses them internally for remote logging. The rest of the methods’ arguments are:

Object logger: The object making the log request. Generally, the logger argument is simply this, as in the example Syslog.debug(this, "I'm in here");. If there’s no this reference where the logging call is made, as in a static method, simply pass in the current class object. For example, in the MyClass class, simply pass in MyClass.class instead of this. If you pass in a String, it will be used rather than that class name.
Throwable t: The log methods above that explicitly take a Throwable as an argument are convenience methods. The toString() method on Throwable is called, and the result is used as the msg argument. If the method takes a Throwable and does not take a level argument, the level defaults to ERROR. The Throwable itself is used as the detail argument.
Object msg: A short description of the message. The toString() method of whatever passes in will be called.
Object detail: If included, this argument is treated as a longer explanation of the message. If the object passed as this argument inherits from Throwable, then a stack trace is used.
int level: The severity of the message. There are constants defined in the Syslog class: Syslog.DEBUG, Syslog.INFO, Syslog.WARNING, Syslog.ERROR, and Syslog.FATAL.
Object channel: The channel name for the message, which can be one of two things: String or String[]. If the channel name is String[], the message dispatches to each channel specified in the array. The default value is Syslog.DEFAULT_CHANNEL.

Note: Exact formatting of the above arguments depends on what log-formatting module the logger handling the request uses. The above formatting explanations assume that the logger uses the default formatting module.

Message channels

Syslog supports arbitrary message channels defined on the fly — one of its more advanced features. One VM on a server may host multiple projects, each of which wants to have its logs handled separately. If a channel is not specified, the messages go to a default channel, Syslog.DEFAULT_CHANNEL. The Syslog.ALL_CHANNEL channel broadcasts messages to every channel.

Suppose you have two projects — “BigBank” and “SomeOtherBigBank” — running in the same server VM. You might find it useful to log messages from the BigBank project to the BigBankChannel channel, and messages from the SomeOtherBigBank project to the SomeOtherBigBankChannel channel so that each channel can go to different sets of files, and different sets of people can be notified when bad things happen. There are two ways to accomplish that:

The first approach is to call versions of the log() methods that take a channel argument and specify the channel everywhere. That will quickly prove annoying, and developers will eventually either misspell the channel name somewhere or omit it altogether. The only way to mitigate that is to have some interface in the project that defines a constant like this:

  public interface BigBank
  {
    public static Object CHANNEL = "BigBankChannel";
  }

Then have every class specific to the BigBank project use it when calling log() methods, like so:

  Syslog.debugToChannel(this, BigBank.CHANNEL, "Hello there");

If the project grows fairly large, programmers will tire of this approach, and invariably, messages will leak out to other channels. Also, some classes may be common to both projects. As a result, the SyslogChannelAware interface exists, which is the second approach to logging messages to different channels:

  public interface SyslogChannelAware
  {
    public Object getSyslogChannel();
  }

Any object passed in as the logger argument to one of the log() methods that implements SyslogChannelAware (usually this) is asked what channels it wants to receive the message.

Why is that beneficial? Most objects in projects are built on top of more generic classes, like base classes from frameworks. If each of the base classes used in the system implements SyslogChannelAware, the developers writing subclasses never have to bother with channel names. The base class can read a channel list to use out of a configuration file or from some other source. That works particularly well for Enterprise JavaBeans and servlets since channels can be specified in deployment descriptors and init params for each. This kind of flexibility means that when a system moves into production, different subsystems can be configured to write messages to different log channels without making code changes. Programmers just continue making calls to Syslog.info() and other methods, and the messages are magically routed to the correct channel and handled accordingly.

Syslog internals

In order to get the most out of Syslog, you should understand how it operates and see what’s going on under the hood. At its core, Syslog is pretty simple, and almost entirely based on interfaces, thus allowing custom implementations to override almost anything.

Architecture

The Syslog class itself is a singleton — that is, there is only one instance of it per VM. It contains all configuration information related to where messages route, who listens for those messages, and so forth. Because of its nature, the Syslog jar file should be included in the system CLASSPATH so that only one instance of it exists (app servers commonly use many classloaders), and the class garbage collection should be turned off in the VM (using the -Xnoclassgc argument). Clever configurations can take advantage of the structure of class loaders in application servers to complete more complicated tasks, but that subject reaches outside this article’s scope. If you cannot turn off class garbage collection, then you should retrieve Syslog‘s internal instance using the Syslog.getInstance() method and store it somewhere to prevent the class from unloading.

The following abbreviated diagram illustrates how Syslog looks from the inside. I have omitted many of the provided logger, policy, and text format implementations for clarity.

Syslog maintains a list of objects that implement the Syslogger interface, which is mainly the log(SyslogMessage message) method. Once the log() method is called on the Syslogger, Syslogger should quickly decide if it should react to the sent message. If the message is relevant, then the logger writes the message to a file or does whatever it wants with it.

Default implementations

The default implementation of the Syslogger interface is the BasicLogger abstract class. It is the superclass for all the loggers that are included with Syslog and should be used as the superclass for any custom loggers.

BasicLogger delegates the decision about paying attention to a given message to an implementation of the LogPolicy interface. LogPolicy‘s default implementation is the SimpleLogPolicy class, which knows about log message severity levels and log channels. You can configure a LogPolicy to concentrate only on messages at a certain level, set of levels, or the default mask set overall in Syslog, and to focus only on messages arriving on a certain channel or set of channels.

The BasicLogger also delegates its message-formatting duties to an implementation of the SyslogTextFormatter interface. The default implementation of that interface is the SimpleSyslogTextFormatter class, which can be configured to format log messages in various ways. Refer to the class’s Javadoc for specific configuration information.

During initialization — in the configure() method, part of the Syslogger interface — BasicLogger passes part of its configuration on the specified log policy and text-formatting modules along to the configure() method. You can plug in custom LogPolicy and SyslogTextFormatter implementations with any logger that extends BasicLogger. Refer to each class’s Javadoc for more information about extending or replacing a policy or text formatter.

Performance

The main task that applications perform is usually not logging. Therefore, logging should not take up a significant amount of the system’s resources nor interfere with its operation. The default implementations of loggers, log policies, and text formatters included with Syslog have all been written with performance in mind. Individual Syslogger implementations perform three operations when writing a message to a log file:

Policy check

Syslog asks the policy if it should log a given message or not. This operation occurs most often. Assuming the use of the SimpleLogPolicy policy, the policy check consists of integer bitmask operations and, if channel sets are involved, a HashMap lookup. A policy check runs extremely fast in modern VMs — on the order of millions of calls per second — and inflicts almost no overhead even with extremely high message volumes. If that performance level isn’t satisfactory, set the policy to accept the ALL_CHANNEL channel, and no HashMap lookup will occur. It is also common to use a static final boolean constant around debugging calls so that those calls can turn off completely at compile time, like this:

  public interface DebugFlag
  {
    public static final boolean DEBUG = true; // or false
  }
  ...
  if (DebugFlag.DEBUG)
    Syslog.debug(this, "Here is a debug message.");

Format call

The format call converts a log message to text. The default log formatter has been specifically optimized for speed. It uses StringBuffer objects and generally copies character arrays into those buffers for formatting, caching date format instances, and so forth.

Message write

The logger actually writes the formatted message to a file, sends email, and so forth. Actually writing the message to its final destination can be expensive. The file-based loggers all use buffered streams and have been extensively tested for speed. In addition, they are all necessarily thread-safe, so in the case of multithreaded applications, there can exist some unavoidable contention. Other loggers depend on other systems for their performance — RemoteLog relies on the RMI implementation, and JMSLog depends on the JMS implementation — so if the API providers are slow, the logger will be slow. The MailLog logger uses a background thread to connect to SMTP servers because this operation is generally quite slow.

Syslog includes a performance testing harness for determining the logger speed on a given system. It can test each of the above operations independently and as a whole in single-threaded and multithreaded environments. That proves extremely useful when developing custom loggers, policies, and formatters. Running a multithreaded test on a 700-MHz PIII, JDK 1.3 on Windows NT 4 yields the following performance numbers:

Operation	Average Time	Calls Per Second

Policy check (no channel check)

Very small

23.7 million

Policy check (with channel check)

0.5 µs

1.9 million

Message format

3.8 µs

265,000

Aggregate Syslog.log() call, including file write 57.5 µs 17,400

Syslog performance statistics

Running a single-threaded aggregate log() call test produced an average time of 34.6 µs for a rate of roughly 29,000 calls per second. Writing larger messages will obviously take more time. A message like the one tested above with a larger 200-byte body takes roughly 72 µs for a total rate of around 14,000 calls per second. Thread contention will always remain an issue with multithreaded applications, and Syslog is no exception. The only operation subject to thread contention is the log write; no other operation involves locking.

Syslog for operations

The operations staff also interacts with Syslog. These unsung heroes of production systems are the first people to learn about and handle any problems. Syslog allows the operations staff to completely customize the way they manage messages without changing any source code.

Included loggers

Syslog includes a handful of logger implementations that cover most of the bases as far as log functionality is concerned:

PrintWriterLog: Attached to an instance of java.io.PrintWriter, System.out, or System.err, it mostly writes messages to a console, although it can write to any PrintWriter.

TimeRolloverLog: Perhaps the most widely used logger, it writes to a file that rotates on the minute, hour, day, or month.

LengthRolloverLog: Writes to a file that rotates before it reaches a certain length.

FileLog: Simply writes to a file.

OpenFileLog: Identical to the FileLog in that it writes to a file, except the file opens before and closes after it writes each message. OpenFileLog is a relatively slow logger; you should only use it if there’s a good reason to justify it — such as a third-party log rotating system.

DatabaseLog: Writes messages to a table in a database. It’s useful when multiple machines are involved on a project and you need a unified view from all machines.

MailLog: Connects directly to an SMTP server to send email. It can either send plain text messages — good for pagers — or HTML-formatted messages. Since sending email is a relatively slow operation, this logger uses a background thread to complete the bulk of its work.

JMSLog: Publishes messages to a JMS topic and obeys any transaction context present. You can use JMSLog to send transactional-bounded messages to a remote log server.

RemoteLog: Sends messages to objects bound under the com.protomatter.syslog.remote location in JNDI that implement the RemoteLogReceiver interface. You can use it as a nontransactional transport for messages to remote log servers. RemoteLog includes a default implementation of the RemoteLogReceiver interface and a log server harness.

Each of the above loggers extends BasicLogger and supports pluggable log policies and text-formatting modules. For more information on how to configure these loggers, refer to the Javadoc for each.

Syslog and BEA WebLogic Server

Application servers have become one of the most popular and valuable tools for quickly building Web-based applications. Most, if not all, provide very little in the way of useful logging systems. For several years I have worked with BEA WebLogic Server, which can be configured to initialize Syslog when the server boots. Syslog will work with any Java software and is not tied to WebLogic in any way. Since source code is provided for Syslog, the following example can be easily adapted to work with other application server software.

The Syslog distribution provides a startup class for BEA WebLogic Server, which allows Syslog to initialize when WebLogic boots. To configure WebLogic (4.x or 5.x) to initialize Syslog when it starts, add the following line to the weblogic.properties file:

  weblogic.system.startupClass.Syslog=    com.protomatter.syslog.SyslogT3Startup
  java.system.property.Syslog.config.xml=PathToConfigurationFile

The SyslogT3Startup class reads the configuration file based on the Syslog.config.xml system property. Set the property by specifying it in the weblogic.properties file — as shown above — or by adding -DSyslog.config.xml=PathToConfigurationFile as a command line argument to the JVM in the startup script.

The SyslogT3Startup class implements the weblogic.common.T3StartupDef interface required for startup classes in WebLogic Server. It also implements the weblogic.common.T3ShutdownDef interface and will wait for all loggers to gracefully close their logs when WebLogic shuts down. To configure that behavior, add the following line to the weblogic.properties file:

  weblogic.system.shutdownClass.Syslog=    com.protomatter.syslog.SyslogT3Startup

The Startup & Shutdown link can be accessed through the WebLogic Console in order to configure WebLogic Server 6 to start Syslog when it boots.

If a startup class needs to ensure that Syslog has been configured before it starts, include the following code in the startup() method of the T3StartupDef implementation:

  SyslogT3Startup syslogStartup = new SyslogT3Startup();
  syslogStartup.startup(services);

Syslog and J2EE application servers

Java 2 Platform, Enterprise Edition (J2EE) is an increasingly popular platform for developing enterprise class applications, and Syslog has been developed to integrate with that platform. The J2EE platform provides services not found in regular Java applications; Syslog adaptations that capitalize on J2EE services take the form of custom logger implementations. Although the use of a J2EE-compliant application server brings out extra functionality in Syslog, J2EE is not required.

Using the JMSLog or RemoteLog loggers and the com.protomatter.syslog.SyslogServer command line application, you can route messages to a remote log server for processing. If you use the JMSLog logger, the logging of those messages will obey transactions operating inside the application server.

This is a complicated matter. With that complexity, however, comes a great deal of flexibility. Consider this: A cluster of application servers wants to relay transactionally correct usage information to a remote log server, which will separate messages into different log files and roll them over every hour. However, each machine in the cluster also wants to log all errors locally and send emails to administrators when certain conditions occur. Syslog can do this … here’s how:

Application server(s)

Syslog is configured to start when the application server boots. Its configuration includes the following loggers:

TimeRolloverLog: Configured to write all messages at or above the WARNING severity to a file on the local filesystem and roll it on the hour.
PrintWriterLog: Configured to write all messages at or above the WARNING severity out to the console.
JMSLog: Configured to forward all messages at or above the INFO severity to a JMS topic for further processing on another machine.
MailLog: Configured to send email to an email alias for the on-call operations pager when a message at or above the ERROR severity is logged.

Log server

Runs the com.protomatter.syslog.SyslogServer standalone log server and connects to the JMS topic hosted by the application server cluster. Syslog’s local configuration includes the following loggers:

TimeRolloverLog#1: Several instances of this logger listen to a different log channel — one for each hosted application — and write all messages at or above the WARNING severity out to a file on the local filesystem.
TimeRolloverLog#2: Several instances of this logger listen to a different log channel — one for each hosted application — and write all messages at the INFO severity out to system usage logs on the local filesystem. This TimeRolloverLog could also be a custom logger implementation that collates statistics into reports.
PrintWriterLog#1: Configured to write all messages at or above the WARNING severity out to the console.
PrintWriterLog#2: Configured to write all messages at the INFO severity out to the console, along with the channel and hostname from which each message comes. That produces a unified view of all system activity in the cluster.

Also note that Syslog uses a JMS topic, rather than a queue. Because of that, you could configure multiple log servers to handle different sets of messages or the same messages. Split up the messages either by specifying log policies on each server that select the messages or by using a JMS message selector (or both).

Keep in mind that you can make all these configuration changes at any time without altering a single line of source code in the system. You could even modify the configurations while the system is running. If a given component has problems, start logging all messages coming from that component to a separate log file by using the FileLog logger and the PerClassPolicy log policy, which can filter messages based on the class name producing the message, to select messages.

If the messages do not need to obey transaction contexts in the application server, then simply replace the JMSLog with the RemoteLog in the application server(s) and modify the log server configuration accordingly — a very simple change. Messages will be sent via RMI instead of across a JMS topic. Again, multiple servers can receive those messages.

Syslog and other systems

If you need to configure Syslog from inside another server or application, simply call the Syslog.configure() method and pass in a reference to the desired configuration file. You can also configure Syslog programatically: add, remove, and reconfigure loggers while the system is running using methods in the Syslog class and the individual loggers.

Configuration files

A basic Syslog configuration file looks like this:

  <Syslog defaultMask="DEBUG">
    <Logger name="myPrintWriterLog"
      class="com.protomatter.syslog.PrintWriterLog"
    >
      <stream>System.out</stream>
      <Policy class="com.protomatter.syslog.SimpleLogPolicy">
        <channels>ALL_CHANNEL</channels>
        <logMask>INHERIT_MASK</logMask>
      </Policy>
      <Format
class="com.protomatter.syslog.SimpleSyslogTextFormatter">
        <showChannel>true</showChannel>
        <showHostName>false</showHostName>
      </Format>
    </Logger>
    <!-- as many logger definitions as needed -->
  </Syslog>

The <Syslog> tag in the configuration file must be either the document’s root element or a direct child of the root element so that Syslog’s configuration will integrate with other XML-based systems in the same document.

The defaultMask attribute of the <Syslog> tag must be DEBUG, INFO, WARNING, ERROR, or FATAL. The default log mask is set to accept messages at or above the given level. If this parameter is omitted, the default log mask sets to WARNING.

The <Logger> tag inside the <Syslog> tag specifies a name for each logger, which should be unique, and a Syslogger implementation to use. After the class loads, its default constructor is called, then the logger is named, and the <Logger> element passes to the configure() method so that the logger can configure itself. All XML structures are passed around using JDOM objects.

Syslog will produce the default configuration and write it out to the console if the com.protomatter.syslog.Syslog class invokes from the command line. It will also parse and validate any configuration file passed in as the first command line argument. Default values are also filled in, and the resulting configuration prints to the console.

Conclusion

Syslog has evolved over time into a robust system for handling logging in a wide variety of Java applications. In this article, you’ve learned how to easily integrate Syslog into a Java program without carrying references to loggers in code that need to perform logging. You’ve also learned how to reconfigure Syslog while a system is running. In addition, configuration files can manipulate all aspects of log message handling, which eliminate code changes. Syslog saves valuable development time by eliminating the need for developers to reinvent the wheel each time a new system is built and deployed. Moreover, using Syslog in many projects makes it easier for an operations department to handle successive deployments because there is no need to learn how to configure each new system.

Nate Sammons is a freelance Java developer who
has been writing server-side Java applications since the Servlet
API first appeared, and has worked with BEA’s WebLogic Server since
it was called Tengah. He currently lives in the Boston metro area,
but is planning a return to Denver soon.

Source: www.infoworld.com