Take control of the servlet environment, Part 2

Alternatives to servlet session management

In Part 1 of this series, we introduced the Rudimental Servlet Extension Framework (RSEF) and delved into its bowels, exposing the potential power of intercepting communications between your servlets and the servlet engine. In Part 2, we will introduce a concrete implementation of one of the wrappers to show you how to extend and use the power of the framework. That example will allow you to take control of session management, facilitating a flexible plug-and-play mechanism. You can switch from client-stored sessions, to in-memory server sessions, to persistent database sessions without hacking your existing servlets. Each flavor solves a unique problem in the Web-based application genre.

Take control of the servlet environment: Read the whole series!

Part 1: Invisibly extend the functionality of the servlet API

Part 2: Alternatives to servlet session management

Part 3: Beware of the cookie monster

What is a session?

The interaction between a Web browser and a Web server is stateless. The browser connects to the server, requests a single piece of information, then disconnects, at which point the server completely forgets about the transaction and awaits the next request. Sessions are traditionally used to create a state for Web-based communications. Essentially, they are dictionaries of name-value pairs that persist from one request to the next.

Behind the scenes, most servers store the session data in memory and map it to the corresponding browser via a special cookie. When the browser connects to the server for the first time, the server assigns it a unique identification code and tells the browser to save that code as a cookie. Every future request from the browser includes the cookie, which the server uses to look up the session data stored in memory.
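The lookup just described can be sketched in a few lines. This is only an illustration: SessionRegistry, newSessionId(), and sessionFor() are hypothetical names, and a real servlet engine’s bookkeeping is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a servlet engine's in-memory session table.
class SessionRegistry
{
    // cookie value (session ID) -> that browser's name-value pairs
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();
    private long nextId = 1;

    // First visit: assign an ID that the browser would save as a cookie.
    String newSessionId()
    {
        String id = "S" + ( nextId++ );
        sessions.put( id, new HashMap<>() );
        return id;
    }

    // Later visits: the cookie value finds the same dictionary again.
    // Returns null for an ID this server has never issued.
    Map<String, Object> sessionFor( String cookieValue )
    {
        return sessions.get( cookieValue );
    }

    public static void main( String[] args )
    {
        SessionRegistry registry = new SessionRegistry();
        String cookie = registry.newSessionId();              // first request
        registry.sessionFor( cookie ).put( "user", "tom" );
        System.out.println( registry.sessionFor( cookie ).get( "user" ) ); // later request
    }
}
```

The important detail is that the table lives in one server’s memory, which is exactly what causes trouble once several servers share the load.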

Alternatives to the cookie include encoding the session ID into all of the URLs on the page being served or identifying the browser by the client’s IP address, but those options are either too cumbersome or unreliable. URL encoding requires that you pass each link through code, which is tedious and, if you are using a template system, impractical. Using the client’s IP isn’t reliable because the client might be behind a proxy that allows multiple machines to share a single IP address on the outside network.

What is wrong with sessions?

Nothing is wrong with the concept of sessions, but the way the server handles sessions can produce problems. Storing session data in memory precludes the effective use of load balancing on a farm of Web servers. If a browser were directed to a different server each time it connected, it would accumulate a separate session on each server it visited. And, of course, those sessions would not synchronize with each other, leading to complete pandemonium.

The most common solution is to use a load-balancing mechanism that makes a browser sticky to a particular server. The load-balancing mechanism remembers which browsers visited which servers and ensures that they keep returning to the same place. Figure 1 illustrates the separate copies of each session across the farm of Web servers.

Figure 1. Session management can become futile

Making a client sticky to a single server turns the server into a single point of failure. If the server crashes, the client loses all its session data. If the server undergoes an excessive load, and thus responds slowly, the user experience degrades. In addition, it is entirely possible that a majority of abnormally active users will coincidentally be routed to a single server, overburdening that server more than its brethren.

An alternative

The obvious solution, to anybody familiar with multitier applications, is to move the sessions out of the Web servers and into a central point of reference. Then, no matter which Web server in the farm a browser connects to, it will receive the same session, in the state the last request left it. This central point of reference could be a database (JDBC), a remote object (Remote Method Invocation/Enterprise JavaBeans), a naming server (JNDI), or even a cookie (assuming the browser can handle cookies). Figure 2 shows the session relocated from the Web server to the database.

Figure 2. Session management centralized

In Part 2, we will use a database for storing session data.
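Before turning to the real table, the difference between Figure 1 and Figure 2 can be modeled with plain collections. The sketch below is an assumption-laden toy: FarmServer is a hypothetical class, and a java.util.Map stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one Web server in the farm. The Map passed to the
// constructor stands in for wherever that server keeps its session data.
class FarmServer
{
    private final Map<String, Map<String, Object>> store;

    FarmServer( Map<String, Map<String, Object>> store )
    {
        this.store = store;
    }

    void put( String sessionId, String name, Object value )
    {
        store.computeIfAbsent( sessionId, id -> new HashMap<>() ).put( name, value );
    }

    Object get( String sessionId, String name )
    {
        Map<String, Object> session = store.get( sessionId );
        return ( session == null ) ? null : session.get( name );
    }

    public static void main( String[] args )
    {
        // Figure 1: each server has private memory, so server B misses the cart.
        FarmServer a1 = new FarmServer( new HashMap<>() );
        FarmServer b1 = new FarmServer( new HashMap<>() );
        a1.put( "abc", "cart", "3 items" );
        System.out.println( b1.get( "abc", "cart" ) );

        // Figure 2: both servers share one central store, so server B sees it.
        Map<String, Map<String, Object>> central = new HashMap<>();
        FarmServer a2 = new FarmServer( central );
        FarmServer b2 = new FarmServer( central );
        a2.put( "abc", "cart", "3 items" );
        System.out.println( b2.get( "abc", "cart" ) );
    }
}
```

Nothing in the servers themselves changes between the two cases; only the location of the store does, which is the property the wrapper approach relies on.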

The database

To plug database-stored sessions into the RSEF, we have to implement a version of SessionWrapper that reads from and writes to a database.

As stated above, a session is just a dictionary of name-value pairs, so our database table needs to reflect that. The table also needs a way to remember which session each data pair belongs to. Here’s what the table looks like:

SESSION_MASTER
--------------
SESSION_ID  CHAR(50)    KEY
NAME        CHAR(50)    KEY
VALUE       CHAR(1024)
DATE        DATETIME

(We assume that you have a basic knowledge of database theory, as the subject expands beyond the scope of this article. In addition, for purposes of brevity and simplicity, we will ignore the issues of char versus varchar versus clob or blob fields in relation to indexing, keys, etc.)

The DATE column time-stamps the session entries, thus accommodating expiration or garbage-collection services. You’ll see how the following code examples handle this column, but it will not be part of the overall discussion.

Now that we have laid the foundation for storing session data, we need to code the logic for utilizing the database. This logic will be neatly packed into our first wrapper.

The wrapper

The concept of a wrapper is simple. RSEF allows us to intercept any references to the session object via the SessionWrapper. So, whenever a servlet needs to put data into a session, retrieve data from a session, or complete any function related to the session’s data, we intercept the request and map it to our database table.

The first thing we need to do is create our wrapper class:

(Note: The following code examples use the SQLUtil tool discussed in “Clever Facade Makes JDBC Look Easy,” Thomas Davis (JavaWorld, May 1999). The class name has changed to JdbcFacade since the original publication. The tool hides much of the code bloat required by the JDBC API; it also hides connection pooling behind the scenes. Regardless of whether you’ve read the article or not, you should be able to understand what the code is doing.)

package net.rudiment.servlet.session.database;
public class SessionWrapper extends net.rudiment.servlet.SessionWrapper
{
    private static final long expiration = net.rudiment.util.Times.oneDay;
    private JdbcFacadeFactory _factory;
    public SessionWrapper( RequestWrapper request, ResponseWrapper response,
                           HttpSession session, JdbcFacadeFactory factory )
    {
        super( request, response, session );
        this._factory = factory;
        load();
    }
}

We won’t go into the details of the JdbcFacadeFactory; all you need to know is that it produces instances of JdbcFacade, which are used to communicate with the database. Since the factory is required in the constructor, your bootstrap servlet (see Part 1) must provide it. Here’s how your SessionWrapperFactory, from the bootstrap, might look:

    new SessionWrapperFactory()
    {
        public SessionWrapper wrapSession(
            RequestWrapper request,
            ResponseWrapper response,
            HttpSession session )
        {
            return(
                new net.rudiment.servlet.session.database.SessionWrapper(
                    request,
                    response,
                    session,
                    new JdbcFacadeFactory()
                    {
                        public JdbcFacade getInstance() throws SQLException
                        {
                            return( new com.xyzzy.util.JdbcFacade() );
                        }
                    }
                )
            );
        }
    }

You may also disregard the expiration variable. It sets the age beyond which old session data can be garbage-collected, but that too is outside the scope of this article.

So what does that enigmatic load() method do? It loads all the session values from the database into memory. Traditionally, the data stored in a session is a handful of strings. But, since the servlet API supports it, data placed into the session may be any arbitrary object. To store an object in the database, it must be serialized; likewise, when data is retrieved, it must be deserialized. Since serialization also extends beyond the scope of this discussion, all of the magic disappears behind the Serialize object in the following code. All you need to know is that Serialize returns a string (not byte array) representation of any object. The wrapper passes each piece of data loaded from the database up to its superclass, which, for purposes of this article, we assume stores it in volatile memory. Here’s the code:

    protected void load()
    {
        JdbcFacade util = null;
        try
        {
            util = this._factory.getInstance();
            util.setSQL( "select name, value from session_master " +
                         "where session_id = ? and date >  ?" );
            util.setString( 1, getId() );
            util.setDate( 2, new Date( System.currentTimeMillis() - expiration ) );
            ResultSet rset = util.executeQuery();
            while( rset.next() )
            {
                String name = rset.getString( "name" );
                Object obj = Serialize.objectFromString( rset.getString( "value" ) );
                if( obj != null )
                {
                    super.putValue( name, obj );
                }
            }
        }
        catch( SQLException e )
        {
            System.err.println( e );
        }
        finally
        {
            if( util != null )
            {
                util.close();
            }
        }
    }
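The Serialize helper used by load() is not shown in this article. For readers who want something concrete, here is one way such a helper might work, using standard Java serialization followed by Base64 encoding so that the result fits a character column. StringSerializer is a hypothetical name, and nothing here is claimed to match the real Serialize class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Base64;

// Hypothetical stand-in for the article's Serialize helper.
class StringSerializer
{
    // Serialize the object to bytes, then Base64-encode the bytes as text.
    // Returns null if the object cannot be serialized.
    static String objectToString( Object value )
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try( ObjectOutputStream out = new ObjectOutputStream( bytes ) )
            {
                out.writeObject( value );
            }
            return Base64.getEncoder().encodeToString( bytes.toByteArray() );
        }
        catch( IOException e ) // includes NotSerializableException
        {
            return null;
        }
    }

    // Reverse the process. Returns null when the text cannot be decoded,
    // which matches the null check in load().
    static Object objectFromString( String text )
    {
        if( text == null )
        {
            return null;
        }
        try
        {
            byte[] raw = Base64.getDecoder().decode( text );
            try( ObjectInputStream in = new ObjectInputStream( new ByteArrayInputStream( raw ) ) )
            {
                return in.readObject();
            }
        }
        catch( IOException | ClassNotFoundException | IllegalArgumentException e )
        {
            return null;
        }
    }

    public static void main( String[] args )
    {
        java.util.Date original = new java.util.Date( 86400000L );
        String text = objectToString( original );
        System.out.println( original.equals( objectFromString( text ) ) );
    }
}
```

Keep in mind that the encoded text is longer than the raw bytes, so the width of the VALUE column limits how large a stored object can be.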

Why do we load them all at once, rather than wait for them to be requested? Isn’t that a waste of time and memory? I’m glad you asked. From the get point of view, it does seem rather inefficient. The wrapper loads all of the data even though none of it might be requested, in which case it wastes the time required to load the data and the memory required to store it. But you need to step back and look at the whole picture. How does this appear from the set point of view? Suppose each put were written through immediately: if I put a piece of data into the session, the wrapper serializes the data and writes it to the database. If I then request that data within the same execution context of the servlet, the data must be loaded back out of the database and deserialized. That round trip becomes expensive as the number of puts and gets increases.

But performance isn’t the only downside of writing through on every put. That approach also hides a nasty and obscure bug. Take, for example, the following code that might appear somewhere in one of your servlets:

    Date date = new Date( yesterday );
    session.putValue( "date", date );
    date.setTime( tomorrow );

And then this code in another servlet:

    out.println( session.getValue( "date" ) );

Which value appears on the page: yesterday or tomorrow? The answer should be tomorrow. Changing the date object’s state after placing it into the session is sloppy but legal, and because the session holds a reference to the object, a later getValue() must reflect the change. If our wrapper had immediately serialized that object and written it to the database, the retrieval code would have seen yesterday instead. We’ll admit it: an early draft of RSEF contained such a bug.
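The effect is easy to reproduce without a database. In the sketch below, which uses hypothetical names throughout, clone() plays the part of an immediate serialize-and-write, while a plain map entry plays the part of the in-memory session that is saved only at the end of the request.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: a copy taken at put time versus a live reference.
class SnapshotTimingDemo
{
    // Returns { time seen through the early copy, time seen through the reference }.
    static long[] observedTimes( long yesterday, long tomorrow )
    {
        Map<String, Object> writeThrough = new HashMap<>(); // copies immediately
        Map<String, Object> inMemory = new HashMap<>();     // keeps the reference

        Date date = new Date( yesterday );
        writeThrough.put( "date", date.clone() ); // snapshot taken too early
        inMemory.put( "date", date );
        date.setTime( tomorrow );                 // legal change after the put

        return new long[] {
            ((Date)writeThrough.get( "date" )).getTime(),
            ((Date)inMemory.get( "date" )).getTime() };
    }

    public static void main( String[] args )
    {
        long[] seen = observedTimes( 1000L, 2000L ); // arbitrary placeholder timestamps
        System.out.println( "early copy sees " + seen[0] );
        System.out.println( "live reference sees " + seen[1] );
    }
}
```

The early copy is stuck at the first timestamp, while the live reference reports the second, which is the behavior servlet code expects from a session.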

In the end, we don’t override putValue or getValue. We only access the database twice: once to load all the data and once to store the data. But how in the world does the session object know when to write all the data back into the database? Certainly you wouldn’t be so callous as to force the programmer to call a save() method at the end of each servlet? Of course not. save() does exist, but the programmer need not know about it. It is called at the end of the RSEF’s version of the service() method in its HttpServlet class:

    try
    {
        ((SessionWrapper)wrappedRequest.getSession( true )).save();
    }
    catch( ClassCastException e )
    {
        e.printStackTrace();
    }

We catch ClassCastException because the session might not have been wrapped at all.
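If you would rather not use an exception for an expected case, an instanceof test achieves the same result. The sketch below is not RSEF code: PlainSession stands in for HttpSession, SavingSession for the wrapper, and saveIfWrapped() is a hypothetical helper.

```java
class SaveGuard
{
    // Hypothetical stand-ins: PlainSession plays the role of HttpSession,
    // SavingSession the role of the RSEF SessionWrapper.
    interface PlainSession { }

    static class SavingSession implements PlainSession
    {
        boolean saved = false;
        void save() { saved = true; }
    }

    // Calls save() only when the session really is a wrapper.
    // Returns true when save() was called.
    static boolean saveIfWrapped( PlainSession session )
    {
        if( session instanceof SavingSession )
        {
            ((SavingSession)session).save();
            return true;
        }
        return false; // an unwrapped session: nothing to persist
    }

    public static void main( String[] args )
    {
        System.out.println( saveIfWrapped( new SavingSession() ) );
        System.out.println( saveIfWrapped( new PlainSession() { } ) );
    }
}
```

Both forms behave the same for the caller; the instanceof version simply makes the unwrapped case an ordinary branch rather than an error path.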

The save() method simply iterates through all the data stored in the superclass and writes each entity to the database. Rather than keep track of which values are already in the database (an exercise for the reader), it simply attempts to insert each one, and upon failure, assumes a primary key constraint violation and reverts to an update:

    public void save()
    {
        String[] names = super.getValueNames();
        for( int x = 0; x < names.length; x++ )
        {
            String name = names[x];
            Object value = super.getValue( name );
            writeToDatabase( name, value );
        }
    }
    protected void writeToDatabase( String name, Object value )
    {
        JdbcFacade util = null;
        Date now = new Date( System.currentTimeMillis() );
        String ser = Serialize.objectToString( value );
        try
        {
            util = this._factory.getInstance();
            util.setSQL( "insert into session_master " +
                         "( value, date, session_id, name ) " +
                         "values ( ?, ?, ?, ? )" );
            util.setString( 1, ser );
            util.setDate( 2, now );
            util.setString( 3, getId() );
            util.setString( 4, name );
            util.executeUpdate();
        }
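        catch( SQLException e ) // assume a primary key constraint violation
        {
            try
            {
                // If the facade could not even be obtained, there is nothing to retry.
                if( util == null ) throw e;
                util.reset();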
                util.setSQL( "update session_master " +
                             "set value = ?, date = ? " +
                             "where session_id = ? and name = ?" );
                util.setString( 1, ser );
                util.setDate( 2, now );
                util.setString( 3, getId() );
                util.setString( 4, name );
                util.executeUpdate();
            }
            catch( SQLException e2 )
            {
                System.err.println( e2 );
            }
        }
        finally
        {
            if( util != null )
            {
                util.close();
            }
        }
    }
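For readers who want to attempt the exercise of tracking which values are already in the database, one possible starting point is a small set of names. StoredNameTracker and its methods are hypothetical and are not part of RSEF; the idea is that load() reports each row it reads, removeValue() reports each deletion, and save() asks which statement to issue.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers which names are believed to have a row
// in session_master, so save() can pick INSERT or UPDATE up front.
class StoredNameTracker
{
    private final Set<String> stored = new HashSet<>();

    // Call once for every row read by load().
    void markLoaded( String name )
    {
        stored.add( name );
    }

    // Call after removeValue() has deleted the row.
    void markRemoved( String name )
    {
        stored.remove( name );
    }

    // Returns "update" for a name that already has a row and "insert"
    // otherwise. The name is recorded as stored on the assumption that
    // the write that follows succeeds.
    String operationFor( String name )
    {
        return stored.add( name ) ? "insert" : "update";
    }

    public static void main( String[] args )
    {
        StoredNameTracker tracker = new StoredNameTracker();
        tracker.markLoaded( "cart" );
        System.out.println( tracker.operationFor( "cart" ) );
        System.out.println( tracker.operationFor( "user" ) );
    }
}
```

Because another server in the farm could write the same row between load() and save(), the insert-then-update fallback shown above is probably still worth keeping as a safety net.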

Note: For purposes of simplicity, we assume that any object placed into the session is serializable.

Identification

We’ve omitted one minor detail: identifying the session. The getId() method referenced by the prior methods must pull off a few tricks. As mentioned in the beginning of this article, most servlet engines use a cookie to handle the browser-to-session mapping. We could, and probably should, assume that our engine does the same and use the superclass implementation of getId(). But we’ll roll our own, just in case the server does something fishy that would cause the session ID to differ as the client bounces from one server to the next (thus resulting in unique session states on each server visited).

The logic presented here is simple. We create a unique session ID by taking the current system time in milliseconds and appending a random three-digit number. We add the random number in case two session IDs materialize at the exact same millisecond in time. Although this is highly improbable, Murphy’s Law guarantees that it will happen in a mission-critical application. You can use any mechanism you desire to generate the session ID; it must simply be a unique string.

Once you generate a session ID, you must store it in a cookie so that the browser remembers it. That is one of the reasons the SessionWrapper needs a reference to the ResponseWrapper. But that session might have already been accessed and, thus, could already have an ID, so we must check the existing cookies. We’ve used a utility class to cover the code bloat of looping through each cookie and comparing their names:

    private static final String param = "rudiment.session";
    private static final Random random = new Random();
    private String sessionID;
    public String getId()
    {
        if( null == this.sessionID )
        {
            this.sessionID = Util.getCookieValue( _request, param );
            if( null == this.sessionID )
            {
                this.sessionID = (
                    ( System.currentTimeMillis() * 100 ) +
                    "." +
                    ( 100 + random.nextInt( 900 ) ) ); // always three digits: 100..999
                if( _response != null )
                {
                    _response.addCookie( new Cookie( param, this.sessionID ) );
                }
            }
        }
        return( this.sessionID );
    }
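Since any mechanism that yields a unique string will do, it may help to see the ID generation in isolation. The sketch below shows two options under hypothetical names: timeBasedId() follows the prose recipe (milliseconds plus three random digits), and randomId() uses java.util.UUID as an alternative that does not depend on the clock.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical ID generators; neither is taken from RSEF itself.
class SessionIds
{
    private static final SecureRandom RANDOM = new SecureRandom();

    // Millisecond timestamp, a dot, then a random number from 100 to 999.
    static String timeBasedId()
    {
        return System.currentTimeMillis() + "." + ( 100 + RANDOM.nextInt( 900 ) );
    }

    // A random UUID rendered as text.
    static String randomId()
    {
        return UUID.randomUUID().toString();
    }

    public static void main( String[] args )
    {
        System.out.println( timeBasedId() );
        System.out.println( randomId() );
    }
}
```

A UUID string is 36 characters long, so either form fits the CHAR(50) SESSION_ID column. The time-based form is easy to guess, which may matter if the ID is the only thing protecting a user’s session.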

With storing and retrieving out of the way, the next logical step is cleaning up. The session interface provides two methods for this purpose: removeValue() deletes a single data pair and invalidate() wipes the entire session.

Our version of removeValue() asks the superclass to remove its in-memory reference and then immediately deletes the database record:

    public void removeValue( String name )
    {
        super.removeValue( name );
        JdbcFacade util = null;
        try
        {
            util = this._factory.getInstance();
            util.setSQL( "delete from session_master " +
                         "where session_id = ? and name = ?" );
            util.setString( 1, getId() );
            util.setString( 2, name );
            util.executeUpdate();
        }
        catch( SQLException e )
        {
            System.err.println( e );
        }
        finally
        {
            if( util != null )
            {
                util.close();
            }
        }
    }

The invalidate() method completes the puzzle. That method tells the session to simply flush itself; it forgets everything. We could simply iterate over all the data and call removeValue() for each, but that would create excess database traffic. Our version of the method must tell the superclass to invalidate and then clean up the database in one fell swoop:

    public void invalidate()
    {
        super.invalidate();
        JdbcFacade util = null;
        try
        {
            util = this._factory.getInstance();
            util.setSQL( "delete from session_master where session_id = ?" );
            util.setString( 1, getId() );
            util.executeUpdate();
        }
        catch( SQLException e )
        {
            System.err.println( e );
        }
        finally
        {
            if( util != null )
            {
                util.close();
            }
        }
        if( _response != null )
        {
            Cookie cookie = new Cookie( param, "" );
            cookie.setMaxAge( 0 ); // zero tells the browser to delete the cookie
            _response.addCookie( cookie );
        }
    }

Notice that we also wipe the cookie that stores the session ID. This might not seem necessary since the session is now empty, but the database delete might have failed. By forgetting the session ID, we ensure that the next request will receive a fresh session.

Conclusion

Database-stored sessions present a shining example of the power and versatility of the RSEF. You now have the power to control your session management, which is no longer at the whim of another vendor’s engine. Next month we’ll explain a tricky little problem with cookies and demonstrate how the RSEF comes to the rescue once again.

Thomas E. Davis is a Sun Certified Java Developer and the chief technology officer of his second successful Internet-related company. In addition to being a Java advocate, Thomas is a strong proponent of the extreme programming, design patterns, and refactoring philosophies, which he preaches to his colleagues and supports within his own department. Thomas welcomes constructive criticism and intelligent comments in relation to his articles.

Craig Walker is a Sun Certified Java Programmer (awaiting the results of his Developer exam) and senior software engineer. A former consultant to IBM, working on its internal Java initiatives, Craig has pursued the mastery of many Java APIs ranging from the presentation layer to the enterprise and everything in between.

Source: www.infoworld.com