Should you go with JMS?
Why JMS isn’t always the best solution for distributed system development
Distributed system development is growing rapidly as software developers build systems that must keep up with the ever-increasing requirements imposed by e-business. Never before, however, has the design and implementation of a message-processing layer within a distributed system been as complex as it is today. This is mostly due to the dramatic increase in potential functionality enabled by standards like Java Message Service (JMS), which connect many vendors' technologies in a single system. In addition, the proliferation of the Internet has given rise to new, expansive user bases and has made several protocols available for communication within a distributed system. Such protocols include CORBA IIOP (Internet Inter-ORB Protocol), Microsoft DCOM (Distributed Component Object Model), and Java RMI (Remote Method Invocation).
The natural evolution of these protocols has led to the introduction of message-oriented middleware (MOM), which allows for looser coupling within distributed systems by abstracting translation, security, and the underlying communications protocols from clients and servers. Middleware solutions include SOAP (Simple Object Access Protocol) and JMS. Proprietary, middle-layer transaction processing has existed since the early days of COBOL (Common Business Oriented Language), but it wasn’t very complex because of early messaging technologies’ limitations.
With the advent of standards like JMS, developers can now connect numerous technologies. Distributed-system design decisions are more difficult, and their implications on data integrity and distribution are critical to system success or failure.
A pervasive and tacit assumption is that introducing a technology is an asset, while its liabilities are often ignored. Failing to account for those liabilities often results in a system that is unnecessarily complicated, over-engineered, or both. A basic understanding of JMS and its inherent (system-independent) qualities, followed by a careful analysis against specific distributed-system scenarios, can indicate how well JMS might satisfy system requirements versus merely shifting existing problems or even introducing new ones.
JMS overview
JMS, introduced by Sun Microsystems in 1999 as part of the Java 2 Platform, Enterprise Edition (J2EE) specification, is a set of standards that describes the foundations for a message-processing middleware layer. JMS allows systems to communicate synchronously or asynchronously via both point-to-point and publish-subscribe models. Today, several vendors, including BEA Systems, Hewlett-Packard, IBM, Macromedia, and Oracle, provide JMS implementations, thereby allowing JMS-based systems to interoperate with multiple vendor technologies.
Figure 1 shows a simple JMS-based system with an outgoing queue populated with messages for clients to process, and an incoming queue, which collects the client processing results for insertion into a database.
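To make the queue interaction in Figure 1 concrete, here is a minimal sketch of one client written against the standard javax.jms API. The JNDI names ("ConnectionFactory", "queue/outgoing", "queue/incoming") and the process() placeholder are assumptions; the actual names and the encoding work depend on your vendor's configuration and your system.

```java
import javax.jms.*;
import javax.naming.InitialContext;

// Minimal sketch of a Figure 1 client: pull a job from the outgoing queue,
// process it, and report the result on the incoming queue.
public class EncodingWorker {
    public static void main(String[] args) throws Exception {
        InitialContext jndi = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) jndi.lookup("ConnectionFactory");
        Queue outgoing = (Queue) jndi.lookup("queue/outgoing");   // jobs to process
        Queue incoming = (Queue) jndi.lookup("queue/incoming");   // processing results

        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(outgoing);
        MessageProducer producer = session.createProducer(incoming);
        connection.start();

        while (true) {
            // Block until a job message arrives on the outgoing queue.
            TextMessage job = (TextMessage) consumer.receive();
            String result = process(job.getText());
            // Report the result back for insertion into the database.
            producer.send(session.createTextMessage(result));
        }
    }

    // Placeholder for the real (CPU-intensive) transformation.
    private static String process(String jobDescription) {
        return "SUCCESS: " + jobDescription;
    }
}
```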
As mentioned above, MOM (like JMS) allows looser coupling within distributed systems by abstracting translation, security, and the underlying communications protocols from the clients and servers. One of the message-processing layer’s main assets is that, because it introduces this abstraction layer, the implementation of either the client or server can change, sometimes radically, without affecting other system components.
Two specific scenarios
In this section, I present two distributed systems that are potential candidates for JMS and explain each system’s goals and why the systems are JMS candidates.
Scenario 1
The first candidate is a distributed encoding system (shown in Figure 2). This system has a set of N clients that retrieve encoding jobs from a central database server. The clients then execute the actual transformation (encoding) from digital master to encoded files, and finish by reporting their post-processing status (e.g., success/failed) back to the central database server.
The types of encoding (e.g., text, audio, or video) or transformations (e.g., .pdf to .xml, .wav to .mp3, .avi to .qt) do not matter. It is important to understand that encoding is CPU-intensive and requires distributed processing across multiple clients to scale.
At a glance, this system is a potential JMS candidate because:
- Processing must be distributed as it is extremely processor (CPU) intensive
- It may be problematic, from a system performance standpoint, to connect multiple clients directly to a single database server
Scenario 2
The second JMS candidate system is a global registration system for an Internet portal. Global registration handles requests for new user creation (registration), login, and authentication.
Specific registration information (e.g., name, address, favorite color) and user-authentication methods (e.g., server-side user objects, HTTP cookies) are unimportant. However, it is important that this system scale to handle millions of users, whose usage patterns are difficult, if not impossible, to predict. (During a televised ESPN World Cup game the announcer says, “Log in and vote in our online poll. We’ll present the results at the end of the show.” All of a sudden, 500,000 users log in within a three-minute interval: 500,000 logins / 180 seconds ≈ 2,778 logins per second.)
This system is a potential JMS candidate for the following reasons:
- The system must be distributed to scale the transaction volume
- Transactions (e.g., login) are atomic and stateless, and are therefore candidates for distribution
The two systems are architecturally alike. Several client machines extract data from a central database server (possibly replicated out to M read-only database servers), execute some logic on the client, and then report the status back to the central database server. One system delivers encoded files to a filesystem over UNC/FTP; the other delivers HTML content to Web browsers over HTTP. Both systems are distributed.
This is as far as many engineers go with their analyses before applying JMS. In the rest of this article, I explain that, although these systems share many characteristics, the appropriateness of JMS becomes clearer and more divergent as we examine each system’s requirements, including system performance, data distribution, and scalability.
System analysis: To integrate or not to integrate
JMS has intrinsic, system-independent qualities. Some of these qualities (pros denoted by +, cons denoted by -) that apply to both systems include:
- (+) JMS is a set of standards with multiple vendor implementations; therefore, you avoid the dreaded vendor lock-in problem.
- (+) JMS allows for abstraction (via a generic API) between client and server; you can change a database schema or platform without changing the application layer (implicit here are other potential system changes, isolated from one another by the messaging layer).
- (+)/(-) JMS can help a system scale (a pro). The con is that any system that scales with JMS can scale without it.
- (-) JMS is complicated. It’s an entirely new layer with a new set of servers. Software rollout management, server monitoring, and security are just a few of the nontrivial problems associated with a JMS rollout. Costs should also be considered.
- (-) Vendors do not always interpret and therefore implement standards exactly the same way, so differences exist between various implementations.
- (-) With JMS, you need more system checks and balances. You not only introduce a new layer, you also introduce asynchronous data distribution and acknowledgement, which has the added complexity of asynchronous notification.
- (-) No reporting, updating, or monitoring of message queues without custom software.
JMS also has system-dependent qualities. The appropriateness of JMS depends on how well these qualities map to the problem set you’re trying to solve. Some of these qualities and how they relate to the two systems of interest follow:
Caching
Caching is a primary consideration for capacity planning within any distributed system. JMS has many features that allow it to be used as a caching technology (mainly that it is distributed, can operate synchronously or asynchronously, and exchanges data as objects within messages). Therefore, an existing JMS installation can be leveraged as a caching infrastructure if required.
In the encoding system, caching is generally not useful for increasing overall system performance: most file transformations execute once and then move to a hosting facility or SAN (storage area network), and there is little content overlap between customers. Global registration, by contrast, is a prime candidate for a user-information cache, as users typically log in, browse, and then log out. Login creates a user’s cache entry, and that object provides subsequent authentication while the user is on the site.
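To illustrate the registration cache, here is a minimal sketch of a login-populated, in-memory user cache. The UserCache class, its session-token keys, and the UserInfo type are hypothetical; a production cache would also need entry expiration and, likely, replication across servers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch: an entry is created at login and consulted for subsequent
// authentication checks, so later requests hit memory instead of the database.
public class UserCache {
    private final Map<String, UserInfo> cache = new ConcurrentHashMap<String, UserInfo>();

    public void onLogin(String sessionToken, UserInfo user) {
        cache.put(sessionToken, user);          // populate the cache when the user logs in
    }

    public boolean isAuthenticated(String sessionToken) {
        return cache.containsKey(sessionToken); // authenticate from the cache while on the site
    }

    public void onLogout(String sessionToken) {
        cache.remove(sessionToken);             // evict the entry when the user logs out
    }

    public static class UserInfo {
        public final String name;
        public UserInfo(String name) { this.name = name; }
    }
}
```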
Processing order
Within the global registration system, there is no scheduling and/or order for transaction processing. Pseudo-random users enter the system at pseudo-random intervals upon login, browse content (and are therefore authenticated when they access restricted content and/or applications), and then log out.
Within the encoding system, processing is ordered. Content is batched into groups for delivery, depending on the availability of removable storage (e.g., DLT Solutions or Network Appliance storage). Content is not delivered until the batch is complete, so batches must execute in order (although transforms within a batch can be unordered). Implementing priority queues within JMS to preserve processing order is possible, but maintaining the order of message batches across multiple JMS servers and multiple queues becomes quite complicated. A relational database server with transaction support is a more suitable technology for managing this workflow.
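For reference, per-message priority in JMS looks roughly like the following sketch (priorities range from 0 to 9, with 9 the highest). Note that priority only orders messages within a single queue; it does nothing to coordinate batch order across multiple JMS servers, which is the hard part described above.

```java
import javax.jms.*;

// Minimal sketch of a per-send priority override in JMS.
public class PrioritizedSend {
    public static void send(Session session, Queue queue, String payload, int priority)
            throws JMSException {
        MessageProducer producer = session.createProducer(queue);
        TextMessage message = session.createTextMessage(payload);
        // deliveryMode, priority, and timeToLive override the producer defaults for this send.
        producer.send(message, DeliveryMode.PERSISTENT, priority, 0L);
        producer.close();
    }
}
```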
Security
Security is not part of the JMS specification. The security problem is not necessarily changed with a JMS-based implementation (if you have a security requirement pre-JMS, you will have a similar security requirement post-JMS). Knowing this, it’s important to understand how JMS might relate to existing infrastructure security.
In general, the more technology you use, the more vulnerable your system becomes to hackers and security violations. Because the global registration application server is Web-facing, security flaws discovered in your vendor’s JMS implementation and published in Internet newsgroups quickly become security liabilities for your site. Also, because JMS is a generic API, it is more prone to security breaches than a proprietary system that uses an unpublished API.
While you can leverage your existing firewall and IP-based network security to protect your back-end (read: not Web-facing—pun intended) application and database servers from security violations, there is a significant security risk created by exposing JMS application servers directly to the Internet.
The encoding system generally resides on a single network, one that is also isolated from the Internet. So there is nothing inherent in this system’s network topology that relates to JMS or that could be leveraged to provide security (and the encoding system has far fewer security requirements, as it is not Web-facing).
Scalability
Because the global registration system is subject to the whims of a large and capriciously clicking user base, the system’s scalability requirements warrant JMS. JMS will not only help the system scale, it will also queue transactions during surges, although queuing alone won’t be much help when user requests truly flood the system.
Because the distributed encoding system has carefully regulated data traffic (as it’s presumably a self-contained system), the system’s scalability requirements are not as formidable. For distributed encoding, you can connect your O[100] clients directly to your database and throttle their traffic to balance encoding throughput with database server performance.
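A minimal sketch of such a direct-to-database encoding client follows, assuming a hypothetical encoding_jobs table and JDBC URL; the sleep call is a crude stand-in for the traffic throttling mentioned above, and a real implementation would claim jobs atomically (for example, via a stored procedure) so two clients never grab the same one.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Minimal sketch: each encoding client pulls work straight from the database,
// encodes it, and reports status back, with a simple back-off to throttle traffic.
public class DirectDbWorker {
    public static void main(String[] args) throws Exception {
        Connection db = DriverManager.getConnection(
                "jdbc:subprotocol://dbserver/encoding", "user", "password"); // hypothetical URL
        while (true) {
            int jobId = claimNextJob(db);
            if (jobId < 0) {
                Thread.sleep(5000);                 // no work: back off to spare the server
                continue;
            }
            boolean ok = encode(jobId);             // CPU-intensive transformation happens here
            reportStatus(db, jobId, ok);
        }
    }

    static int claimNextJob(Connection db) throws SQLException {
        PreparedStatement select = db.prepareStatement(
                "SELECT id FROM encoding_jobs WHERE status = 'PENDING'");
        ResultSet rs = select.executeQuery();
        if (!rs.next()) return -1;                  // nothing pending
        int id = rs.getInt(1);
        PreparedStatement update = db.prepareStatement(
                "UPDATE encoding_jobs SET status = 'CLAIMED' WHERE id = ?");
        update.setInt(1, id);
        update.executeUpdate();
        return id;
    }

    static boolean encode(int jobId) { return true; } // placeholder for the real encoder

    static void reportStatus(Connection db, int jobId, boolean ok) throws SQLException {
        PreparedStatement update = db.prepareStatement(
                "UPDATE encoding_jobs SET status = ? WHERE id = ?");
        update.setString(1, ok ? "SUCCESS" : "FAILED");
        update.setInt(2, jobId);
        update.executeUpdate();
    }
}
```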
Performance
The introduction of a single JMS server can change performance issues rather than solve them. For this reason, a JMS system should be designed with multiple JMS servers (and therefore multiple queues). Figure 4 shows why performance problems are altered instead of solved. It illustrates the processing layers required for a generic data server to respond to client-connection requests:
Data exchange between client and server is a two-part process, whether this is a client-to-database or client-to-JMS server:
- Data access
- Thread and socket management, pooling, and caching
A JMS server and a database server look exactly the same in this respect (Figure 4): both handle socket connections, thread management, and access to the server’s data.
With only one JMS server, potential performance problems simply shift from the database server to the JMS server. In addition to the possible performance degradation associated with context switching within your database server, performance problems may now be greater due to JVM performance issues within your JMS server.
A single JMS server adds significant complexity to your system and might also introduce performance problems related to connecting multiple clients to a single server. The impact of multiple JMS servers on your system design and data flow can mean the difference between a successful and a failed system rollout.
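If you do deploy multiple JMS servers, one simple way to spread traffic is to rotate sends across producers bound to queues on different servers, as in the following sketch. The producers array is assumed to have been created elsewhere against separate connections; a real design would also handle failover.

```java
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageProducer;

// Minimal sketch: round-robin sends across several queues, each of which
// could be hosted on a different JMS server, so no single server is a hot spot.
public class RoundRobinSender {
    private final MessageProducer[] producers;
    private int next = 0;

    public RoundRobinSender(MessageProducer[] producers) {
        this.producers = producers;
    }

    public synchronized void send(Message message) throws JMSException {
        producers[next].send(message);          // pick the next producer in turn
        next = (next + 1) % producers.length;   // rotate for the following send
    }
}
```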
In summary, the features and their potential JMS impact compare as follows:
[Table: Features versus JMS impact]
JMS alternatives
If your distributed system does not warrant a complex JMS solution, as is the case for the distributed encoding system, an alternate data distribution (and data collection) strategy is required. Following is one alternative for data distribution within a distributed system. By data distribution, I mean:
- Transferring data from a central database server to N clients
- Collecting data from N clients and storing it in a central database server
- Reconciling messages with acknowledgements
Figure 5 shows a two-database (master and secondary) system, fronted by one or more Web servers containing a single ISAPI/NSAPI (Internet Server API/Netscape API) module written in either C or C++. The clients communicate with the Web servers using XML over HTTP. The master database server maintains aggregate data; the secondary database collects raw data for aggregation.
The ISAPI/NSAPI modules maintain a pool of ODBC (Open Database Connectivity) connections to the databases, along with numerous threads awaiting HTTP requests delivered via the Web servers. Data reads are cached in memory within the ISAPI/NSAPI environment, with concurrency handled using mutexes/critical sections. Data writes are also cached in memory and periodically dumped to local disk (on the Web server) in BCP format. Clients connect to the Web servers over HTTP and request data (or write data) using XML (possibly with a multipart form post).
Data writes can be synchronous (HTTP POST -> SQL stored procedure INSERT -> HTTP 200 ACK returned to client) or asynchronous (HTTP POST -> in-memory cache -> HTTP 200 returned to client, with the cache later dumped to disk for BCP insert into the data collection server).
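From the client’s perspective, the exchange reduces to posting XML and treating the HTTP 200 response as the acknowledgement. The following is a minimal Java sketch; the URL and XML payload shape are hypothetical, and the server side would be the ISAPI/NSAPI module described above.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Minimal sketch of a client posting an XML status report over HTTP;
// an HTTP 200 response serves as the ACK in the synchronous case above.
public class XmlOverHttpClient {
    public static boolean reportStatus(String xml) throws Exception {
        URL url = new URL("http://webserver/collect");         // hypothetical collection endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/xml");
        OutputStream out = conn.getOutputStream();
        out.write(xml.getBytes("UTF-8"));
        out.close();
        return conn.getResponseCode() == 200;                  // 200 acts as the acknowledgement
    }

    public static void main(String[] args) throws Exception {
        boolean acked = reportStatus("<status job=\"42\" result=\"success\"/>");
        System.out.println("Server acknowledged: " + acked);
    }
}
```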
Note that this strategy admits innumerable permutations, such as replicating the database to a set of read-only databases, or adding one set of Web servers for data reads and a second set for data writes.
This strategy has the following pros and cons that map quite closely to the evaluation criteria for JMS-based analysis:
Pros:
- NSAPI/ISAPI modules are quickly written in native C/C++
- The system leverages existing Web server software for port, socket, and thread management to improve performance (e.g., ability to maintain 50,000 simultaneous connections)
- ISAPI/NSAPI modules are relatively simple and lightweight (not very much code)
- Throttles traffic out of database, as connection pool is configurable
- Maintains many records in memory (O[100k..1m]), not limited by JVM memory
- Easy to make data access thread safe using mutexes, semaphores, and critical sections
- Uses HTTP, so any client can connect, and client/server implementation is decoupled
- Simpler rollout and configuration than JMS servers
- Uses existing database connectivity technologies (ODBC)
- ODBC is relatively fast (it’s native)
- Asynchronous data collection for Internet traffic surges
- Choice of either periodic data writes to disk in BCP format for bulk insert into the database, or single inserts and/or groups of inserts over ODBC (single inserts are O[500/sec] or more; bulk inserts are O[10k/sec] or more)
- Data queuing limited by local disk space, not RAM
- If the same module is used for both data distribution and data collection, the asynchronous gap narrows: the module can report database server status to clients in real time (e.g., shut down services, stop collecting data, accumulate data to secondary storage)
Cons:
- Custom code means development time
- C++ is generally more difficult to write than Java
- Single point of failure (in a single Web server scenario)
- No reporting or updating of data cached within the ISAPI/NSAPI module without custom code
- Messaging between client and server is asynchronous (this is possibly desired)
- Messages are not persisted in a relational database server that supports transactions
Only-when-necessary rule of thumb
JMS can be a real win in terms of scalability, caching, and avoiding vendor lock-in. How JMS relates to performance, security, processing order, and data integrity is more specific to the system and/or application. One thing is well understood: integrating JMS into a distributed system is complicated.
Avoid JMS unless its benefits far outweigh its liabilities. In making this determination, your analysis should consider both system-dependent and system-independent qualities. If that analysis suggests your system would benefit from the caching, performance, and scalability of multiple JMS servers, JMS may be appropriate. However, many simpler alternatives can provide the requisite layer of abstraction between client and server, take advantage of HTTP and XML, and still offer the desired scalability and synchronous/asynchronous communication.
Understanding JMS and message-processing basics, coupled with thorough system analysis, allows Java developers to evaluate the trade-offs associated with implementing JMS or eliminating the message-processing layer altogether. Once developers recognize these trade-offs, they can make informed decisions about the design and architecture of their distributed systems.
Author’s note: This article was written with help of Darrell Cavens, CTO of BlueNile.com, who provided many insightful and invaluable comments and suggestions.