The autonomic puzzle

IBM’s autonomic enterprise is missing several important pieces

WHEN IBM REBRANDED eLiza last month, she was given a cute send-off. “In a self-configuring transformation of historical proportions,” the announcement read, “Project eLiza, of IBM self-managing IT infrastructure fame, is now known as the IBM autonomic computing initiative.”

In the flurry of white papers that swirled around this event, the four mantras of IBM’s autonomic vision — self-configuring, self-healing, self-optimizing, self-protecting — were used everywhere, consistently. But it was hard to avoid concluding that “autonomic” for IBM has become what “.Net” is for Microsoft: an umbrella marketing term that encompasses everything and nothing in particular.

Is “autonomic” just a label for good ideas and best practices that have been floating around for a long time in both IBM and non-IBM products? Are IBM DB2’s self-tuning features notably different from those in other enterprise databases? We put these questions to Alan Ganek, IBM’s vice president for autonomic computing.

According to Ganek, IBM has defined a set of architectural principles for autonomic computing. “We define the attributes of autonomic managers for resource elements,” he says, “and we lay out the notion of sensors and effectors for each of those elements, and the reference model for monitoring, analyzing, planning, and executing change.” The lingo is deliberately biological: sensors, effectors, and feedback loops are the tools used by the autonomic nervous system to maintain homeostasis, or dynamic equilibrium. Using these same tools to keep computers and networks healthy is a great idea. And like many great ideas, it’s been around for a while.
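
To make that reference model concrete, here is a minimal sketch, in Java, of one such feedback loop. The Sensor and Effector interfaces and the proportional-adjustment rule are our invention, not IBM's architecture; they simply show how monitoring, analysis, planning, and execution chain together around a managed element.

```java
// A minimal sketch of the monitor/analyze/plan/execute loop Ganek
// describes. The interfaces and names are hypothetical, not IBM APIs.
interface Sensor {
    double read();                 // e.g. CPU load, queue depth, throughput
}

interface Effector {
    void apply(double setting);    // e.g. thread-pool size, cache size
}

class AutonomicManager {
    private final Sensor sensor;
    private final Effector effector;
    private final double target;   // desired value of the sensed metric
    private double setting = 1.0;

    AutonomicManager(Sensor s, Effector e, double target) {
        this.sensor = s; this.effector = e; this.target = target;
    }

    void runOnce() {
        double observed = sensor.read();          // monitor
        double error = target - observed;         // analyze
        double next = setting + 0.1 * error;      // plan: proportional step
        setting = Math.max(0.0, next);
        effector.apply(setting);                  // execute
    }
}
```

The proportional step is the simplest possible planner; the point is only the shape of the loop, which is what keeps the managed element in the dynamic equilibrium the biological metaphor promises.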

“This is a journey,” admits Michael Zisman, general manager of IBM’s storage software unit, “and it didn’t start last year when we began using the word autonomic.” The current marketing push signals a concentrated effort to change the slope of the curve. He cites the Linux-based virtualization engine in the forthcoming StorageTank product, which insulates the system administrator from the details of physical storage, as an enabler of self-configuring, self-healing capabilities. Ganek cites DB2’s new Configuration Advisor, which analyzes its environment and, he says, delivers recommendations that rival those of IBM’s best human experts, and can double the throughput some customer DBAs achieve on their own.
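
IBM hasn't published the advisor's internals, but the flavor of such a tool is straightforward: inspect the environment, then derive settings from encoded rules of thumb. A hypothetical sketch, with invented names and heuristics:

```java
// Hypothetical flavor of an environment-driven configuration advisor;
// the heuristics and names are illustrative, not DB2's actual logic.
import java.util.LinkedHashMap;
import java.util.Map;

class ConfigAdvisor {
    static Map<String, Long> recommend(long physicalMemBytes,
                                       int cpus,
                                       boolean oltpWorkload) {
        Map<String, Long> settings = new LinkedHashMap<>();
        // Rule of thumb: give the buffer pool a fixed share of memory,
        // more for OLTP, where random I/O dominates.
        long share = oltpWorkload ? 50 : 35; // percent
        settings.put("bufferPoolBytes", physicalMemBytes * share / 100);
        // Scale I/O parallelism with available processors.
        settings.put("prefetchers", (long) Math.max(2, cpus));
        settings.put("pageCleaners", (long) Math.max(1, cpus / 2));
        return settings;
    }
}
```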

Other IBM brands are singing in the same choir. The Tivoli Risk Manager uses a DB2-like expert-system model to correlate intrusions and recommend how to handle them. And WebSphere 5.0, expected to be announced today, will include self-configuring and self-tuning features.
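
Correlation in this expert-system style amounts to matching event streams against rules. Here's a toy illustration, with invented event types and thresholds, of the kind of rule such an engine might encode:

```java
// Toy event correlation in the expert-system style: count suspicious
// events per source and flag bursts. Types and thresholds are invented.
import java.util.HashMap;
import java.util.Map;

class IntrusionCorrelator {
    private final Map<String, Integer> failedLoginsBySource = new HashMap<>();
    private static final int THRESHOLD = 5;

    // Returns a recommendation when a rule fires, null otherwise.
    String observe(String sourceIp, String eventType) {
        if (!"FAILED_LOGIN".equals(eventType)) return null;
        int count = failedLoginsBySource.merge(sourceIp, 1, Integer::sum);
        if (count >= THRESHOLD) {
            failedLoginsBySource.remove(sourceIp);
            return "Block " + sourceIp + ": " + count + " failed logins";
        }
        return null;
    }
}
```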

It’s a journey, not a destination, we agree. But along the way, we wonder whether IBM is creating more than a methodology of sensors, effectors, and control loops. Is the Intelligent Resource Director for IBM’s zSeries mainframes, for example, based on a reusable expert-systems engine that also drives the autonomic features of DB2, Tivoli, WebSphere, and StorageTank?

Apparently not. Some of the correlation engines may have broader applicability, Ganek says, but for now, they’re specialized and domain-specific. “Look, 20 percent of this is about advanced cognitive kinds of things,” he says, “and 80 percent is the infrastructure. If you don’t have good instrumentation, if you don’t know what the system is doing, it’s hard to ask some analytic engine to tell you what to do about it.”

In other words, there may well be some hot expert-systems technology floating around IBM’s Almaden Research Center, but the company for now is focusing on the basics. So let’s take a specific example: Linux. IBM’s newest strategic operating system would surely be a better autonomic citizen if the ways to use and extend its instrumentation were better defined. It’s true that when virtualized on the mainframe, Linux inherits the host’s intelligent workload management capability. But Linux’s own approach to instrumentation lacks consistency.
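
The inconsistency is easy to demonstrate. Much of Linux's instrumentation lives in /proc as ad hoc text files, each with its own one-off format, so every "sensor" ends up screen-scraping:

```java
// Illustration of Linux's ad hoc instrumentation: each /proc file has
// its own one-off text format that a sensor must screen-scrape.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class ProcSensors {
    // /proc/loadavg looks like: "0.42 0.31 0.25 1/123 4567"
    static double oneMinuteLoad() throws IOException {
        String line = Files.readAllLines(Paths.get("/proc/loadavg")).get(0);
        return Double.parseDouble(line.split("\\s+")[0]);
    }

    // /proc/meminfo looks like: "MemFree:   123456 kB" -- a different
    // format entirely, needing a different parser.
    static long freeMemoryKb() throws IOException {
        for (String line : Files.readAllLines(Paths.get("/proc/meminfo"))) {
            if (line.startsWith("MemFree:")) {
                return Long.parseLong(line.replaceAll("\\D+", ""));
            }
        }
        throw new IOException("MemFree not found in /proc/meminfo");
    }
}
```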

Steve Mills, senior vice president of IBM’s software group, agrees that’s a problem. He thinks that a standard logging format is a crucial foundation for autonomic computing. “Make all the log schemas consistent,” he says, “and you can have common tracing and debugging, you can create probes, you can have first-failure data capture.” In an open-standards world, says Mills, it’s hard to push for this kind of consistency. Nevertheless, he says, “there are interest groups, there is a dialog, and we intend to spark it with implementations we can move into standards bodies.”
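
The payoff Mills describes is easy to illustrate. If every component emits records of the same shape, cross-product correlation becomes a query rather than a parsing project. Here's a minimal sketch of such a record; the fields are our guess at a useful core, not any published IBM schema:

```java
// A minimal common log record, sketched to show why a shared schema
// enables cross-product tracing and first-failure data capture.
// The fields are our guess at a useful core, not a published standard.
import java.time.Instant;

record LogRecord(
        Instant timestamp,
        String component,     // e.g. "DB2", "WebSphere", "Tivoli"
        String severity,      // e.g. "INFO", "ERROR"
        String situation,     // normalized event type, not free text
        String correlationId, // ties records from one request together
        String message) {

    // Uniform rendering means one parser serves every product's logs.
    String toLine() {
        return String.join("|", timestamp.toString(), component,
                severity, situation, correlationId, message);
    }
}
```

With a shared correlationId field, a probe can follow one failing request across every product's logs, which is exactly the first-failure data capture Mills wants.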

Good! Of course it’s more fun to write press releases about exciting new autonomic features than about boring old log file formats. But garbage in, garbage out. Quality instrumentation is the name of the game. We’d like to see IBM spell out how it plans to help enrich several pieces of strategic infrastructure so that layered products can participate more fully in the autonomic vision. Linux is one of those pieces. The Web services stack is another.

In the last six months, the notion of loosely coupled Web services has taken hold. This design pattern offers major benefits, but it exacts a price in auditing and control. “Tell an IT guy you have a hard time with auditability and control,” says Mills, when asked about peer-to-peer technology, “and he’ll see people showing up in handcuffs on TV.”

Routing would seem to be a crucial ingredient of autonomic Web services. Microsoft’s WS-Routing and WS-Referral suggest one approach, but so far IBM hasn’t tossed its hat into that ring. When asked how the autonomic vision will play out in terms of Web services, IBM points to its grid computing effort. “With a lot of leadership,” says Ganek, “IBM moved the grid community onto a much more productive paradigm with Web services interfaces.” The forthcoming Version 3.0 of the Globus Toolkit, an open-source grid middleware project, will be a reference implementation of the OGSA (Open Grid Services Architecture). With OGSA, grid resources can be described using WSDL and encapsulated in J2EE containers. By taking a services-oriented approach to building “a meta-operating-system environment,” says Ganek, every operating system can participate.
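
Reduced to a sketch, the services-oriented idea is that every grid resource, whatever operating system hosts it, answers through one uniform interface. The following Java interface is our illustration of that pattern, not the actual OGSA definitions:

```java
// Hypothetical sketch of the service-oriented idea behind OGSA:
// every grid resource, whatever its OS, is reached through a uniform
// service interface (described in WSDL, hosted in a J2EE container).
// These names are illustrative, not the real specification.
interface GridResource {
    String describe();                  // WSDL-like self-description
    boolean reserve(int cpus, long memBytes);
    String submit(String jobSpec);      // returns a job handle
    void release();
}
```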

The major challenge in the design of the grid operating system, Mills says, is scheduling. For example, how does a “superscheduler” interact with a population of Linux, Solaris, and Windows servers? In the grid environment, autonomic features such as recovery and data migration will await an answer to what is still a research problem. “We’re not ready to tell UPS to move the world’s largest transaction processing environment onto a grid,” Mills says. He is equally frank about Web services. “It’s a declarative technology that helps you connect, not transact,” he says. “So when someone says the answer is Web services, my response is, ‘You don’t know what you’re talking about.’ ”
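
Even a naive superscheduler makes the difficulty visible: greedy placement across heterogeneous pools is easy to write, and everything that matters (co-allocation, reservations, failure recovery, data locality) is what it leaves out. A toy sketch, with invented names:

```java
// Toy "superscheduler": dispatch a job to whichever heterogeneous
// pool reports the most spare capacity. Real superscheduling must also
// handle co-allocation, reservations, failures, and data placement --
// the open research problems Mills alludes to. Names are invented.
import java.util.List;

class SuperScheduler {
    record Pool(String os, int freeSlots) {}  // e.g. Linux, Solaris, Windows

    // Greedy placement: most free slots wins; null if nothing fits.
    static Pool place(List<Pool> pools, int slotsNeeded) {
        Pool best = null;
        for (Pool p : pools) {
            if (p.freeSlots() >= slotsNeeded
                    && (best == null || p.freeSlots() > best.freeSlots())) {
                best = p;
            }
        }
        return best;
    }
}
```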

We appreciate the candor. Transactions are where the rubber meets the road. There’s undoubtedly real value to be found at the intersection of grid computing, Web services, and autonomic computing. And there is arguably no company better positioned than IBM to deliver that value in a robust commercial form.

But the details remain sketchy.
