IBM bets on ‘smart’ computing
Senior VP Steve Mills discusses IBM’s grid and autonomic computing initiatives
STEVE MILLS, IBM’S senior vice president in charge of the company’s software solutions group, is responsible for shaping IBM’s overall software strategy. Mills met with InfoWorld Test Center Director Steve Gillmor, News Editor Mark Jones, Editor at Large Ed Scannell, Lead Analyst Jon Udell, Technical Director Tom Yager, and Senior Editor Tom Sullivan to discuss IBM’s grid computing and autonomic capabilities strategies.
InfoWorld: Can you give us a road map for how IBM might implement grids on a broad-based basis?
InfoWorld: Do you see peer-to-peer as a way to get into grid computing?
Mills: I don’t really like to relate it to peer-to-peer because I think peer-to-peer carries this [negative] Napster notion. Businesses are not striving for that kind of environment. There’ve been no major Groove rollouts anywhere in the world; they’ve all been pilot projects because of the inherent problems associated with those types of very loosely connected environments in a business context. [The problem] is security control, scalability control, [and] recoverability. [It’s an] interesting technology, but in the context of today’s business concerns, as soon as you say to an IT guy [that] you’re going to have a hard time dealing with audit trail, recoverability, and security, the last thing [they] want to do is deploy a system that’s fraught with those inherent difficulties.
InfoWorld: How would you characterize the synergy between Designer and the Microsoft technology XDocs?
Mills: I view XDocs as a derivative of OLE, [where] you’re trying to just get interoperability between program elements. In the case [of XDocs], you’re trying to get linkages set up between document-centric expressions of things. We’ll see how they evolve the technology. I don’t view it as a threat to Designer. Designer is a forms-based development [environment] and it’s based on LotusScript, which is a VB-derivative technology [and] looks like VB in terms of its underlying structure. So it’s a virtual machine environment in which, from a dialog box interface, you’re crafting forms-based applications. Next year the forms-based Designer structure [will] move onto Java and we’ll deliver a Java-based forms Designer, the successor to the Domino Designer product that’s in the marketplace today. I don’t view XDocs in that same context; I view it more as a linking, embedding initiative on Microsoft’s part. It’s a logical extension of the way they think about portals and what portals are supposed to do, which is personal use against largely document-based data as opposed to transaction-based data.
InfoWorld: Scott Hebner [director of WebSphere marketing] said in response to a question about the complexity of Java development that in the new offering IBM is hiding Java from the developers. Will that extend into the Domino Designer implementation?
Mills: Yes. There are a variety of forms-based development products in the marketplace today, and PowerBuilder was one of those products. [Take] Oracle Forms: Go back far enough [and] you’ve got Easy Forms. This is a very popular paradigm, this forms-based approach, and it’s always done in a 4GL type of environment, running on top of a virtual machine infrastructure of some sort, and many of them have been proprietary over the years. This particular one will be done on Java and it’ll mask out the underlying Java syntactic structure. You’ll never see it. Obviously some programmers who have the experience and knowledge can go in and hybridize applications that are written natively in Java as well as using the Designer capability.
InfoWorld: How is the integration of Sametime and QuickPlace going? And will they merge into the Domino Notes environment or continue to have different styles?
Mills: They’re merging into the WebSphere environment. We’re going to replace the embedded session services, messaging, and file control layers that make up the bulk of the Domino code base. Beginning next year, that’s being replaced by WebSphere session control and DB2, with NSF [Notes Storage Facility] mapped natively and transparently onto the DB2 schemas. We’ll deliver [that] in the first part of next year for beta customers.
InfoWorld: How’s the work coming in DB2 to store and manipulate both structures?
Mills: Extremely well. [It] works today. DB2 has had unstructured data support in various forms since 1995. So this just continues that process of schema mapping and additional data type support within the database. We’ll finally be moving to XML as a full natively supported structure on top of the relational table architecture.
InfoWorld: Can you give us a road map for how these two technologies might merge?
Mills: Let’s separate the pieces. We are putting grid protocol support into WebSphere in the 5.x time period [and] will be doing pilot projects with customers in 2003. Grid is the topology for deployment and autonomic is a set of capabilities within the products that hopefully reflect themselves consistently in products so you could make systems more autonomic. Autonomic [computing] is all about configuration, it’s about self-diagnosis, it’s about taking corrective action, self-healing capabilities when there’s [a problem]. We put together a taxonomy for describing these capabilities. And what you see is systematically, over time, more and more capabilities appear in products that make them more autonomic-like in nature. If you decide to deploy those products in a grid environment, the attributes of the products move across to the topology. But it’s not topology unique, it’s a statement of function.
We’re in the midst of making all of our log schemas consistent across IBM. Then when you have your log schemas consistent, you [can] have common tracing, common debugging. Once you get common log schemas, then you begin to understand the interaction between products that causes failures, and start to develop a collection of [data] that allows you to create probes [and] first-failure data capture. Once the patterns become clear, you can start scripting corrective action that works on the fly whenever that event occurs. The problem with today’s systems, given the connective nature of systems, is that the events cascade through the layers of a system stack and then they cascade horizontally or across the multiple servers. So it’s a very complicated problem and you deal with it the way we dealt with it on the mainframe. It’s a systematic building up of a consistent environment that allows you to then apply the proper techniques to that environment to be able to understand what’s going on and where the failure occurred. But without consistent logging you can’t tell the origin point of a failure.
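As a rough illustration of why a shared log schema matters for finding the origin point of a failure, here is a minimal Python sketch. It is not IBM’s schema or tooling: the `LogRecord` fields, the `correlation_id` idea, and the product and server names are all assumptions made for the example. With every product writing the same record shape, the first failure in a cascade is simply the earliest error among the correlated events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One event in a hypothetical log schema shared by every product."""
    timestamp: float       # seconds, assumed to come from a synchronized clock
    product: str           # hypothetical product name
    server: str            # hypothetical host name
    correlation_id: str    # assumed ID shared by all events of one request
    severity: str          # "INFO" or "ERROR"
    message: str


def first_failure(records, correlation_id):
    """Return the earliest ERROR record for one correlated request, or None."""
    errors = [
        r for r in records
        if r.correlation_id == correlation_id and r.severity == "ERROR"
    ]
    return min(errors, key=lambda r: r.timestamp, default=None)


# Events collected from three products, in arrival order rather than time order.
records = [
    LogRecord(10.0, "web-server", "host-a", "req-1", "INFO", "request received"),
    LogRecord(10.4, "app-server", "host-b", "req-1", "ERROR", "query timed out"),
    LogRecord(10.2, "database", "host-c", "req-1", "ERROR", "lock wait exceeded"),
    LogRecord(10.5, "web-server", "host-a", "req-1", "ERROR", "HTTP 500 returned"),
]

origin = first_failure(records, "req-1")
print(origin.product, origin.message)
```

In this made-up cascade the error seen by the user comes from the web server, but the earliest correlated error comes from the database, which is the kind of origin point that cannot be identified when each product logs in its own format.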
InfoWorld: Is it necessary to drive that consistency down through the operating system?
Mills: Huge value can be derived from getting consistency at the operating system layer for those capabilities. You would like the operating system to carry with it a great deal of quality-of-service function that the rest of the system inherits. The more you can inherit from the lowest level of the product, the more reliable the system’s going to be. If the hardware has built-in features for dealing with bit errors and other kinds of related errors at the hardware level, everything running [on] the product should logically inherit those recovery features. So you want to then move it to the OS. In the open systems world it’s more challenging to move it to the OS because you can’t get everybody to agree [on] what to move to the OS. So this is going to have to have an open standards characteristic to it. We plan to push this agenda forward in the marketplace. My experience over the last 30 years has been if we start to push an agenda forward in the marketplace and it appears to provide us with business leverage, then all of our competitors will want to do something of a similar or same nature.
InfoWorld: What needs to be done within the industry to enable linking your autonomic capabilities to competitors’ autonomic capabilities?
Mills: For the system to be autonomic there’s a level of interaction between the elements that begins around things like log schemas. My view is the industry needs to move to a standard XML log schema so you can do correlation between products in a heterogeneous environment. How do you solve that problem? You could work on consistency and correlation tools and debugging tools. You could adhere to certain standard structures for being able to run traces across multiple systems [and] probe architectures so that you could in fact do more consistent event tracking across systems. There are a lot of things that could be done at the industry level.
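To make the idea of a standard XML log schema concrete, the following Python sketch parses logs from two vendors’ products and merges them into one timeline per transaction. No such industry schema is named in the interview, so the `<log>` and `<event>` elements, their attributes, and the product names are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Two logs written by different products in the same (hypothetical) XML schema.
PRODUCT_A_LOG = """<log product="product-a">
  <event time="2002-11-01T10:00:02" severity="ERROR" txn="t42">disk write failed</event>
  <event time="2002-11-01T10:00:05" severity="INFO" txn="t43">checkpoint complete</event>
</log>"""

PRODUCT_B_LOG = """<log product="product-b">
  <event time="2002-11-01T10:00:03" severity="ERROR" txn="t42">commit rejected</event>
</log>"""


def parse_events(xml_text):
    """Turn one product's XML log into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    product = root.get("product")
    return [
        {
            "product": product,
            "time": event.get("time"),
            "severity": event.get("severity"),
            "txn": event.get("txn"),
            "message": (event.text or "").strip(),
        }
        for event in root.iter("event")
    ]


def correlate(*logs):
    """Merge events from several products and group them by transaction ID."""
    # Timestamps share one ISO 8601 layout, so string order equals time order.
    merged = sorted(
        (ev for log in logs for ev in parse_events(log)),
        key=lambda ev: ev["time"],
    )
    by_txn = {}
    for ev in merged:
        by_txn.setdefault(ev["txn"], []).append(ev)
    return by_txn


timeline = correlate(PRODUCT_A_LOG, PRODUCT_B_LOG)
for ev in timeline["t42"]:
    print(ev["time"], ev["product"], ev["message"])
```

Because both products use the same element and attribute names, a single parser handles both, which is the practical benefit of agreeing on a schema in a heterogeneous environment.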
InfoWorld: Have you been trying to spark that?
Mills: There are a number of interest groups that are floating around out there. There’s a dialog taking place. We’re early on in the process and our intent is to try to spark it through coming forward with implementations that we can move into standards bodies. The early phases of this will be very focused on schema and the use of XML as a description mechanism for schemas.
InfoWorld: The OGSA (Open Grid Services Architecture) talks about a set of grid services. What services do you think are going to take hold in the enterprise?
Mills: The No. 1 issue for grid services is scheduling. This is nothing [but] a virtualized scheduler sitting above multiple operating systems, controlling work on each individual server. Is the scheduler aware of available resources [on those servers] that could be utilized? Does the server recognize the scheduler? It’s a simple problem on the one hand that’s been dealt with in other domains, but it’s never been dealt with in this fashion. Solaris has not had to adapt itself to a super scheduler sitting above it that schedules work into spaces that it controls. How do you set up your Solaris systems so that they’re aware of the super scheduler and what’s the Solaris operating system going to permit the super scheduler to do in light of task scheduling against what’s running on that system today? These computational grids are flat in nature; there’s very little running. You’re not doing a lot with the OS when you’re running floating-point calculations at the maximum capacity of the machine. These are very simple environments in contrast to transaction environments with lots and lots of scheduling activity that the operating system itself is doing. And these operating systems are not built to take external interrupts, so there [are] a lot of complicated issues to be dealt with. We’re not going to [visit] United Parcel Service and tell them they ought to move one of the world’s largest transaction processing systems running on mainframes into a grid. We wouldn’t know how to tell them to deploy that.
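The two questions Mills raises, whether the scheduler knows what resources are free and whether the server will accept work from it, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not OGSA or any IBM product: the `Server` class, the `accepts_grid_work` flag, and the “most free CPUs wins” placement rule are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """A machine as seen by a hypothetical grid-wide scheduler."""
    name: str
    total_cpus: int
    busy_cpus: int            # CPUs already claimed by the local OS scheduler
    accepts_grid_work: bool   # whether the local OS permits external scheduling

    @property
    def free_cpus(self):
        return self.total_cpus - self.busy_cpus


def place_job(servers, cpus_needed):
    """Place a job on the permitting server with the most free CPUs.

    Returns the chosen server's name, or None if no server qualifies.
    """
    candidates = [
        s for s in servers
        if s.accepts_grid_work and s.free_cpus >= cpus_needed
    ]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda s: s.free_cpus)
    chosen.busy_cpus += cpus_needed   # record the reservation
    return chosen.name


servers = [
    Server("solaris-1", total_cpus=8, busy_cpus=6, accepts_grid_work=True),
    Server("aix-1", total_cpus=16, busy_cpus=4, accepts_grid_work=True),
    Server("mainframe-1", total_cpus=32, busy_cpus=2, accepts_grid_work=False),
]

first = place_job(servers, 4)     # aix-1 has the most free CPUs among permitting servers
second = place_job(servers, 10)   # nothing that permits grid work has 10 CPUs free
print(first, second)
```

The sketch deliberately leaves out the hard part described above: a real operating system is running its own scheduler, so the free-capacity figure changes underneath the grid scheduler, and the server with the most idle capacity here refuses outside work altogether.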
InfoWorld: If scheduling is the biggest issue, what are the other ones you face?
Mills: To me scheduling is [number] one, two, and three. It’s the only topic that I’m currently engaged with the team on. My view is that everything else is secondary. We’ll figure out the other problems as we go. But if you can’t schedule work onto the systems, you have no dynamic provisioning capability because the systems find the overlying virtual scheduling service to be intrusive to what they’re trying to run. If you don’t solve that problem, you’re not going to do anything else. All the other [problems] — recovery, mirroring, data movement, any other set of background activities — are secondary to solving the scheduling problem. It all hinges on building an “operating system” that controls the grid.
InfoWorld: Are the issues of security and quality of service built into that scheduling issue?
Mills: [We] will deal with those issues along the way as add-ons. Security has always been something that you’ve had to add into the environment. At the local OS you have to deal with issues of thread control, thread services. Security is heavily based upon process and procedure and layering. The most secure systems in the world are totally heterogeneous and layered environments in which each level of the system’s different. That’s how the federal government deals with security.
InfoWorld: Despite today’s tight economy, are you seeing the amount being spent on integration going up?
Mills: As a relative percent, it may have gone up. The anecdotal evidence implies the number may be higher. But that’s anecdotal. Then you get a little bit of buzz in there as to new function commingled with integration. I think they’re still spending money this year. They spent a trillion dollars. It’s still a market with lots of money being spent.
InfoWorld: Do you think Web services is going to take some of that spending?
Mills: I think Web services is one help among many helps. Web services is a declarative connection technology, is all it is. It doesn’t instantiate the function, it’s the declarative interface that connects to the function. It doesn’t do anything in terms of helping you transact. It doesn’t in and of itself help you integrate, but it does help you connect, and so it’s a good thing. But it’s [only] one of many good things that are out there. Some [people] are out there saying the answer is Web services. My response is: You don’t know what you’re talking about.
InfoWorld: Are you going to join the Liberty Alliance?
Mills: No plans at this point to join the Liberty Alliance, because we’re quite comfortable with the things that we’re doing. You’ll have to watch the space in terms of how the standards bodies evolve. We don’t see any particular reason to join the Liberty Alliance at this point.