EMC manages your data
CTO Mark Lewis discusses the state of the storage market and the importance of protecting data
EMC is a leader in the field of storing and managing information, and Mark Lewis, executive vice president of new ventures and CTO, believes strongly that protecting data is a core competency within companies. Lewis met with InfoWorld Test Center Director Steve Gillmor and Senior Section Editor Doug Dineley to discuss the state of the enterprise storage market, standardization of storage products, and what Cisco’s purchase of Andiamo Systems means for the market.
InfoWorld: Tell us a bit about what EMC is currently working on.
InfoWorld: Was Centera a strategic strike at Network Appliance’s pricing model, among other things?
Lewis: No, actually not. Centera is priced like a very low-end SAN, at two-and-a-half cents per megabyte. So it’ll carve a little out of our own SAN space. It’ll carve an appreciable amount, we believe, out of tape. It’ll also carve some out of paper documents that are kept simply because people want records that they couldn’t be assured of digitally until now. We believe it’s going to create a new market, pulling from four or five different categories.
There will be a lot of discussion about what is the right way to store these files, and there’s probably a lot stored on NetApp appliances these days that really should be considered protected, fixed-content information. We do expect that there’ll be a certain amount that can be better served, not only from the standpoint of Centera guaranteeing the content. [Centera] does incredible compression, say, when you have an e-mail archive or other things where you have a core group of users that all send the same presentations and attachments back and forth to each other. You pull that together and put it on Centera, where it only has a storage attachment one time. It actually does content compression. What it does is it takes the entire file, scans it, and makes a fingerprint of it [that’s] digitally unique, based on the contents of the file. So by … looking at the fingerprint comparison, it can say, “These two files are identical, so I’m not going to store it. I’ll just put another pointer in place so it can compress all this.”
InfoWorld: Is it similar to a reverse-engineering of the capability in Notes and other messaging systems to deliver only one central copy of a file attachment and distribute it to multiple endpoints?
Lewis: It’s another way to look at it, absolutely. This is an object-based storage focus. Databases do it; we want to be able to do it for archives and for other things — any document retention, X-rays, anything else like that. We want to be able to cease the duplication of stored information, as well as make sure it can guarantee to people like the government that this [file] was saved at a particular time and the fingerprint is identical from whatever. We can guarantee, for example, a read of the information. We re-scan it to see if the fingerprint’s the same [and] we can guarantee that nobody got in there and tried to modify the file directly. The file itself makes up the fingerprint, so if you change the file even one bit, it changes the whole fingerprint.
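The fingerprint scheme Lewis describes can be illustrated with a short sketch. This is not Centera's implementation: the interview does not name the fingerprint algorithm, so SHA-256 stands in for it here, and the `ContentStore` class, its methods, and the file names are hypothetical, chosen only for illustration. The sketch shows the two ideas from the answers above: identical content is stored once with extra pointers to it, and a re-scan on read detects any change to the stored bytes.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: one stored copy per unique content."""

    def __init__(self):
        self._blobs = {}  # fingerprint -> bytes, stored once
        self._names = {}  # logical name -> fingerprint (the "pointer")

    @staticmethod
    def fingerprint(data: bytes) -> str:
        # Assumption: SHA-256 stands in for the unnamed fingerprint algorithm.
        return hashlib.sha256(data).hexdigest()

    def put(self, name: str, data: bytes) -> str:
        fp = self.fingerprint(data)
        # Identical content is kept only once; later writes just add a pointer.
        self._blobs.setdefault(fp, data)
        self._names[name] = fp
        return fp

    def get(self, name: str) -> bytes:
        fp = self._names[name]
        data = self._blobs[fp]
        # Re-scan on read: a single changed bit yields a different fingerprint.
        if self.fingerprint(data) != fp:
            raise ValueError(f"content of {name!r} no longer matches its fingerprint")
        return data

    def stored_copies(self) -> int:
        return len(self._blobs)


store = ContentStore()
deck = b"Q3 sales presentation"
fp_a = store.put("alice/inbox/deck.ppt", deck)
fp_b = store.put("bob/inbox/deck.ppt", deck)
```

Both mailboxes end up pointing at the same fingerprint, so the attachment occupies storage once. If the stored bytes were altered behind the store's back, `get` would raise an error instead of returning the modified file.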
InfoWorld: What’s the relationship between the new line and Centera?
Lewis: We’re trying to fulfill as many of the storage needs of customers as we can. So in one sense, it’s filling different holes in storage. If you look at file-based storage, block-based storage, and object-based storage, we believe those three are needed in most environments. That’s why we have all three. To some degree we’ll try to commonize the hardware where we can, but it will be fairly dependent. With the management software layers on top, we want to provide single-view management and other capabilities. Virtually all of our software layers today are designed to work across all the product lines, so you can have one management view into your schema. And even if you have a box over here with fixed content and you have an array that’s attached on the SAN and you have NAS, all of that can be managed in a very holistic way. That’s our way of connecting the dots.
InfoWorld: Do you have plans to extend the management tools out to a heterogeneous environment?
Lewis: We are beginning to support more and more, like the StorageWorks lines. As soon as we can, we’ll support the Hitachi and IBM lines with heterogeneous management. We believe that we’re really the only people working to do that today, either through API integration or, if we have to, through reverse-engineering. There are a lot of standards evolving which are going to be great. And we look forward to them, too, because it’d save us a lot of work if there were more standards. But over the next 12 to 18 months, we want to be able to provide all that technology; literally, it’s going to take 12 to 18 months for the standards to sort out. Then you have legacy equipment already installed. So we want to provide that technology to our customers today.
InfoWorld: What standards are you focused on, and are you contributing any intellectual property to those?
Lewis: We’re a lead player on the CIM Bluefin standard, which we believe [is most likely] to succeed as a management standard for SANs and network storage in general. We’re hopeful that that strategy will gain ground. There are a few other individual standards around management protocol for switches, [but] the major focus right now is the CIM Bluefin standard. I think that one will get traction. We are providing the CIM Bluefin port into our environments as rapidly as we can, but it’s still 12 months away from getting all the standards in place, getting everything locked down, and starting to really see products. We have to kind of do both: We have to do all those standards and we want to provide the software interoperability layer, such that we can interoperate between systems that exist today and provide heterogeneous support for our customers today.
InfoWorld: To what extent are you seeing the impact of Web services, and the XML revolution in general, on storage? Are you addressing that in any specific way?
Lewis: Yes and no. I wouldn’t say that that area falls huge into our sphere of influence, but XML has become one of the primary ways, for example, that we allow application interfaces to plug in. Having to be the universal translator, we provide a C interface, we provide a Java interface, we provide an XML interface. We’ve had to kind of be the universal translator on this because [while] your forward-thinking companies are on the leading edge, you [also] have companies on the lagging edge. You’re all over.
InfoWorld: Is there a considerable amount of discussion, if not actual implementation, around XML interfaces?
Lewis: I would say in the storage industry, it’s less interesting at the moment. Storage people tend to [be] a little more conservative and not quite up in the application space. We think it will come. We have interfaces designed around it in a lot of layers. But there are a lot of small companies and a lot of little niche applications that can come in and say, “I’m going to do this one thing and I’m going to do it this way.” We have to understand there’s a wide variety of ways they want to do things, even within the single enterprise account. They’ve got mergers and acquisitions, they’ve got EMC arrays, they’ve got Hitachi arrays, they’ve got Compaq arrays — I mean, they’ve got everything, right? Even if the present person says, “I’m going to just buy EMC from now on,” that doesn’t do any good. We still have to deal with all this legacy equipment, legacy applications.
InfoWorld: Which of the interfaces is getting the most action, Java or COM?
Lewis: COM. Java, I don’t know. I don’t have a lot of visibility into it. But I would say that XML has the most discussion, but not the most momentum. Because again you’re dealing with a lot of legacy on this.
InfoWorld: Coming from HP [where Lewis was vice president and general manager of Compaq’s Enterprise Storage Group], have you developed a storage-centric view as you’ve taken on your new responsibilities?
Lewis: The interesting thing is even in HP I had a very storage-centric view of the world. So I can say I’ve only hardened my opinion, and I’m not changing it. I believe that storage right now is going to continue to see what I would call an “organizational focus” on it, for a lot of reasons. No. 1 and most important, data protection is generally a corporate-wide need. If you left data protection up to all of us as individual PC users, the companies would go bankrupt. So data and how it’s protected, enforced, needs to be a core competency within companies. We believe that just simply by networking storage and separating it from the application and the servers, you can save a lot more money on how you manage that storage. You create things like storage utilities, you pool your storage, you do all of these things that again drive more cost savings. Where’s the money [going]? Even with all the innovation and storage management, if you’re still doing tape backup, you’ve got your biggest teams probably on storage. Your biggest team might be application deployment, and your second biggest team is probably your storage operations and maintenance in IT. Storage is so people-centric. You want to throw a lot of software at the problem, you know? Because anything you can do to reduce your people resources for a given amount of storage is a big deal. Add all that together, [and] you’ve got storage-focused organizations in most enterprises today.
InfoWorld: Sun’s play around Linux in some significant measure seems to be about commoditizing and ultimately incorporating the app server into the operating system. Do you see that potentially having a similar effect on their storage products?
Lewis: Where that doesn’t tie directly together is if you have Oracle driving the value in the database, and you have Intel capable of producing very, very high-performance chips at a low cost, and you have Linux out there in the market becoming a basis for the OS. What is the role that Sun plays?
InfoWorld: If they have already produced the stack and they leverage open-source technologies like MySQL for what they call “edge computing,” they can take a bite out of the market that they’ve lost at the low end and the small and midsize business end.
Lewis: My argument would say that since that doesn’t require any engineering on Sun’s part, Dell’s going to do it better. I believe that will happen. But to take it back to storage, it faces a lot of the same risks of commoditization. Things will get more and more standardized in terms of how data is protected and how things are done. These are things that we all have to deal with over time. I think right now it costs five to seven times the acquisition [price of] the storage to manage it. We’re not in any danger any time soon.
People want to take their same resources and manage lots more storage. We want to take storage management from managing what might be 5TB today to managing 500TB in two to three years. If you look at that kind of growth, that’s where the savings can come for us. And people will continue to value that. You can always look eight, 10 years down the road and say, “If we get very commoditized, that’s very classical.” But even as a technologist, I don’t like to think more than about three years out, because I believe the industry is simply too dynamic.
InfoWorld: Three sounds like a big number to me these days.
Lewis: It is. You talk to a CIO [and] most of them want to find out, “What can you do for me in the next 12 months to improve my operations?” This is the chief strategist, the CIO. They’re near-term thinkers, too. They want to hear a little vision pitch, and that’s all fine. “Here’s my problem. Here’s what I need to do. How are you going to fix it?” They’re very tactical.
InfoWorld: In terms of — to use your word — “tactical” sales, are you still seeing a bump around business continuity as the result of Sept. 11?
Lewis: We’re seeing a bump in business continuity, in general, and a continuing rise in interest in data replication technology, in particular. I’m sure some of it is still a result of [Sept. 11], but I could never quantify how much. But I think in general, awareness is much higher in the global market space of the need for good data protection. Period.
InfoWorld: Do you see that driving any of the standards adoption? Do you see that as a significant sales driver?
Lewis: Actually, no, I don’t. Customers want to see standards. They want to see flexibility. But most of them that make decisions to buy solutions, if they really value those decisions, they want to go in and competitively bid a solution. It’s like going in and picking a database. At some point in time the customer has to choose between SQL and Oracle and DB2 and whatever. As far as I can tell, Oracle and DB2 haven’t sat down and worked out a definitive way to say, “Well, we’ll just be the same.” They’re in the business of selling features and capabilities. A lot of that’s the same way in these types of business continuity solutions. Most of the value we provide can’t be standardized.
If they said, “Wow, you’re going to standardize how you do data replication and everybody’s going to do it the same way,” if that was really possible, then you turn it over to the Linux community and you let them do it. Maybe at some point that happens, but I don’t see [it]. I see standardization happening between what I would call “fundamental layers”: the storage layer and the storage networking layer. Customers want to be able to pick between three arrays and two management softwares and this and that. Even though they might pick from one supplier, they don’t want to have that lock-in. But within the product lines, I don’t expect to see a lot more standardization, unless the product itself just commoditizes.
InfoWorld: There was a grid standards effort to subsume some of the Web services stack in order to encourage client adoption. Where are you on that?
Lewis: I also believe that grid computing is a likely wave of change. If you think about the logical outcomes of what makes grid computing very plausible economically, [it] is actually storage networking, because you’ve now separated the compute elements and the storage elements, and you not only can have a one-compute element to an “n” storage element model, but you can actually have an n-by-n model that would say, “You can have your storage repository identical in three sites around the world and it can be served by 28 grid computing elements around the world.” You can have a non-specific 1:1 ratio between your compute elements and your storage elements. What you need to do that are really two enabling technologies. One you’ve already got: network storage. Separate the storage and put enough intelligence in the storage to make [it happen]. And two will be a concept I call “global file system,” which is the idea that these elements need to understand where data is on an n-by-n basis and how to get it and how to operate on it.
InfoWorld: How are you moving toward that, because that sounds like a somewhat more difficult problem?
Lewis: That is somewhat of a more difficult problem. We are working within our R&D efforts to look at it. We have file systems within EMC, we have capabilities to a lesser degree in NAS today, where you can do file sharing. But the problem is more deep than that. It’s beyond simple clustering because you’re talking about multiple instances of data going to more and more servers. So that’s something we’re working on the R&D side as well.
InfoWorld: What does Cisco’s $2.5 billion purchase of Andiamo Systems mean for storage?
Lewis: I think Cisco now realizes that SAN is a big thing. I think from a protocol perspective, it also validates that while IP may gain ground in individual markets — like we use it today for wide area data replication and other things — it obviously would have been a lot easier for Cisco to say, “Make everything IP.” And that obviously wasn’t possible. There were too many technical advantages to having a fibre channel SAN in those datacenters. So I think it showed that this is a big market. You can’t ignore fibre channel as an interconnect space. And it really legitimized the fact that this is not just a niche play for switches anymore. I don’t know that it’ll drive much beyond that, but we do believe that it really adds a lot of credibility to where we’ve taken the market thus far.
InfoWorld: Does it take Cisco off the table in terms of undertaking joint efforts with you?
Lewis: Not at all. As a matter of fact, Cisco is in our e-lab, interoperability lab, testing [the Andiamo fibre channel-based switch]. And assuming they pass or if and when they do pass the certification, we’ll put them on our [qualified] list. We view the infrastructure piece of the storage networking as a critical piece. Cisco becomes another potential supplier of that capability.
The promise in the future — and it’ll be great when they do it — is multiprotocol switches. We use IP switches for some WAN storage capabilities, but pretty much we use the customer’s infrastructure on that. For storage networking, we specifically integrate a product we call Connectrix, which is the storage infrastructure switching, all together with management with our storage, as a complete solution.
InfoWorld: When do you think we’ll see that multiprotocol switch?
Lewis: Most of the folks are touting end of this year to be able to start demonstrating those things in a real way. But it’ll be next year, pretty much everybody believes. It’ll be an interesting thing; you’ll have a single switch where you plug it in and say, “Make this a fibre channel switch, make it an IP, make it a gig-Ethernet.” But it does take away all the argument that networking speeds will mature at different levels in different technologies. I do think networking pretty much will mature at the same rate in all major technologies. Because there’s lasers and silicon and if A can do it, B can do it.