Back in the XML soup

XPath and XSLT power document-oriented Web services

At the XML Web Services One conference in Boston last month, I spent most of my allotted time in an all-day W3C/OASIS seminar on Web services security. But I couldn’t help ducking out now and then to catch parts of Ken Holman’s hands-on seminar next door, Practical Transformation Using XSLT and XPath.

Holman, who was involved in the making of XML, has now given this talk 38 times. What struck me as I watched him recite the 13 “axes” along which XPath expressions can begin to select fragments of XML documents is that this material is already being repeated by dozens of trainers, in hundreds of seminars, for thousands of developers — and that’s just for starters.

I’ve got to admit that my own path to learning XPath and XSLT has been neither smooth nor straight. At first I used XSLT stylesheets for a variety of purposes, but didn’t really internalize the mindset. Regular expressions came more easily and naturally to me, whether in Perl or Python or Unix shell or another scripting environment. They had served me well in the past, and continue to do so. Since before the advent of the Web, and even more so after, an astonishing range of IT chores reduces to matching, or searching and replacing, patterns in text files. Those skills still matter.

Lately, however, I’ve begun to make my peace with XPath and XSLT. There’s more XML data in the world every day, and although it’s tempting (and indeed possible) to shred that data using tools like regular expressions, XPath’s ability to pinpoint subsets of documents, by means of concise declarative expressions, becomes more useful all the time.

Consider a search engine that returns results in XML, as many now do. Sure, you can parse the result document in order to pick out the titles of the found items, along with their relevance scores, but XPath expressions can grab these sets of elements directly. And they can shred documents with surgical precision. It’s easy to say, “Give me all the results where the category attribute is ‘Jobs’” or “Give me just the results with relevance scores greater than 3.5.”
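To make those two requests concrete, here is a minimal sketch in Python. The result document, its element names, and its attribute names are all invented for illustration; no real search engine's format is implied. Python's standard ElementTree module understands only a subset of XPath, so the attribute test is written as a true XPath predicate, while the numeric comparison is done in plain Python, with the full XPath form shown in a comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical search-result document. The element and attribute names
# (results, result, category, relevance, title) are assumptions made up
# for this sketch.
RESULTS_XML = """
<results>
  <result category="Jobs" relevance="4.2"><title>XSLT developer wanted</title></result>
  <result category="News" relevance="3.9"><title>XML conference recap</title></result>
  <result category="Jobs" relevance="2.8"><title>Perl programmer</title></result>
</results>
"""

root = ET.fromstring(RESULTS_XML)

# "Give me all the results where the category attribute is 'Jobs'."
# This attribute predicate is within ElementTree's XPath subset.
jobs = root.findall("./result[@category='Jobs']")
job_titles = [r.findtext("title") for r in jobs]

# "Give me just the results with relevance scores greater than 3.5."
# In full XPath this would be: /results/result[@relevance > 3.5]
# ElementTree does not support numeric comparisons in predicates,
# so the filter is applied in Python instead.
high = [r for r in root.findall("./result") if float(r.get("relevance")) > 3.5]
high_titles = [r.findtext("title") for r in high]

print(job_titles)
print(high_titles)
```

A full XPath engine would accept both expressions as written, with no filtering code at all, which is the point of the declarative approach.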

What do you do with these document subsets once you’ve found them? The XSLT stylesheets in which XPath expressions appear were originally used mainly to produce HTML pages and Web applications. Now that Web services are generating lots of new XML data, another use of XSLT is coming into focus. It is the pre-eminent tool for pure XML-to-XML transformation.

Two years ago, SOAP was being enshrined as “CORBA with angle brackets,” a remote procedure call technology that, by virtue of its simplicity and its platform and programming language neutrality, would succeed where others had failed. In Visual Studio .NET and other toolkits, this meant programmers were shielded from the underlying XML. Everything looked like a function call. XML data was marshalled back and forth under the covers.

Now, though, things have subtly shifted. Even as the first-generation toolkits were reaching the market, the “loose coupling” mantra began to be chanted everywhere, and the RPC style of SOAP messaging was deprecated in favor of the alternative document-oriented style. In practice this means that developers aren’t just making and calling functions. They’re back in the XML soup, building and shredding documents.
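As a rough illustration of what building and shredding looks like in the document-oriented style, the Python sketch below wraps a purchase order in a SOAP 1.1 envelope and then pulls it back out with namespace-aware path expressions. The purchase-order namespace, the element names, and the helper functions build_envelope and read_order are hypothetical, chosen only for this example; a real service would define its own schema.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PO_NS = "urn:example:purchase-order"  # hypothetical namespace for this sketch


def build_envelope(order_id, items):
    """Build a document-style message: the body carries a whole document,
    not an encoded function call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    order = ET.SubElement(body, f"{{{PO_NS}}}PurchaseOrder", {"id": order_id})
    for sku, quantity in items:
        item = ET.SubElement(order, f"{{{PO_NS}}}Item", {"sku": sku})
        item.text = str(quantity)
    return ET.tostring(envelope, encoding="unicode")


def read_order(xml_text):
    """Shred the message: locate the purchase order by path and extract
    the order id plus a sku-to-quantity mapping."""
    ns = {"soap": SOAP_NS, "po": PO_NS}
    root = ET.fromstring(xml_text)
    order = root.find("soap:Body/po:PurchaseOrder", ns)
    quantities = {i.get("sku"): int(i.text) for i in order.findall("po:Item", ns)}
    return order.get("id"), quantities


message = build_envelope("PO-1001", [("widget", 3), ("gadget", 1)])
order_id, quantities = read_order(message)
print(order_id, quantities)
```

Nothing here resembles a remote function call. The sender and receiver agree only on the shape of the document, which is what the loose-coupling argument amounts to in practice.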

There’s room for both styles, and the pendulum will probably swing back toward the center. But there are sound reasons for the document-oriented approach, and they go beyond the oft-cited “loose coupling” justification. Business processes are largely enacted by means of document exchange. It makes sense to turn these documents into the packets of the business web.

Of course layers of APIs will be wrapped around purchase orders, contracts, and other SOAP payloads, so that programmers can continue to see the world through function-call spectacles. But managing those payloads as XML, using tools such as XPath and XSLT, is a core competency that looks even more strategic than it did two years ago.

Source: www.infoworld.com