Friday, October 21, 2005

Updates on XML

I didn't have much work today apart from attending a series of (boring) meetings and churning out Word docs and spreadsheets, the weekly status reports and the rest of the junk, tch! A friend of mine mailed a problem that triggered a deluge of mails with interesting and impressive ways to solve it. The crux of the problem was an XPath expression that should perform a "reverse search": retrieve all nodes that have any attribute with the value "abc". Here is my approach:

1. The attribute can be nested anywhere in the document, so my expression starts with //
2. It should return all matching nodes, whatever their name, so //*
3. The test condition is a predicate, //*[]
4. The tricky part: the test is against an attribute node, and it can be any attribute, hence @*:
//*[@*='abc']

That completed the XPath expression.
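
To try the idea outside an XPath engine, here is a minimal Python sketch. It does not evaluate //*[@*='abc'] itself: the standard library's xml.etree.ElementTree supports only a subset of XPath and, as far as I know, has no @* wildcard, so the sketch emulates the expression by walking every element and checking its attribute values. The sample document and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<root id="r1">
  <item code="abc">first</item>
  <group name="xyz">
    <item ref="abc" code="def">second</item>
    <item code="abcd">third</item>
  </group>
</root>"""


def elements_with_attribute_value(root, value):
    """Emulate //*[@*='value']: every element, in document order,
    that has at least one attribute exactly equal to value."""
    return [el for el in root.iter() if value in el.attrib.values()]


root = ET.fromstring(SAMPLE)
matches = elements_with_attribute_value(root, "abc")
matched_texts = [el.text for el in matches]
```

The match is exact, so code="abcd" is skipped, and the attribute name plays no part, which is the whole point of the "reverse search". With a library that implements full XPath 1.0, the expression could be passed as-is instead.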

It was decided to implement the complete solution using XQuery. Awesome! Some time back, I implemented a basic screen scraper (HTML parsing) using XQuery, but that was a stop-gap solution and did not scale well. I'm still a novice at XQuery and wanted to learn the nitty-gritty of the FLWOR expression. As XQuery relies on XPath, I chose to brush up on XPath 2.0 first. The new version adds functions that make life easier, such as collection, current-date, current-time and implicit-timezone. Of these, the collection function apparently turned out to be very useful. It can be used to overcome the memory issues with the doc/document() function. This not only eliminates a round of looping but also offers a cleaner solution in terms of performance.

Thanks to my friend, my XML arsenal got enriched with new info.

1 comment:

Anonymous said...

It was a typo. I modified the entry. Thanks for pointing it out.