Thursday, March 4, 2010

Namespace agnostic XPath/XQuery

Problem Description

I have two different types of XML files, both of which use multiple namespaces. The first one declares prefixes for its namespaces; the second one doesn't. The question is: how do I find the title (Organization->Person->title) using XPath or XQuery?



Solution - 1 (Aware of Namespace)

The standard way of doing this (which I don't recommend) is to create an XPath namespace resolver and register all the namespaces with it. I created an external properties file listing every namespace. (Note: the namespace prefixes don't have to match between the XPath and the XML input payload, e.g. org vs. org1. Only the namespace URIs have to match.)


Next, I wrote a method that performs the XPath resolution.

For this method, I had to create a class for namespace resolution. I wrote a class that resolves namespaces based on the properties file.
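The original listings are not reproduced here, so the following is only a minimal sketch of how the namespace-aware approach could look with the standard JAXP API, not the exact code from this post. The class names, the inline properties text standing in for the external file, the example.com namespace URIs, and the sample payload are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceAwareXPath {

    // Hypothetical stand-in for the external properties file (prefix=namespace URI).
    static final String NAMESPACE_PROPERTIES =
            "org1=http://example.com/organization\n"
          + "person1=http://example.com/person\n";

    /** Resolves XPath prefixes by looking them up in a Properties object. */
    static class PropertiesNamespaceContext implements NamespaceContext {
        private final Properties mappings;

        PropertiesNamespaceContext(Properties mappings) {
            this.mappings = mappings;
        }

        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            // Unknown prefixes map to the empty "no namespace" URI.
            return mappings.getProperty(prefix, XMLConstants.NULL_NS_URI);
        }

        @Override
        public String getPrefix(String namespaceURI) {
            Iterator<String> prefixes = getPrefixes(namespaceURI);
            return prefixes.hasNext() ? prefixes.next() : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            List<String> matches = new ArrayList<>();
            for (String prefix : mappings.stringPropertyNames()) {
                if (mappings.getProperty(prefix).equals(namespaceURI)) {
                    matches.add(prefix);
                }
            }
            return matches.iterator();
        }
    }

    /** Evaluates a prefixed XPath against the XML and returns the string result. */
    static String evaluate(String xml, String expression) {
        try {
            Properties mappings = new Properties();
            mappings.load(new StringReader(NAMESPACE_PROPERTIES));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so elements carry their namespace URIs
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new PropertiesNamespaceContext(mappings));
            return xpath.evaluate(expression, document);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample payload; its prefixes (o, p) differ from the XPath prefixes on purpose.
        String xml = "<o:Organization xmlns:o=\"http://example.com/organization\""
                + " xmlns:p=\"http://example.com/person\">"
                + "<p:Person><p:title>Engineer</p:title></p:Person></o:Organization>";
        System.out.println(evaluate(xml, "/org1:Organization/person1:Person/person1:title"));
    }
}
```

Note that the payload uses the prefixes o and p while the XPath uses org1 and person1; the lookup succeeds because only the namespace URIs are compared.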


That's it. Now if you provide /org1:Organization/person1:Person/person1:title as the XPath to this program, it evaluates to "Engineer"! I was not too impressed with this approach, because every time a payload arrives with a changed or added namespace, I have to update my properties file. In a small organization that might work out OK, but in a rapidly changing world I cannot imagine it working at all.



Solution - 2 (Namespace Agnostic XPath/XQuery)

For this, I wrote a method that doesn't require any namespace context.


Now the trick is in the XPath. I wrote: /*[contains(name(),'Organization')]/*[contains(name(),'Person')]/*[contains(name(),'title')]. Given the XML payload, this XPath resolves to "Engineer", and it doesn't require any namespace. A more precise predicate would be [name()='Organization' or ends-with(name(),':Organization')], but ends-with is an XPath 2.0 function and unfortunately is not supported by the Oracle XML parser. The contains() version can also match any element whose name merely includes the text (for example, subtitle), but I don't see that as a huge issue for the current XPath and payload.
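As a rough illustration of the namespace-agnostic method, here is a minimal sketch using the standard JAXP API rather than the Oracle parser mentioned above. The class name and both sample payloads (one prefixed, one assumed to use default namespaces) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceAgnosticXPath {

    // The XPath from the post: matches on the qualified name, so no prefixes need resolving.
    static final String TITLE_XPATH =
            "/*[contains(name(),'Organization')]"
          + "/*[contains(name(),'Person')]"
          + "/*[contains(name(),'title')]";

    /** Evaluates an XPath without registering any NamespaceContext. */
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // No setNamespaceContext call: the expression contains no prefixes.
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, document);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed payload with prefixes.
        String prefixed = "<o:Organization xmlns:o=\"http://example.com/organization\""
                + " xmlns:p=\"http://example.com/person\">"
                + "<p:Person><p:title>Engineer</p:title></p:Person></o:Organization>";
        // Assumed payload with default namespaces and no prefixes.
        String unprefixed = "<Organization xmlns=\"http://example.com/organization\">"
                + "<Person xmlns=\"http://example.com/person\">"
                + "<title>Engineer</title></Person></Organization>";

        System.out.println(evaluate(prefixed, TITLE_XPATH));
        System.out.println(evaluate(unprefixed, TITLE_XPATH));
    }
}
```

Both payloads should yield the same result from the same expression, which is the point of this approach: nothing has to be maintained when a namespace is added or changed.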


Test

Finally, I wrote a test case to try this out.


I ran the test and could not have been happier with the results.




FYI, I just found two more approaches to writing namespace-agnostic XPath:
a) /*[local-name()='Organization']/*[local-name()='Person']/*[local-name()='title']
b) //*[local-name()='title']
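To show how the local-name() variants behave, here is a small sketch using the standard JAXP API. The class name and the sample payload are assumptions; the payload deliberately includes a hypothetical subtitle element to contrast the exact match of local-name() with the looser contains(name(), ...) predicate. The parser is set to be namespace aware so that the DOM records a local name for each element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LocalNameXPath {

    // Approach (a): full path, exact match on each local name.
    static final String FULL_PATH =
            "/*[local-name()='Organization']/*[local-name()='Person']/*[local-name()='title']";

    // Approach (b): any element named title, anywhere in the document.
    static final String ANYWHERE = "//*[local-name()='title']";

    // The contains() expression from Solution 2, for comparison.
    static final String CONTAINS_PATH =
            "/*[contains(name(),'Organization')]/*[contains(name(),'Person')]"
          + "/*[contains(name(),'title')]";

    /** Evaluates the expression and returns the string value of the first matching node. */
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so each element has a local name in the DOM
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, document);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed payload: subtitle appears before title inside Person.
        String xml = "<o:Organization xmlns:o=\"http://example.com/organization\""
                + " xmlns:p=\"http://example.com/person\"><p:Person>"
                + "<p:subtitle>Senior</p:subtitle><p:title>Engineer</p:title>"
                + "</p:Person></o:Organization>";

        System.out.println(evaluate(xml, FULL_PATH));     // exact match on title
        System.out.println(evaluate(xml, ANYWHERE));      // exact match on title
        System.out.println(evaluate(xml, CONTAINS_PATH)); // first match is subtitle
    }
}
```

With this payload, both local-name() expressions select only the title element, whereas the contains() expression also selects subtitle and, as a string result, returns its text first. Approach (b) is the shortest, but it matches a title element at any depth, so approach (a) is the safer choice when the same local name can appear in several places.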
