1) In general, XPath queries return a list of nodes. What does xpath_double (for example) return if the XPath expression evaluates to multiple nodes?
Only xpath() returns multiple nodes (list).
xpath_string() returns the text of the first matching node (and its subnodes, if any).
- xpath_string('<a>aa<b>b1</b><b>b2</b></a>','a') returns 'aab1b2'
- xpath_string('<a>aa<b>b1</b><b>b2</b></a>','a/b') returns 'b1'
xpath_double() and xpath_float() return the numeric value of the text of the first matching node, or NaN if the text value is not numeric.
xpath_int(), xpath_long() and xpath_short() return the numeric value of the text of the first matching node, or 0 if the text value is not numeric, or MAX_INT, MAX_LONG or MAX_SHORT respectively if the value overflows.
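The first-match behaviour described above can be reproduced with the plain Java XPath API. The following is a minimal sketch, not the UDF code itself: the class and method names are made up for illustration, and the path 'a/b' is used because the context node in this sketch is the document node. The last two lines show that Java's double-to-int narrowing happens to give 0 for NaN and MAX_INT on overflow; whether the UDFs rely on that conversion is an assumption.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
final class FirstMatchDemo {
    // Evaluate `path` against `xml`, with the document node as context.
    static Object eval(String xml, String path, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(xml)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // String value of the first matching node (text of the node and its subnodes).
    static String string(String xml, String path) {
        return (String) eval(xml, path, XPathConstants.STRING);
    }

    // Numeric value of the first matching node, NaN if its text is not numeric.
    static double number(String xml, String path) {
        return (Double) eval(xml, path, XPathConstants.NUMBER);
    }

    public static void main(String[] args) {
        String xml = "<a>aa<b>b1</b><b>b2</b></a>";
        System.out.println(string(xml, "a"));          // aab1b2
        System.out.println(string(xml, "a/b"));        // b1
        System.out.println(number(xml, "a/b"));        // NaN
        System.out.println((int) number(xml, "a/b"));  // 0
        System.out.println((int) number("<a><b>99999999999</b></a>", "a/b")); // 2147483647
    }
}
```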
2) Is the XPath query parsed for every input row, or only parsed once?
The XPath expression is compiled and cached. It is reused if the next expression matches the previous one; otherwise, it is recompiled. So the XML is always parsed for every input row, but the XPath expression is precompiled and reused in the vast majority of use cases.
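A single-entry cache of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the actual implementation: the class name, the field names and the compileCount counter are hypothetical and exist only to make the reuse visible.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical evaluator that remembers only the last compiled expression.
final class CachedXPath {
    private final XPath xpath = XPathFactory.newInstance().newXPath();
    private String lastPath;               // text of the most recently compiled expression
    private XPathExpression lastCompiled;  // its compiled form
    int compileCount;                      // for illustration only

    String evalString(String xml, String path) {
        try {
            // Recompile only when the expression text differs from the previous call.
            if (!path.equals(lastPath)) {
                lastCompiled = xpath.compile(path);
                lastPath = path;
                compileCount++;
            }
            // The XML itself is parsed again on every call.
            return (String) lastCompiled.evaluate(
                    new InputSource(new StringReader(xml)), XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Calling evalString with the same path for every row compiles once; alternating between two different paths would recompile on each call, which is the case the single-entry cache does not help with.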
3a) Do you support DTD and XMLSchema?
Not sure how these would apply, as the Java XPath API is schema-agnostic (no validation is performed). However, malformed XML (e.g., '<a><b>1</b></aa>') will result in a runtime exception being thrown.
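To illustrate the malformed-input case with the plain Java XPath API: evaluating any expression against an InputSource forces a parse, and a parse failure surfaces as an XPathExpressionException. This is a minimal sketch with a hypothetical class name; how the UDFs wrap or rethrow that exception is not shown here.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: reports whether the XML can be parsed for XPath evaluation.
final class MalformedXmlDemo {
    static boolean parsesCleanly(String xml) {
        try {
            // Selecting the document root is enough to trigger parsing.
            XPathFactory.newInstance().newXPath()
                    .evaluate("/", new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return true;
        } catch (XPathExpressionException e) {
            // The underlying parser error is wrapped in this exception.
            return false;
        }
    }
}
```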
3b) What about namespace and backward axes in XPath?
Namespaces are not currently supported, but support could easily be added later.
Backward axes are supported:
> select xpath('<a><b id="1"><c/></b><b id="2"><c/></b></a>', '/descendant::c/ancestor::b/@id') from t1 limit 1;
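The same expression can be checked outside Hive with the plain Java XPath API. The sketch below uses a hypothetical helper class and collects the text of every matching node into a list, which is an assumption about how a list-returning function might be built rather than a copy of the xpath() implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: text content of all nodes matched by an XPath expression.
final class XPathList {
    static List<String> values(String xml, String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b id=\"1\"><c/></b><b id=\"2\"><c/></b></a>";
        // ancestor:: is a backward axis: from each c, walk up to the enclosing b.
        System.out.println(values(xml, "/descendant::c/ancestor::b/@id")); // [1, 2]
    }
}
```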
4) If the XPath expression evaluates to an empty list, do you return NULL or an empty string (in the case of xpath())?
When no match is found:
xpath() returns an empty list.
xpath_string() returns an empty string.
xpath_int(), xpath_float(), etc. return 0.
xpath_boolean() returns false.
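For comparison, this is what the underlying Java XPath API yields when nothing matches. The class name is hypothetical and the sketch is not the UDF code. Note that the raw API gives NaN for a number on an empty match, so the 0 listed above for the numeric functions would have to come from a conversion in the UDF layer; that reading is an assumption.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper showing raw XPath results for a path that matches nothing.
final class NoMatchDemo {
    static Object eval(String xml, String path, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(xml)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b>1</b></a>";
        String miss = "a/zzz";  // no such element
        System.out.println(((NodeList) eval(xml, miss, XPathConstants.NODESET)).getLength()); // 0
        System.out.println(((String) eval(xml, miss, XPathConstants.STRING)).isEmpty());      // true
        System.out.println(eval(xml, miss, XPathConstants.BOOLEAN));                          // false
        System.out.println(eval(xml, miss, XPathConstants.NUMBER));                           // NaN
    }
}
```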