XalanJ2 / XALANJ-2456

XPath ignores namespace declarations except on root element


Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.7.1
    • Xalan, XPath
    • Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    • None
    • windows

    Description

      Dear XalanJ2 developers,

      I work as a technical support engineer for Fujitsu Europe Ltd, where my main role is supporting Fujitsu's software tools.

      Fujitsu Interstage uses Xalan and the XPath API internally, and as we benefit from these open source tools, I hope Xalan can benefit from our contribution.

      We had a problem using XPath queries with namespaces: although we believed both our query and our XML had the correct syntax, we got the following error:

      org.w3c.dom.DOMException: Prefix must resolve to a namespace: ns0

      After further investigation, we realised that the method getNamespaceForPrefix of the class PrefixResolverDefault was not returning namespace declarations from nodes other than the root of the XML document.

      We decided to fix that, and since we now have a working version, we decided to submit it back to Apache.

      Looking through XalanJ2's JIRA, I can see that Guy Rixon had the same problem back in 2004 with version 2.5Dx. I would have cloned that issue for version 2.7.1, but I found no way of doing so. The reference is: http://issues.apache.org/jira/browse/XALANJ-1790.

      In that issue I can see this comment from Joe Kesselman:
      "it does not search the subtree for them (which would be slow, could have conflicts,etc)."

      I would like to propose our fix while addressing those arguments.

      Our solution: we propose to rewrite the method getNamespaceForPrefix(String). This is the method the XPath API calls (it does not use getNamespaceForPrefix(String, Node) directly; instead, the first method gets the context information from the m_context class member).

      For the sake of good engineering I chose to mark getNamespaceForPrefix(String, Node) as deprecated and to create the method findNamespaceForPrefix(String, Node), recommending its use instead. I kept getNamespaceForPrefix(String), but pointed it to the new method.

      findNamespaceForPrefix(String, Node) then performs a search on the XML tree.

      Being concerned about performance, I decided to implement a breadth-first search instead of a depth-first search. The main reason is that I still expect the prefix-to-namespace bindings to occur on the top-level nodes. It was of course a bit harder to implement, and it will be a bit harder to read.

      On the other hand, any XML that declares its prefix-to-namespace bindings on the top-level node (the root node) will still get the best performance: O(1) to reach the namespace binding, since it is found on the first node visited.

      All that said, there should be no slowdown on existing systems.
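The breadth-first search described above can be sketched roughly as follows. This is a minimal illustration only, not the attached patch: the class name NamespaceBfsSketch, the parse helper, and the choice to return null when nothing is found are assumptions made for the example, and it uses only the standard JAXP DOM API rather than Xalan's PrefixResolverDefault.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; the real patch to PrefixResolverDefault may differ.
class NamespaceBfsSketch {

    /**
     * Breadth-first search starting at 'start' for the first element that
     * declares xmlns:prefix. Returns the declared URI, or null if none is found
     * (the return value for "not found" is an assumption of this sketch).
     */
    static String findNamespaceForPrefix(String prefix, Node start) {
        String attrName = prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix;
        Deque<Node> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                // Namespace declarations appear as ordinary attributes in the DOM.
                Node decl = node.getAttributes().getNamedItem(attrName);
                if (decl != null) {
                    return decl.getNodeValue();
                }
            }
            // Enqueue child elements so shallower levels are visited first.
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.ELEMENT_NODE) {
                    queue.add(c);
                }
            }
        }
        return null;
    }

    /** Helper (assumed for the example): namespace-aware parse of an XML string. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<a:root xmlns:a='urn:a'><a:child><b:leaf xmlns:b='urn:b'/></a:child></a:root>");
        System.out.println(findNamespaceForPrefix("a", doc)); // urn:a (found on the root)
        System.out.println(findNamespaceForPrefix("b", doc)); // urn:b (found two levels down)
        System.out.println(findNamespaceForPrefix("c", doc)); // null (never declared)
    }
}
```

When the binding sits on the root element, the loop returns after inspecting that single element, which is the best case described above; deeper declarations cost one visit per element on the levels above them.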

      About conflicts: we shouldn't be worried about namespace conflicts, since the method returns the first namespace it finds for a prefix. It should be the developer's task to ensure that no prefix maps to more than one namespace. In fact, I tested that case and an empty string is returned.

      From my understanding of the W3C documentation, the relation between prefixes and namespaces is:
      Prefix n:1 Namespace.

      At least, the documentation does not prohibit that.
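To illustrate the n:1 relation, here is a small sketch using the standard DOM Level 3 method Node.lookupNamespaceURI. The class name PrefixToNamespaceSketch, the wrapper element r, and the resolve helper are assumptions made for the example; the two prefixed elements are taken from my sample XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: two different prefixes bound to the same namespace URI.
class PrefixToNamespaceSketch {

    static final String XML =
        "<r>"
      + "<ns1:code xmlns:ns1='http://data.fel.fujitsu.com'>5</ns1:code>"
      + "<ns3:description xmlns:ns3='http://data.fel.fujitsu.com'>New opening</ns3:description>"
      + "</r>";

    /** Returns the URIs seen for ns1 on <code>, ns3 on <description>, and ns1 on the root. */
    static String[] resolve() throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        Element root = doc.getDocumentElement();
        Element code = (Element) root.getFirstChild();
        Element description = (Element) code.getNextSibling();
        return new String[] {
            code.lookupNamespaceURI("ns1"),
            description.lookupNamespaceURI("ns3"),
            root.lookupNamespaceURI("ns1")  // the declaration is not in scope at the root
        };
    }

    public static void main(String[] args) throws Exception {
        String[] uris = resolve();
        System.out.println(uris[0].equals(uris[1])); // true: ns1 and ns3 name the same namespace
        System.out.println(uris[2]);                 // null: ns1 is declared only on <ns1:code>
    }
}
```

The last lookup also shows why a resolver that only looks at the root element cannot see prefixes declared further down the tree.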

      That brings me back to my sample xml:
      <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <soapenv:Body>
          <ns0:scheduleServiceResponse xmlns="http://util.fel.fujitsu.com" xmlns:ns0="http://util.fel.fujitsu.com">
            <ns0:scheduleServiceReturn>
              <ns1:code xmlns:ns1="http://data.fel.fujitsu.com">5</ns1:code>
              <ns2:dateScheduled xmlns:ns2="http://data.fel.fujitsu.com">2008-10-01T00:01:02.000Z</ns2:dateScheduled>
              <ns3:description xmlns:ns3="http://data.fel.fujitsu.com">New opening</ns3:description>
              <ns4:typeofService xmlns:ns4="http://data.fel.fujitsuooooo.com">
                <ns4:code>6</ns4:code>
                <ns4:description>Moving pictures</ns4:description>
              </ns4:typeofService>
            </ns0:scheduleServiceReturn>
          </ns0:scheduleServiceResponse>
        </soapenv:Body>
      </soapenv:Envelope>

      After the modifications I made, I got the following results for the following xpath expressions:

      (1) "/soapenv:Envelope/soapenv:Body/ns0:scheduleServiceResponse/ns0:scheduleServiceReturn/ns4:typeofService/ns4:description/text()"
      Result: Moving pictures
      (2) "/soapenv:Envelope/soapenv:Body/ns0:scheduleServiceResponse/ns0:scheduleServiceReturn/ns3:description/text()"
      Result: New opening
      This is exactly what I wanted and, as far as I can tell, W3C compliant.
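For comparison, the same two expressions can be evaluated with the standard JAXP API by supplying the prefix bindings explicitly through a NamespaceContext. This is only a sketch of that alternative route, not of the modified PrefixResolverDefault: the class name XPathNamespaceSketch, the BINDINGS map, and the evaluate helper are assumptions made for the example, and the XML is my sample document with whitespace removed.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: explicit prefix bindings via javax.xml.namespace.NamespaceContext.
class XPathNamespaceSketch {

    static final String XML =
        "<soapenv:Envelope xmlns:soapenv='http://schemas.xmlsoap.org/soap/envelope/'>"
      + "<soapenv:Body>"
      + "<ns0:scheduleServiceResponse xmlns:ns0='http://util.fel.fujitsu.com'>"
      + "<ns0:scheduleServiceReturn>"
      + "<ns3:description xmlns:ns3='http://data.fel.fujitsu.com'>New opening</ns3:description>"
      + "<ns4:typeofService xmlns:ns4='http://data.fel.fujitsuooooo.com'>"
      + "<ns4:code>6</ns4:code>"
      + "<ns4:description>Moving pictures</ns4:description>"
      + "</ns4:typeofService>"
      + "</ns0:scheduleServiceReturn>"
      + "</ns0:scheduleServiceResponse>"
      + "</soapenv:Body>"
      + "</soapenv:Envelope>";

    // Prefix bindings copied by hand from the sample document.
    static final Map<String, String> BINDINGS = new HashMap<>();
    static {
        BINDINGS.put("soapenv", "http://schemas.xmlsoap.org/soap/envelope/");
        BINDINGS.put("ns0", "http://util.fel.fujitsu.com");
        BINDINGS.put("ns3", "http://data.fel.fujitsu.com");
        BINDINGS.put("ns4", "http://data.fel.fujitsuooooo.com");
    }

    /** Evaluates 'expr' against the sample document and returns its string value. */
    static String evaluate(String expr) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                String uri = BINDINGS.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            // Reverse lookups are not needed for XPath evaluation in this sketch.
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        return xpath.evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        String base = "/soapenv:Envelope/soapenv:Body/ns0:scheduleServiceResponse"
                    + "/ns0:scheduleServiceReturn";
        System.out.println(evaluate(base + "/ns4:typeofService/ns4:description/text()")); // Moving pictures
        System.out.println(evaluate(base + "/ns3:description/text()"));                   // New opening
    }
}
```

The difference from my proposal is that here the caller must know the bindings in advance, whereas the modified resolver discovers them from the document itself.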

      I am attaching the modified code for PrefixResolverDefault. I hope it can be of use.

      I decided to report (actually re-report) this as an improvement instead of a bug because I am not sure whether it is a bug; perhaps it is only a lack of compliance. But it can still be seen as a bug...

      Best regards, Romeu Flores.

          People

            Unassigned
            Romeu Flores (romeuflores)
