  1. Groovy
  2. GROOVY-10304

XmlSlurper failing to handle namespaces

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.9
    • Fix Version/s: None
    • Component/s: XML Processing
    • Labels: None
    • Environment: MacOS, although it doesn't look like something that would vary by platform.

    Description

      I would love to be told how I'm doing this wrong.  

      My ultimate goal is to run XmlSlurper over a StreamingSAXBuilder so that I can prepare a document in a standard way and then modify it, without rendering it to an intermediate file or a large in-memory string. I think I've narrowed the main problem down to XmlSlurper, although there are other issues with StreamingSAXBuilder. It seems like such a basic, ordinary thing to do that I feel sure I'm just doing it wrong...

      Try the following in groovyConsole:

       

      import groovy.xml.*
      def src = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn">
        <inner>hello</inner>
      </outer>
      '''
      def xml = new XmlSlurper()
      def gpr = xml.parseText(src)
      def serialized = XmlUtil.serialize(gpr)
      println serialized
      assert serialized == src
      

       

      The assertion fails:

      groovy> import groovy.xml.* 
      groovy> def src = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn"> 
      groovy>   <inner>hello</inner> 
      groovy> </outer> 
      groovy> ''' 
      groovy> def xml = new XmlSlurper() 
      groovy> def gpr = xml.parseText(src) 
      groovy> def serialized = XmlUtil.serialize(gpr) 
      groovy> println serialized 
      groovy> assert serialized == src 
       
      <?xml version="1.0" encoding="UTF-8"?><tag0:outer xmlns:tag0="uri:urn">
        <tag0:inner>hello</tag0:inner>
      </tag0:outer>
      
      Exception thrown
      
      Assertion failed: 
      
      assert serialized == src
             |          |  |
             |          |  '<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn">\n  <inner>hello</inner>\n</outer>\n'
             |          false
             '<?xml version="1.0" encoding="UTF-8"?><tag0:outer xmlns:tag0="uri:urn">\n  <tag0:inner>hello</tag0:inner>\n</tag0:outer>\n'
      
      	at ConsoleScript49.run(ConsoleScript49:11)
      	at jdk.internal.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      
      

      What's with all the "tag0"?
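      Judging by the serialized output above, "tag0" is a generated prefix bound to the same URI (xmlns:tag0="uri:urn"), so the namespace itself appears to survive the round trip and only the prefix changes. A minimal sketch that checks this on the slurped result rather than by string comparison, assuming the root returned by parseText exposes namespaceURI():

```groovy
import groovy.xml.*

def src = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn">
  <inner>hello</inner>
</outer>
'''

// Parse with the default (namespace-aware) slurper, as in the report.
def gpr = new XmlSlurper().parseText(src)

// The local name and namespace URI are kept on the parsed root;
// the original (empty) prefix is what gets replaced on serialization.
assert gpr.name() == 'outer'
assert gpr.namespaceURI() == 'uri:urn'
assert gpr.inner.text() == 'hello'
```

      This only shows that the two documents are equivalent to a namespace-aware consumer; it does not make the serialized text match the source byte for byte.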

      I have a forked XmlSlurper and slurpersupport.Node that I'm currently using to work around the above problem. But it doesn't solve the one below. And I'm sure it's the wrong approach anyway.
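      For comparison, a lighter-weight workaround that is sometimes suggested for the first problem is to switch off namespace awareness with the two-argument XmlSlurper(validating, namespaceAware) constructor, so that xmlns is carried through as an ordinary attribute. This is a sketch under the assumption that namespace-aware lookups are not needed afterwards; it is not a confirmed fix for this issue and is not expected to help with the StreamingSAXBuilder case.

```groovy
import groovy.xml.*

def src = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn">
  <inner>hello</inner>
</outer>
'''

// validating = false, namespaceAware = false:
// the parser no longer resolves xmlns into a namespace binding.
def slurper = new XmlSlurper(false, false)
def gpr = slurper.parseText(src)

def serialized = XmlUtil.serialize(gpr)
println serialized

// No generated prefix should appear in the serialized form.
assert !serialized.contains('tag0')
```

      The trade-off is that the parsed tree no longer carries namespace information, so anything that depends on namespaceURI() or declareNamespace would need another approach.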

      It gets worse when actually trying to slurp a StreamingSAXBuilder directly:

       

      import groovy.xml.*
      def expected = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn">
        <inner>hello</inner>
      </outer>
      '''
      def doc = new StreamingSAXBuilder().bind {
          mkp.declareNamespace('': 'uri:urn')
          outer {
              inner 'hello'
          }
      }
      def xml = new XmlSlurper()
      doc(xml)
      def gpr = xml.document
      def serialized = XmlUtil.serialize(gpr)
      println serialized
      assert serialized == expected
      

       

      gives the output:

      groovy> import groovy.xml.* 
      groovy> def expected = '''<?xml version="1.0" encoding="UTF-8"?><outer xmlns="uri:urn"> 
      groovy>   <inner>hello</inner> 
      groovy> </outer> 
      groovy> ''' 
      groovy> def doc = new StreamingSAXBuilder().bind { 
      groovy>     mkp.declareNamespace('': 'uri:urn') 
      groovy>     outer { 
      groovy>         inner 'hello' 
      groovy>     } 
      groovy> } 
      groovy> def xml = new XmlSlurper() 
      groovy> doc(xml) 
      groovy> def gpr = xml.document 
      groovy> def serialized = XmlUtil.serialize(gpr) 
      groovy> println serialized 
      groovy> assert serialized == expected 
       
      [Fatal Error] :2:70: The prefix "xmlns" cannot be bound to any namespace explicitly; neither can the namespace for "xmlns" be bound to any prefix explicitly.
      Exception thrown
      
      groovy.lang.GroovyRuntimeException: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 70; The prefix "xmlns" cannot be bound to any namespace explicitly; neither can the namespace for "xmlns" be bound to any prefix explicitly.
      	at ConsoleScript0.run(ConsoleScript0:16)
      	at jdk.internal.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      

      It's not very clear from the above, but the error is thrown inside XmlUtil::serialize. Here is the top of the stack trace from running it against my forked version (possibly useful for the XmlUtil line numbers, as XmlUtil has not been forked here):

      groovy.lang.GroovyRuntimeException: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 70; The prefix "xmlns" cannot be bound to any namespace explicitly; neither can the namespace for "xmlns" be bound to any prefix explicitly.
      	at groovy.xml.XmlUtil.serialize(XmlUtil.java:475)
      	at groovy.xml.XmlUtil.serialize(XmlUtil.java:461)
      	at groovy.xml.XmlUtil.serialize(XmlUtil.java:191)
      	at groovy.xml.XmlUtil.serialize(XmlUtil.java:160)
      	at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:318)
      	at uk.co.merus.groovy.xml.XmlSlurperTest.testSlurpingFromSAXBuilderWithNamespace(XmlSlurperTest.groovy:306)
      
      

      A possibly relevant old bug is GROOVY-5879. However, when I tried porting the fix that bug applied to StreamingMarkupBuilder into my forked version of StreamingSAXBuilder, it didn't help in this case.

      Attachments

        1. slurpFromString.groovy
          0.3 kB
          Rachel Greenham
        2. slurpFromSaxBuilder.groovy
          0.4 kB
          Rachel Greenham

          People

            Assignee: Unassigned
            Reporter: Rachel Greenham