Uploaded image for project: 'Apache Knox'
  1. Apache Knox
  2. KNOX-767

Knox transforms XML files written to WebHDFS

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      When you write an XML file to WebHDFS through Knox with the Content-Type header set to text/xml or application/xml it is transformed by Knox so that empty tags like <xyz/> are written as <xyz></xyz> and CDATA is interpreted. This does not happen if written directly to WebHDFS. For example:

      [root@hdp250 ~]# cat xxx
      <document>
         <empty/>
         <![CDATA[<xyz>wibble</xyz>]]>
      </document>
      
      [root@hdp250 ~]# curl -u guest:guest-password -i -k -X PUT "https://hdp250.local:8443/gateway/default/webhdfs/v1/tmp/xxx?op=CREATE&overwrite=true"
      HTTP/1.1 307 Temporary Redirect
      Date: Thu, 27 Oct 2016 19:25:35 GMT
      Set-Cookie: JSESSIONID=bt3timb9jl7k546fcntrj8s;Path=/gateway/default;Secure;HttpOnly
      Expires: Thu, 01 Jan 1970 00:00:00 GMT
      Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Wed, 26-Oct-2016 19:25:35 GMT
      Cache-Control: no-cache
      Expires: Thu, 27 Oct 2016 19:25:35 GMT
      Date: Thu, 27 Oct 2016 19:25:35 GMT
      Pragma: no-cache
      Expires: Thu, 27 Oct 2016 19:25:35 GMT
      Date: Thu, 27 Oct 2016 19:25:35 GMT
      Pragma: no-cache
      Content-Type: application/octet-stream
      X-FRAME-OPTIONS: SAMEORIGIN
      Location: https://hdp250.local:8443/gateway/default/webhdfs/data/v1/webhdfs/v1/tmp/xxx?_=AAAACAAAABAAAAEQ-XO5sAM86ubmjRdUYXJEZkpM4Vdv3vmIprBetQwfaKaZNN4uc9O1IN8jujDD9GpPPCDJCKxebul_GlCFxDIZzbkhZ1tnhY5rZ6V12SVJgLo5DxMxC8zECeaM4M8OFLqHxamNnvduuUkD5y23RJczzHHJ9SyYuG6yiCpDJKB_5MffZIWFaEEcYM7jOkjStZHU_7cjIg_vRJL2nFCVTWKf1FPkB00QCbXHN-Ua6MfEG8p2aoQB70tfVHnmhhnBWx2PZARJ-kHp42rrpA1yrI86v3Q-OGI4Ya3pnPRWhPj0wbdDr_p_FDinsw2KRu1_aRSIXXznmJ--aX6TflbBGZvDImkw4x0QM48UGFpOChaLtHk73rlMMUbbbAwOew0gJ2-69PuXiL4QB48
      Server: Jetty(6.1.26.hwx)
      Content-Length: 0
      
      [root@hdp250 ~]# curl -u guest:guest-password -i -k -X PUT -T xxx -H 'Content-Type: text/xml' "https://hdp250.local:8443/gateway/default/webhdfs/data/v1/webhdfs/v1/tmp/xxx?_=AAAACAAAABAAAAEQ-XO5sAM86ubmjRdUYXJEZkpM4Vdv3vmIprBetQwfaKaZNN4uc9O1IN8jujDD9GpPPCDJCKxebul_GlCFxDIZzbkhZ1tnhY5rZ6V12SVJgLo5DxMxC8zECeaM4M8OFLqHxamNnvduuUkD5y23RJczzHHJ9SyYuG6yiCpDJKB_5MffZIWFaEEcYM7jOkjStZHU_7cjIg_vRJL2nFCVTWKf1FPkB00QCbXHN-Ua6MfEG8p2aoQB70tfVHnmhhnBWx2PZARJ-kHp42rrpA1yrI86v3Q-OGI4Ya3pnPRWhPj0wbdDr_p_FDinsw2KRu1_aRSIXXznmJ--aX6TflbBGZvDImkw4x0QM48UGFpOChaLtHk73rlMMUbbbAwOew0gJ2-69PuXiL4QB48"
      HTTP/1.1 100 Continue
      
      HTTP/1.1 201 Created
      Date: Thu, 27 Oct 2016 19:25:54 GMT
      Set-Cookie: JSESSIONID=3o27jby7c2a6mdpxducddqac;Path=/gateway/default;Secure;HttpOnly
      Expires: Thu, 01 Jan 1970 00:00:00 GMT
      Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Wed, 26-Oct-2016 19:25:54 GMT
      Location: https://hdp250.local:8443/gateway/default/webhdfs/v1/tmp/xxx
      Connection: close
      Server: Jetty(9.2.15.v20160210)
      
      [root@hdp250 ~]# hdfs dfs -cat /tmp/xxx
      <?xml version="1.0" standalone="no"?><document>
         <empty></empty>
         &lt;xyz&gt;wibble&lt;/xyz&gt;
      </document>
      
      

        Attachments

        1. KNOX-767.patch
          7 kB
          Jeffrey E Rodriguez

          Activity

            People

            • Assignee:
              jeffreyr97 Jeffrey E Rodriguez
              Reporter:
              sumit.gupta Sumit Gupta
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: