Apache Knox / KNOX-1518

Large HDFS file downloads are incomplete when content is gzipped

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Server
    • Labels: None

    Description

      org.apache.knox.gateway.filter.rewrite.impl.UrlRewriteResponse uses java.util.zip.GZIPInputStream to read gzipped content streams.

      GZIPInputStream appears to rely on the InputStream#available() method of the stream it wraps, and the behavior of that method varies across InputStream implementations. When the wrapped stream does not satisfy this expectation, GZIPInputStream terminates prematurely, so only part of the content is read.

      This is tracked as an OpenJDK bug (https://bugs.openjdk.java.net/browse/JDK-8081450), and the Oracle JDK has the same problem.

      Knox can work around this in its own code, without depending on a JDK fix.
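
One possible shape for such a workaround, as a minimal sketch and not necessarily the change that was committed for this issue: wrap the raw stream so that available() only reports 0 at the true end of the stream, then hand the wrapper to GZIPInputStream. The names GzipAvailableSketch, PeekingInputStream, and ZeroAvailableStream are hypothetical and exist only for this illustration; ZeroAvailableStream stands in for any InputStream whose available() is unhelpful.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipAvailableSketch {

    /** Hypothetical wrapper: available() is > 0 whenever another byte can still be read. */
    static final class PeekingInputStream extends PushbackInputStream {
        PeekingInputStream(InputStream in) {
            super(in, 1); // one byte of pushback is enough for a single peek
        }

        @Override
        public int available() throws IOException {
            int n = super.available();
            if (n > 0) {
                return n;
            }
            int b = read(); // may block until a byte arrives or the stream ends
            if (b < 0) {
                return 0; // genuine end of stream
            }
            unread(b); // put the peeked byte back for the next reader
            return 1;
        }
    }

    /** Stand-in for a stream whose available() always reports 0. */
    static final class ZeroAvailableStream extends FilterInputStream {
        ZeroAvailableStream(InputStream in) {
            super(in);
        }

        @Override
        public int available() {
            return 0;
        }
    }

    /** Compresses data into a single gzip member. */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        }
        return out.toByteArray();
    }

    /** Decompresses raw through the peeking wrapper and returns all decoded bytes. */
    static byte[] gunzip(InputStream raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new PeekingInputStream(raw))) {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Two gzip members back to back, served by a stream that never reports available bytes.
        ByteArrayOutputStream members = new ByteArrayOutputStream();
        members.write(gzip("first member, ".getBytes(StandardCharsets.UTF_8)));
        members.write(gzip("second member".getBytes(StandardCharsets.UTF_8)));
        InputStream raw = new ZeroAvailableStream(new ByteArrayInputStream(members.toByteArray()));
        System.out.println(new String(gunzip(raw), StandardCharsets.UTF_8));
    }
}
```

The trade-off in this sketch is that available() may block while it peeks for the next byte, which is usually acceptable for a caller that is about to read anyway.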

            People

              Assignee: Philip Zampino (pzampino)
              Reporter: Philip Zampino (pzampino)