Apache Knox / KNOX-1518

Large HDFS file downloads are incomplete when content is gzipped

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.2.0
    • Component/s: Server
    • Labels: None

      Description

      org.apache.knox.gateway.filter.rewrite.impl.UrlRewriteResponse uses java.util.zip.GZIPInputStream to read gzipped content streams.

      GZIPInputStream appears to rely on the InputStream#available() method of the stream it wraps, and the behavior of that method varies across InputStream implementations. When the wrapped implementation does not satisfy this expectation, GZIPInputStream terminates prematurely and only part of the content is read.

      There is an OpenJDK bug for this (https://bugs.openjdk.java.net/browse/JDK-8081450), and the Oracle JDK has the same problem.

      This can be worked around in Knox's own code, without waiting for a JDK fix.
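
      As an illustration only (this is not the change that was committed for this issue), one possible workaround is to wrap the source stream so that available() reports 0 only at the real end of the stream. The sketch below assumes the affected JDK code consults available() after a gzip member's trailer to decide whether more data follows. The names PeekingInputStream and trickle are hypothetical; trickle is just a test stand-in for a stream whose available() always returns 0.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipAvailableWorkaround {

    /** Hypothetical wrapper: available() is nonzero unless the stream is truly at EOF. */
    static final class PeekingInputStream extends PushbackInputStream {
        PeekingInputStream(InputStream in) {
            super(in, 1); // one-byte pushback buffer is enough for a peek
        }

        @Override
        public int available() throws IOException {
            int n = super.available();
            if (n > 0) {
                return n;
            }
            int b = read();   // may block until a byte or EOF arrives
            if (b < 0) {
                return 0;     // real end of stream
            }
            unread(b);        // put the peeked byte back
            return 1;
        }
    }

    /** Test stand-in: always reports 0 available and hands out one byte per read. */
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int available() {
                return 0;
            }

            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    /** Compresses the text as a single, complete gzip member. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    /** Drains the stream until read() signals EOF. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Two gzip members back to back, served by a stream with an unhelpful available().
        ByteArrayOutputStream two = new ByteArrayOutputStream();
        two.write(gzip("hello "));
        two.write(gzip("world"));
        InputStream wrapped = new PeekingInputStream(trickle(two.toByteArray()));
        try (GZIPInputStream gz = new GZIPInputStream(wrapped)) {
            System.out.println(new String(readAll(gz), StandardCharsets.UTF_8));
        }
    }
}
```

      The trade-off in this sketch is that available() may block while it peeks for the next byte, which is generally acceptable when the caller intends to read the stream to the end anyway.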

              People

              • Assignee: Philip Zampino (pzampino)
              • Reporter: Philip Zampino (pzampino)
              • Votes: 0
              • Watchers: 2
