Infrastructure / INFRA-15707

Spark pair.com mirror incorrectly includes "Content-Encoding: x-gzip" in header

    Details

    • Project:
      Spark

      Description

      Not sure if INFRA is the right place for this.

      Browsers like Chrome (and possibly curl?) automatically decompress any response whose headers include "Content-Encoding: x-gzip". As a result, Spark downloads from the pair.com mirror are decompressed by the client, so the file that ends up on your machine is just the tar, though it keeps its original filename, even if that ends in .gz. Validating the checksum of the download then fails mysteriously, because the posted checksums are for the .tar.gz, not the .tar.
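
To make the failure mode concrete, here is a minimal Python sketch of what happens to the checksum. It is illustrative only: the archive bytes are made up, SHA-512 is an arbitrary choice of digest, and the helper names `sha512_hex` and `looks_like_gzip` are hypothetical. It does not touch the network; it only simulates a client that strips the gzip layer before saving.

```python
import gzip
import hashlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def sha512_hex(data: bytes) -> str:
    """Hex digest of the given bytes (the digest algorithm is an assumption)."""
    return hashlib.sha512(data).hexdigest()


def looks_like_gzip(data: bytes) -> bool:
    """True if the data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Stand-in for the tar archive inside the release; the contents are made up.
tar_bytes = b"pretend this is a tar archive" * 100
# What the mirror stores, and what the posted checksum covers.
tgz_bytes = gzip.compress(tar_bytes)
# What a client that honours Content-Encoding writes to disk.
saved_bytes = gzip.decompress(tgz_bytes)

posted_checksum = sha512_hex(tgz_bytes)

print(looks_like_gzip(tgz_bytes))                  # True
print(looks_like_gzip(saved_bytes))                # False
print(sha512_hex(saved_bytes) == posted_checksum)  # False
```

Checking the first two bytes of a downloaded file is a quick way to tell whether it is still gzip-compressed before blaming the checksum file.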

      See SPARK-22851 for reference.

      Relevant example:

      > curl -I http://apache.mirrors.pair.com/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
      HTTP/1.1 200 OK
      Date: Thu, 21 Dec 2017 23:46:58 GMT
      Server: Apache/2.2.29
      Last-Modified: Sat, 25 Nov 2017 02:44:26 GMT
      ETag: "32b662-bfa03c4-55ec5a5c358a1"
      Accept-Ranges: bytes
      Content-Length: 200934340
      Content-Type: application/x-tar
      Content-Encoding: x-gzip
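
The problematic combination in the response above can also be flagged mechanically. The following is a rough sketch, not a tool used by INFRA or the mirror: the function name `mirror_strips_gzip` and the suffix list are assumptions, and the header values are copied from the curl output.

```python
GZIP_ENCODINGS = {"gzip", "x-gzip"}
GZIP_SUFFIXES = (".gz", ".tgz")  # assumed list of already-compressed suffixes


def mirror_strips_gzip(url: str, headers: dict) -> bool:
    """True if the file name promises gzip but the server also declares
    gzip as a Content-Encoding, so a decoding client would save a plain tar."""
    normalized = {k.lower(): v.strip().lower() for k, v in headers.items()}
    encoding = normalized.get("content-encoding", "")
    return url.lower().endswith(GZIP_SUFFIXES) and encoding in GZIP_ENCODINGS


url = "http://apache.mirrors.pair.com/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz"
headers = {
    "Content-Type": "application/x-tar",
    "Content-Encoding": "x-gzip",
}

print(mirror_strips_gzip(url, headers))  # True
```

A response for the same URL without the Content-Encoding header would return False, which is the behaviour one would expect from a mirror serving the archive as an opaque file.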

              People

              • Assignee:
                henp Henk Penning
                Reporter:
                jbrock John Brock
