Spark / SPARK-32017

Make PySpark Hadoop 3.2+ variant available in PyPI


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.1.0
    • Component/s: PySpark
    • Labels: None

    Description

      The PySpark 3.0.0 package currently available on PyPI uses Hadoop 2.7.4.

      Could a variant (or the default) have its Hadoop version aligned to 3.2.0, as in the downloadable Spark binaries?

      This would make the PyPI version compatible with session token authorisation and help in accessing data that resides in object stores with stronger encryption methods.

      If not on PyPI, then please at least provide it as a tar file in the Apache download archives.
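
      To check which Hadoop version a given PySpark installation ships with, one option is to look at the jar file names in its jars directory. The following is only a minimal sketch: the helper name `bundled_hadoop_versions` is hypothetical, and it assumes the Hadoop jars follow a naming pattern such as `hadoop-common-2.7.4.jar` or `hadoop-client-api-3.2.0.jar`.

```python
import re
from pathlib import Path

# Assumption: Hadoop jars are named like "hadoop-common-2.7.4.jar" or
# "hadoop-client-api-3.2.0.jar". Other naming schemes are not handled.
_HADOOP_JAR = re.compile(
    r"^hadoop-(?:common|client(?:-[a-z]+)*)-(\d+(?:\.\d+)+)\.jar$"
)


def bundled_hadoop_versions(jars_dir):
    """Return the sorted, de-duplicated Hadoop versions found in jar names.

    Hypothetical helper for illustration; it only inspects file names and
    never opens the jars.
    """
    versions = set()
    for jar in Path(jars_dir).glob("hadoop-*.jar"):
        match = _HADOOP_JAR.match(jar.name)
        if match:
            versions.add(match.group(1))
    # Plain string sort; adequate when a single version is expected.
    return sorted(versions)
```

      Assuming the pip-installed package keeps its jars in a `jars` directory next to `pyspark/__init__.py`, the helper could be called as `bundled_hadoop_versions(Path(pyspark.__file__).parent / "jars")`. If that layout assumption holds, the installation described above would be expected to report 2.7.4.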

          People

            Assignee: Hyukjin Kwon (hyukjin.kwon)
            Reporter: George Pongracz (gpongracz)
            Votes: 1
            Watchers: 13