SPARK-32017

Make PySpark Hadoop 3.2+ variant available in PyPI

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.1.0
    • Component/s: PySpark
    • Labels: None

      Description

      The version of PySpark 3.0.0 currently available in PyPI uses Hadoop 2.7.4.

      Could a variant (or the default) have its version of Hadoop aligned to 3.2.0, as in the downloadable Spark binaries?

      This would enable the PyPI version to be compatible with session token authorisations and assist in accessing data residing in object stores with stronger encryption methods.

      If not in PyPI, then please at least provide it as a tar file in the Apache download archives.
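To see which Hadoop version a given PySpark install actually bundles, one option is to read it off the jar file names. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the pip-installed package keeps its jars in a `jars` directory next to `pyspark/__init__.py`, with a `hadoop-common-<version>.jar` or `hadoop-client-api-<version>.jar` among them.

```python
import re
from importlib.util import find_spec
from pathlib import Path
from typing import Iterable, Optional

# Matches names like "hadoop-common-2.7.4.jar" or "hadoop-client-api-3.3.1.jar".
# Which of these jars is present depends on the Spark build (assumption).
_HADOOP_JAR = re.compile(r"^hadoop-(?:common|client-api|client)-(\d+(?:\.\d+)+)\.jar$")


def hadoop_version_from_jars(jar_names: Iterable[str]) -> Optional[str]:
    """Return the Hadoop version encoded in the first matching jar name, or None."""
    for name in sorted(jar_names):
        match = _HADOOP_JAR.match(name)
        if match:
            return match.group(1)
    return None


def bundled_hadoop_version() -> Optional[str]:
    """Inspect the jars directory of a pip-installed pyspark, if there is one."""
    spec = find_spec("pyspark")
    if spec is None or spec.origin is None:
        return None  # pyspark is not installed in this environment
    jars_dir = Path(spec.origin).parent / "jars"  # assumed layout
    if not jars_dir.is_dir():
        return None
    return hadoop_version_from_jars(p.name for p in jars_dir.iterdir())
```

Calling `bundled_hadoop_version()` in an environment with PySpark installed should return a version string if the layout assumption holds, and `None` otherwise.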

            People

            • Assignee: Hyukjin Kwon (hyukjin.kwon)
            • Reporter: George Pongracz (gpongracz)
            • Votes: 1
            • Watchers: 13

              Dates

              • Created:
                Updated:
                Resolved: