  Spark / SPARK-17808

BinaryType fails in Python 3 due to outdated Pyrolite

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.1
    • Fix Version/s: 2.0.2, 2.1.0
    • Component/s: PySpark
    • Labels: None
    • Environment:

      spark-2.0.1-bin-hadoop2.7 with Python 3.4.3 on Ubuntu 14.04.4 LTS

      Description

      Creating a DataFrame with a BinaryType field fails under Python 3 because the Pyrolite library that Spark uses is out of date. Spark appears to be using Pyrolite 4.9; this issue was fixed in Pyrolite 4.12. See the original bug report and patch.

      Test case and output are attached. I'm just a Python guy and not really sure how to build Spark or do the classpath magic needed to test whether this works correctly with an updated Pyrolite.
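
      For readers without access to the attachments, here is a minimal sketch of the kind of reproduction described above. It is not the attached demo.py. It assumes PySpark 2.x is installed, and the names build_rows, run_demo, and the "payload" column are hypothetical, chosen only for illustration.

```python
def build_rows():
    """Return sample rows, each holding one raw binary payload (made-up data)."""
    return [(bytearray(b"\x00\x01\x02"),), (bytearray(b"spark"),)]


def run_demo():
    """Build a one-column BinaryType DataFrame and collect it back to the driver."""
    # PySpark is imported lazily so build_rows() works without it installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import BinaryType, StructField, StructType

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("binarytype-demo")  # hypothetical application name
        .getOrCreate()
    )
    try:
        # Single nullable binary column; "payload" is an illustrative name.
        schema = StructType([StructField("payload", BinaryType(), True)])
        df = spark.createDataFrame(build_rows(), schema)
        return df.collect()
    finally:
        # Always release the local Spark session, even if the job fails.
        spark.stop()
```

      Calling run_demo() under Python 3 on an affected build (2.0.1) would be expected to raise an error, going by this report; on a build with the fix (2.0.2 or 2.1.0) it should return the two rows.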

        Attachments

        1. demo.py
          0.4 kB
          Pete Fein
        2. demo_output.txt
          18 kB
          Pete Fein


            People

            • Assignee:
              bryanc Bryan Cutler
              Reporter:
              pfein Pete Fein
            • Votes:
              0
              Watchers:
              3
