SPARK-26566

Upgrade apache/arrow to 0.12.0

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      Version 0.12.0 includes the following selected fixes/improvements relevant to Spark users:

      • Safe cast fails from numpy float64 array with NaNs to integer, ARROW-4258
      • Java: reduce heap usage for variable-width vectors, ARROW-4147
      • Binary identity cast not implemented, ARROW-4101
      • pyarrow.open_stream deprecated; use pyarrow.ipc.open_stream, ARROW-4098
      • Conversion to date object no longer needed, ARROW-3910
      • Error reading IPC file with no record batches, ARROW-3894
      • Signed to unsigned integer cast yields incorrect results when type sizes are the same, ARROW-3790
      • from_pandas gives incorrect results when converting floating point to bool, ARROW-3428
      • Importing pyarrow fails if scikit-learn is installed from conda (boost-cpp / libboost issue), ARROW-3048
      • Java: update to official Flatbuffers version 1.9.0, ARROW-3175

      complete list here
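
      The open_stream deprecation (ARROW-4098) can be handled with a small compatibility lookup. The following is a minimal sketch, not the change made in Spark: the helper name resolve_open_stream and the stand-in modules are hypothetical, and the fallback for older releases is an assumption about how one might stay compatible with versions that only expose the top-level function.

```python
from types import SimpleNamespace


def resolve_open_stream(pa):
    """Return the stream-reader factory from a pyarrow-like module.

    Prefers `pa.ipc.open_stream` (the replacement named in ARROW-4098)
    and falls back to the top-level `pa.open_stream` when it is absent.
    """
    ipc = getattr(pa, "ipc", None)
    if ipc is not None and hasattr(ipc, "open_stream"):
        return ipc.open_stream
    return pa.open_stream


# Stand-in modules (hypothetical) so the sketch runs without pyarrow installed.
old_pa = SimpleNamespace(open_stream=lambda source: ("top-level", source))
new_pa = SimpleNamespace(
    open_stream=lambda source: ("top-level", source),
    ipc=SimpleNamespace(open_stream=lambda source: ("ipc", source)),
)
```

      With a real installation, the call would look like resolve_open_stream(pyarrow)(source), which avoids the deprecation warning on 0.12.0 while still working where only the top-level function exists.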

      PySpark requires the following fixes to work with PyArrow 0.12.0:

      • Encrypted PySpark worker fails because ChunkedStream is missing the closed property
      • pyarrow now converts dates to objects by default, which causes an error because the type is assumed to be datetime64
      • ArrowTests fail because of a difference in the raised error message
      • pyarrow.open_stream is deprecated
      • Tests fail because groupby adds an index column with a duplicate name
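
      The first item concerns a custom stream wrapper that lacks the closed attribute that file-like consumers commonly check. The following is a minimal sketch of such a wrapper with the property added. It is not Spark's ChunkedStream: the class name ChunkedWriter, the length-prefixed framing, and the -1 end marker are all assumptions made for illustration.

```python
import io
import struct


class ChunkedWriter(object):
    """Hypothetical chunking stream wrapper (not Spark's implementation)."""

    def __init__(self, wrapped, buffer_size):
        self.wrapped = wrapped
        self.buffer_size = buffer_size
        self.buffer = bytearray()
        self._closed = False

    def write(self, data):
        if self._closed:
            raise ValueError("write to closed stream")
        self.buffer.extend(data)
        # Flush every full chunk, prefixed with its 4-byte big-endian length.
        while len(self.buffer) >= self.buffer_size:
            chunk = bytes(self.buffer[:self.buffer_size])
            del self.buffer[:self.buffer_size]
            self._write_chunk(chunk)
        return len(data)

    def _write_chunk(self, chunk):
        self.wrapped.write(struct.pack("!i", len(chunk)))
        self.wrapped.write(chunk)

    def close(self):
        if self._closed:
            return
        # Flush any partial chunk still held in the buffer.
        if self.buffer:
            self._write_chunk(bytes(self.buffer))
            self.buffer = bytearray()
        # A length of -1 marks the end of the stream in this sketch.
        self.wrapped.write(struct.pack("!i", -1))
        self._closed = True

    @property
    def closed(self):
        """Expose the state that file-like consumers may query."""
        return self._closed


sink = io.BytesIO()
stream = ChunkedWriter(sink, buffer_size=4)
stream.write(b"abcdef")
stream.close()
```

      Exposing closed as a read-only property keeps the wrapper compatible with code that treats it like a standard Python file object, without requiring it to inherit from the io base classes.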

       

              People

              • Assignee: Bryan Cutler (bryanc)
              • Reporter: Bryan Cutler (bryanc)