Apache Arrow / ARROW-1654

[Python] pa.DataType cannot be pickled

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: Python

    Description

      In [26]: t
      Out[26]: DataType(int64)

      In [25]: pickle.dumps(t)
      ---------------------------------------------------------------------------
      TypeError Traceback (most recent call last)
      <ipython-input-25-f90063f6658b> in <module>()
      ----> 1 pickle.dumps(t)

      /home/icexelloss/miniconda3/envs/spark-dev/lib/python3.5/site-packages/pyarrow/lib.cpython-35m-x86_64-linux-gnu.so in pyarrow.lib.DataType.__reduce_cython__()

      TypeError: no default __reduce__ due to non-trivial __cinit__

      This was discovered when trying to send a pa.DataType along with a UDF in PySpark. The current workaround is to send a PySpark DataType instead and convert it to a pa.DataType on the receiving side. It would be nice to be able to pickle pa.DataType directly.
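
For illustration only, here is a minimal sketch of the general mechanism that makes such an object picklable: defining `__reduce__` so that pickle rebuilds the object from plain arguments. `FakeDataType` and `_rebuild_type` are hypothetical stand-ins, not pyarrow names, and this is not necessarily how the fix was implemented in Arrow. A pure-Python class is already picklable by default; the explicit `__reduce__` here mirrors what an extension type with a non-trivial `__cinit__` would need to provide.

```python
import pickle


def _rebuild_type(name):
    # Hypothetical module-level factory that pickle calls on load.
    return FakeDataType(name)


class FakeDataType:
    """Hypothetical stand-in for an extension type such as pa.DataType."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, FakeDataType) and self.name == other.name

    def __hash__(self):
        return hash(self.name)

    def __repr__(self):
        return "FakeDataType({})".format(self.name)

    def __reduce__(self):
        # Describe the object as "call _rebuild_type(name)", using only
        # plain picklable arguments, so no internal state is pickled.
        return _rebuild_type, (self.name,)


t = FakeDataType("int64")
restored = pickle.loads(pickle.dumps(t))
print(restored == t)
```

The round trip produces a new, equal object rather than the same instance, which is what a UDF shipped to another process would receive.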

            People

              Assignee: Wes McKinney (wesm)
              Reporter: Li Jin (icexelloss)
              Votes: 0
              Watchers: 5
