Description
Update PySpark's version of Cloudpickle to match version 0.4.3. The reasons for doing this are:
- Pick up bug fixes and improvements from the newer version
- Match a specific upstream version as closely as possible (Spark has additional changes that might be necessary) to make future upgrades easier
There are newer versions of Cloudpickle that can fix bugs with NamedTuple pickling (that Spark currently has workarounds for), but these include other changes that need some verification before bringing them into Spark. Upgrading to 0.4.3 first will make this verification easier.
Discussion on the mailing list: http://apache-spark-developers-list.1001551.n3.nabble.com/Thoughts-on-Cloudpickle-Update-td23188.html
Upgrading to the recent v0.4.3 release will bring in the following changes:
- Fix pickling of named tuples https://github.com/cloudpipe/cloudpickle/pull/113
- Built-in type constructors for PyPy compatibility https://github.com/cloudpipe/cloudpickle/commit/d84980ccaafc7982a50d4e04064011f401f17d1b
- Fix memoryview support https://github.com/cloudpipe/cloudpickle/pull/122
- Improved compatibility with other cloudpickle versions https://github.com/cloudpipe/cloudpickle/pull/128
- Several cleanups https://github.com/cloudpipe/cloudpickle/pull/121 and https://github.com/cloudpipe/cloudpickle/commit/c91aaf110441991307f5097f950764079d0f9652
- [MRG] Regression on pickling classes from the __main__ module https://github.com/cloudpipe/cloudpickle/pull/149
- BUG: Handle instance methods of builtin types https://github.com/cloudpipe/cloudpickle/pull/154
- Fix #129 : do not silence RuntimeError in dump() https://github.com/cloudpipe/cloudpickle/pull/153
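Several of the items above (named tuples, instance methods of builtin types) can be spot-checked with a small round-trip smoke test. The following is only a sketch, not part of the proposed change: the helper names `round_trips` and `smoke_test` are hypothetical, and it assumes the standalone `cloudpickle` package rather than the copy shipped inside PySpark, which may behave differently because of Spark's additional changes.

```python
import pickle
from collections import namedtuple

try:
    import cloudpickle  # standalone package (assumption); PySpark's copy may differ
except ImportError:  # keep the sketch runnable when cloudpickle is not installed
    cloudpickle = None

# Module-level named tuple, one of the cases mentioned in the list above.
Point = namedtuple("Point", ["x", "y"])


def round_trips(dumps, obj, same=lambda a, b: a == b):
    """Return True if obj survives dumps() followed by pickle.loads()."""
    try:
        restored = pickle.loads(dumps(obj))
    except Exception:
        # Any serialization or deserialization failure counts as "not supported".
        return False
    return same(obj, restored)


def smoke_test(dumps):
    """Run a few illustrative checks and return {check name: passed}."""
    return {
        "namedtuple": round_trips(dumps, Point(1, 2)),
        # Functions are compared by behavior, since identity is not preserved.
        "lambda": round_trips(dumps, lambda v: v + 1,
                              same=lambda f, g: f(1) == g(1)),
        "builtin_method": round_trips(dumps, "a,b".split,
                                      same=lambda f, g: f(",") == g(",")),
    }


if __name__ == "__main__":
    print("pickle:", smoke_test(pickle.dumps))
    if cloudpickle is not None:
        print("cloudpickle:", smoke_test(cloudpickle.dumps))
```

The standard `pickle` module is included only as a baseline: it cannot serialize lambdas, so that check reports a failure for it. Running the same checks against the Cloudpickle copy before and after the upgrade would give a rough signal about behavior changes, though it is no substitute for Spark's own test suite.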
Issue Links
- contains: SPARK-22809 pyspark is sensitive to imports with dots (Resolved)
- is related to: SPARK-22809 pyspark is sensitive to imports with dots (Resolved)