Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The idea is that most of the logic for calling Python has nothing to do with RDDs: it is really just communication over a socket, and there is nothing distributed about it. It currently depends on RDD only because it was originally written that way.
If we extract that functionality, we can apply it to areas of the code that don't depend on RDDs, and it also becomes easier to test.
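To illustrate the kind of separation described above, here is a minimal sketch in Python. It is not the actual Spark code: `SocketRunner`, `send_msg`, `recv_msg`, `start_uppercase_worker`, and the length-prefixed framing with a `-1` end marker are all hypothetical names and conventions chosen for the example. The point is only that the runner takes a plain iterator and talks to a worker over a socket, so it needs no distributed collection and can be tested against a local stand-in worker.

```python
import socket
import struct
import threading

END_OF_STREAM = -1  # assumed sentinel length marking the end of a stream


def _read_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("socket closed before full message arrived")
        buf += chunk
    return buf


def send_msg(conn, payload):
    # Assumed framing: 4-byte big-endian signed length, then the payload.
    conn.sendall(struct.pack(">i", len(payload)) + payload)


def recv_msg(conn):
    """Return the next payload, or None when the end marker is received."""
    (length,) = struct.unpack(">i", _read_exact(conn, 4))
    if length == END_OF_STREAM:
        return None
    return _read_exact(conn, length)


class SocketRunner:
    """Hypothetical runner: feeds any iterator of bytes to a worker over a socket."""

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def compute(self, items):
        with socket.create_connection((self.host, self.port)) as conn:
            def write_all():
                for item in items:
                    send_msg(conn, item)
                conn.sendall(struct.pack(">i", END_OF_STREAM))

            # Write from a separate thread so that reading results cannot
            # deadlock when the socket buffers fill up.
            writer = threading.Thread(target=write_all, daemon=True)
            writer.start()
            while True:
                result = recv_msg(conn)
                if result is None:
                    break
                yield result
            writer.join()


def start_uppercase_worker():
    """Stand-in worker for tests: upper-cases each message. Returns its port."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            while True:
                msg = recv_msg(conn)
                if msg is None:
                    break
                send_msg(conn, msg.upper())
            conn.sendall(struct.pack(">i", END_OF_STREAM))
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return port


if __name__ == "__main__":
    port = start_uppercase_worker()
    runner = SocketRunner("127.0.0.1", port)
    print(list(runner.compute(iter([b"a", b"bc"]))))
```

Because `compute` accepts any iterator, the same runner could in principle be driven by a partition iterator or by a plain in-memory list in a unit test, which is the testability benefit the description refers to.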
Issue Links
- is duplicated by: SPARK-10494 Multiple Python UDFs together with aggregation or sort merge join may cause OOM (failed to acquire memory) (Resolved)
- relates to: SPARK-12792 Refactor RRDD to support R UDF (Resolved)