Details
- Type: Sub-task
- Status: Closed
- Priority: Major
- Resolution: Not A Problem
Description
If the input data is a byte array, we should allow TorrentBroadcast to skip serializing and compressing the input.
To do this, we should add a new parameter (shortCircuitByteArray) to TorrentBroadcast, and then skip serialization if the input is a byte array and shortCircuitByteArray is true.
We should then also do compression in task serialization itself instead of doing it in TorrentBroadcast.
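The proposal can be illustrated with a minimal Python sketch. This is not Spark's code: the function names, the `(is_raw, payload)` return shape, and the use of `pickle` and `zlib` are assumptions chosen only to show the short-circuit idea. The caller compresses the task bytes once, and the broadcast step passes them through untouched when the flag is set.

```python
import pickle
import zlib


def to_broadcast_payload(value, short_circuit_byte_array=False):
    """Hypothetical stand-in for the broadcast write path.

    Returns (is_raw, payload). When the flag is set and the value is
    already a byte array, the bytes are passed through unchanged;
    otherwise the value is serialized and compressed here.
    """
    if short_circuit_byte_array and isinstance(value, (bytes, bytearray)):
        return True, bytes(value)
    return False, zlib.compress(pickle.dumps(value))


def from_broadcast_payload(is_raw, payload):
    """Hypothetical read path: undo only the work the write path did."""
    if is_raw:
        return payload
    return pickle.loads(zlib.decompress(payload))


# Task serialization does the compression itself (assumed caller behavior),
# so the broadcast step has nothing left to serialize or compress.
task_bytes = zlib.compress(pickle.dumps({"stage": 3, "partition": 7}))
is_raw, payload = to_broadcast_payload(task_bytes, short_circuit_byte_array=True)
task = pickle.loads(zlib.decompress(from_broadcast_payload(is_raw, payload)))
```

With the flag off, the same byte array would be pickled and compressed a second time, which is the redundant work the description aims to avoid.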
Issue Links
- is related to SPARK-7448 Implement custom byte array serializer for use in PySpark shuffle (Closed)