DISTRIBUTE BY allows the user to control the partitioning of a data set and, combined with SORT BY, the ordering within each partition, which can be very useful for some applications.
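To make the semantics concrete, here is a minimal plain-Python sketch of what distributing by a key and then sorting locally means. It is an illustration only, not Spark code: the helper names `distribute_by` and `sort_within_partitions`, the sample rows, and the use of `zlib.crc32` as the hash are all assumptions made for this example.

```python
import zlib


def distribute_by(rows, key, num_partitions):
    """Place each row in a partition chosen by a stable hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # crc32 is used only because it is deterministic across runs;
        # it is an assumption of this sketch, not the hash Spark uses.
        index = zlib.crc32(repr(key(row)).encode("utf-8")) % num_partitions
        partitions[index].append(row)
    return partitions


def sort_within_partitions(partitions, order):
    """Sort each partition independently; no global order is implied."""
    return [sorted(p, key=order) for p in partitions]


# Hypothetical sample data: (key, value) pairs.
rows = [("b", 3), ("a", 2), ("b", 1), ("c", 5), ("a", 1)]

partitions = sort_within_partitions(
    distribute_by(rows, key=lambda r: r[0], num_partitions=2),
    order=lambda r: (r[0], r[1]),
)

for i, p in enumerate(partitions):
    print(i, p)
```

All rows that share a key end up in the same partition, and each partition is ordered on its own, so a consumer can process each key's rows as one sorted run without a global sort.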
- is related to
  - SPARK-11504 API audit for distributeBy and localSort (Resolved)
  - SPARK-4849 Pass partitioning information (distribute by) to In-memory caching (Resolved)