Details
- Documentation
- Status: Resolved
- Major
- Resolution: Fixed
Description
As reported in SPARK-3098, for example, users of zipWithIndex, zipWithUniqueId, and similar operations (and maybe even mapPartitions) find it confusing that the order of elements in each partition after a shuffle operation is nondeterministic (unless the operation was sortByKey). We should explain this in the docs for the zip and partition-wise operations.
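To illustrate the problem, here is a minimal sketch in plain Python. It does not use Spark: `shuffle_partition`, `zip_with_index`, and `stable_zip_with_index` are hypothetical helpers, and the `seed` argument is an assumed stand-in for whatever timing makes shuffle output arrive in a different order from run to run. The point is only that indices assigned by position depend on arrival order, and that sorting each partition first is one way to make them repeatable.

```python
import random


def shuffle_partition(records, num_partitions, seed):
    """Simulate a shuffle (NOT Spark): records land in a partition chosen by
    value, but their arrival order within it depends on `seed`."""
    arrived = list(records)
    random.Random(seed).shuffle(arrived)  # stand-in for nondeterministic fetch order
    partitions = [[] for _ in range(num_partitions)]
    for rec in arrived:
        partitions[rec % num_partitions].append(rec)
    return partitions


def zip_with_index(partitions):
    """Assign consecutive indices, partition by partition, in arrival order."""
    out, idx = [], 0
    for part in partitions:
        for rec in part:
            out.append((rec, idx))
            idx += 1
    return out


def stable_zip_with_index(partitions):
    """Sort each partition before indexing so the result ignores arrival order."""
    return zip_with_index([sorted(part) for part in partitions])


if __name__ == "__main__":
    for seed in (1, 2):
        parts = shuffle_partition(range(10), 2, seed)
        print("arrival order:", zip_with_index(parts))
        print("sorted first: ", stable_zip_with_index(parts))
```

In the simulation, the "sorted first" result is identical for every seed, while the "arrival order" result generally is not. In Spark itself the analogous remedy would be to impose an ordering (for example with sortByKey) before calling the zip operation.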
Another subtle issue is that the order of values for each key in groupBy, join, and similar operations can be nondeterministic; we need to explain that too.
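The per-key ordering issue can be sketched the same way, again in plain Python rather than Spark. `group_by_key` and `group_by_key_sorted` are hypothetical names, and `seed` is an assumed stand-in for nondeterministic arrival order. Code that only depends on the set of values per key is unaffected; code that reads the values positionally is not, unless the values are sorted first.

```python
import random
from collections import defaultdict


def group_by_key(pairs, seed):
    """Simulated grouping (NOT Spark): values are appended in arrival order,
    and `seed` permutes that order."""
    arrived = list(pairs)
    random.Random(seed).shuffle(arrived)
    groups = defaultdict(list)
    for key, value in arrived:
        groups[key].append(value)
    return dict(groups)


def group_by_key_sorted(pairs, seed):
    """Sort each key's values so the result no longer depends on arrival order."""
    return {key: sorted(values) for key, values in group_by_key(pairs, seed).items()}


if __name__ == "__main__":
    pairs = [("a", 3), ("a", 1), ("a", 2), ("b", 5), ("b", 4)]
    for seed in (1, 2, 3):
        print(group_by_key(pairs, seed), group_by_key_sorted(pairs, seed))
```

Order-insensitive aggregations such as a sum or a maximum over each key's values give the same answer either way; only logic like "take the first value" needs the explicit sort.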
Issue Links
- is related to: SPARK-6617 Word2Vec is nondeterministic (Resolved)
- links to