Details
-
Bug
-
Status: Closed
-
Blocker
-
Resolution: Fixed
-
1.1.0
-
None
Description
The correction of Flink's murmur hash with commit [1], breaks Flink's backwards compatibility with respect to savepoints. The reason is that the changed murmur hash which is used to partition elements in a KeyedStream changes the mapping from keys to sub tasks. This changes the assigned key spaces for a sub task. Consequently, an old savepoint (version 1.0) assigns states with a different key space to the sub tasks.
I think that this must be fixed for the upcoming 1.1 release. I see two options to solve the problem:
- revert the changes, but then we don't know how the flawed murmur hash performs
- develop tooling to repartition state of old savepoints. This is probably not trivial since a keyed stream can also contain non-partitioned state which is not partitionable in all cases. And even if only partitioned state is used, we would need some kind of special operator which can repartition the state wrt the key.
I think that the latter option requires some more thoughts and is thus unlikely to be done before the release 1.1. Therefore, as a workaround, I think that we should revert the murmur hash changes.
[1] https://github.com/apache/flink/commit/641a0d436c9b7a34ff33ceb370cf29962cac4dee
Attachments
Issue Links
- links to