  1. Flink
  2. FLINK-11141

Key generation for RocksDBMapState can theoretically be ambiguous

Details

    Description

      RocksDBMapState stores values in RocksDB under a composite key built from the serialized bytes of key-group-id|key|namespace|user-key. In this composition, key, namespace, and user-key can each have either a fixed-size or a variable-size serialization format. When at least two of them use variable-size formats, the concatenation can be ambiguous, because different splits can yield the same byte sequence, e.g.:

      abcd <-> efg
      abc <-> defg
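
      The two splits above can be reproduced with a minimal sketch. It assumes the parts are written as raw UTF-8 bytes with no length information; `CompositeKeyCollision` and `concat` are hypothetical names for illustration and are not Flink's serializers or key builder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CompositeKeyCollision {

    // Hypothetical helper: concatenates the raw UTF-8 bytes of each part
    // without any length information, like an unprefixed composite key.
    static byte[] concat(String... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : parts) {
            byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] first = concat("abcd", "efg");  // part boundary after "abcd"
        byte[] second = concat("abc", "defg"); // part boundary after "abc"
        // Both arrays hold the bytes of "abcdefg", so the keys collide.
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

      Once the bytes are concatenated, nothing records where one part ends and the next begins, which is why two variable-size parts are enough to cause a collision.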

      Our code takes care of this for all other states, where the composite key consists only of key and namespace, by checking whether both serializers are variable-size and, if so, appending the serialized length to each byte sequence.

      However, for map state the user-key is included neither in the check for potential ambiguity nor in the appending of the length. This means that, in theory, some combinations can produce colliding composite keys in RocksDB. The fix is to include the user-key serializer in the check and to append the length there as well.
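
      A rough sketch of the proposed check, extended to three parts, might look as follows. This is not Flink's actual implementation: `LengthPrefixedCompositeKey`, `Part`, and `build` are hypothetical names, the 2-byte key-group prefix and the 4-byte length field are assumptions, and each part is modelled as raw bytes plus a flag saying whether its serializer is fixed-size.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LengthPrefixedCompositeKey {

    // Hypothetical stand-in for "serialized bytes + serializer kind".
    static final class Part {
        final byte[] bytes;
        final boolean fixedLength;

        Part(byte[] bytes, boolean fixedLength) {
            this.bytes = bytes;
            this.fixedLength = fixedLength;
        }

        static Part variable(String s) {
            return new Part(s.getBytes(StandardCharsets.UTF_8), false);
        }
    }

    // Builds key-group-id|part|part|..., appending each part's length
    // when two or more parts (user-key included) are variable-size.
    static byte[] build(int keyGroup, Part... parts) {
        int variableCount = 0;
        for (Part p : parts) {
            if (!p.fixedLength) {
                variableCount++;
            }
        }
        boolean ambiguous = variableCount >= 2;

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(keyGroup); // assumed 2-byte key-group prefix
            for (Part p : parts) {
                out.write(p.bytes);
                if (ambiguous) {
                    out.writeInt(p.bytes.length); // assumed 4-byte length
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Same namespace, but key and user-key split differently.
        byte[] first = build(1, Part.variable("abcd"), Part.variable("ns"), Part.variable("efg"));
        byte[] second = build(1, Part.variable("abc"), Part.variable("ns"), Part.variable("defg"));
        System.out.println(Arrays.equals(first, second)); // prints "false"
    }
}
```

      The point of the sketch is only that the ambiguity check counts all three parts, so a variable-size user-key next to a variable-size key or namespace also triggers the length suffix.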

      Please note that this cannot simply be changed, because it affects backwards compatibility and requires some form of migration of the state keys on restore.

            People

              Assignee: Unassigned
              Reporter: Stefan Richter (srichter)
              Votes: 0
              Watchers: 10