  1. Kafka
  2. KAFKA-2590 KIP-28: Kafka Streams Checklist
  3. KAFKA-3499

byte[] should not be used as Map key nor Set member

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0.0
    • Fix Version/s: 0.10.0.0
    • Component/s: streams
    • Labels:

      Description

      On the JVM, an array's equals and hashCode methods do not incorporate the array contents; they are inherited from Object and are therefore identity-based. This means that collections that rely on equals/hashCode (e.g., HashMap, HashSet, and their variants) treat two arrays with equal contents as distinct elements.
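      The behavior described above can be reproduced with a small, self-contained sketch. The class and method names (ArrayKeyDemo, lookupByEqualContents, distinctElementCount) are hypothetical and exist only for illustration; they are not part of Kafka.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of Kafka.
class ArrayKeyDemo {

    // Stores a value under one array, then looks it up with a different
    // array instance that has the same contents.
    static boolean lookupByEqualContents() {
        Map<byte[], String> map = new HashMap<>();
        byte[] first = {1, 2, 3};
        byte[] second = {1, 2, 3};
        map.put(first, "value");
        return map.containsKey(second);
    }

    // Adds two equal-content arrays to a HashSet and reports its size.
    static int distinctElementCount() {
        Set<byte[]> set = new HashSet<>();
        set.add(new byte[] {1, 2, 3});
        set.add(new byte[] {1, 2, 3});
        return set.size();
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        System.out.println(a.equals(b));             // false: identity comparison
        System.out.println(Arrays.equals(a, b));     // true: content comparison
        System.out.println(lookupByEqualContents()); // false: the lookup misses
        System.out.println(distinctElementCount());  // 2: both arrays are kept
    }
}
```

      Note that java.util.Arrays.equals and Arrays.hashCode do compare contents, but hash-based collections never call them; they call the instance methods on the key itself.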

      Many of the Kafka Streams internal classes currently use generic HashMaps and Sets to manage caches and invalidation status. For example, RocksDBStore.cacheDirtyKeys is a HashSet<K>, and RocksDBWindowStore constructs its elements as RocksDBStore<byte[], byte[]>, so that set ends up holding byte[] keys.

      Similarly, MemoryLRUCache<K, RocksDBCacheEntry> internally holds a LinkedHashMap<K,V> (map) and a HashSet<K> (keys), and these also end up holding byte[] keys. Finally, user code may attempt to use any of these provided types with byte[] and hit the same problem: entries that should match by content are treated as distinct.

      Keys that are byte arrays should be wrapped in a type that incorporates the array contents in its computation of equals/hashCode. java.nio.ByteBuffer is one such type that could be used, but a purpose-built immutable class would likely be a better solution.
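      A minimal sketch of both options follows, assuming a hypothetical wrapper named BytesKey. It illustrates the idea only and is not necessarily the class adopted in the actual fix. The wrapper copies the array on the way in and out so that its equals/hashCode cannot change after construction; ByteBuffer.wrap also gives content-based equality, but a ByteBuffer's equality depends on its mutable position and limit and on the shared backing array.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical immutable wrapper; the name BytesKey is an assumption.
final class BytesKey {
    private final byte[] bytes;
    private final int hash;

    BytesKey(byte[] bytes) {
        this.bytes = bytes.clone();              // defensive copy keeps the key immutable
        this.hash = Arrays.hashCode(this.bytes); // content-based hash, computed once
    }

    byte[] get() {
        return bytes.clone();                    // never expose the internal array
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof BytesKey)) return false;
        return Arrays.equals(bytes, ((BytesKey) other).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

// Hypothetical demo class exercising both approaches.
class BytesKeyDemo {

    // Two wrappers around equal-content arrays collapse to one set element.
    static int distinctWrappedCount() {
        Set<BytesKey> set = new HashSet<>();
        set.add(new BytesKey(new byte[] {1, 2, 3}));
        set.add(new BytesKey(new byte[] {1, 2, 3}));
        return set.size();
    }

    // ByteBuffer.wrap compares the remaining bytes, so this also matches.
    static boolean byteBufferLookupSucceeds() {
        Set<ByteBuffer> set = new HashSet<>();
        set.add(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        return set.contains(ByteBuffer.wrap(new byte[] {1, 2, 3}));
    }

    public static void main(String[] args) {
        System.out.println(distinctWrappedCount());
        System.out.println(byteBufferLookupSucceeds());
    }
}
```

      The defensive copies cost an allocation per key; a variant that trusts callers not to mutate the array could skip them, at the price of weaker immutability guarantees.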

            People

            • Assignee:
              guozhang Guozhang Wang
              Reporter:
              joshng josh gruenberg
            • Votes:
              0
              Watchers:
              7
