  1. Kafka
  2. KAFKA-2590 KIP-28: Kafka Streams Checklist
  3. KAFKA-3499

byte[] should not be used as Map key nor Set member

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0.0
    • Fix Version/s: 0.10.0.0
    • Component/s: streams

    Description

      On the JVM, an array's equals and hashCode do not incorporate the array's contents; arrays inherit the identity-based Object.equals/hashCode. This implies that collections that rely on equals/hashCode (e.g., HashMap, HashSet, and variants) treat two arrays with equal contents as distinct elements.
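      A minimal sketch of the behavior described above, using only the JDK. The class and method names (ByteArrayKeyDemo, lookupWithEqualContents, sizeAfterAddingEqualArrays) are hypothetical and exist only for illustration; they are not part of Kafka.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of Kafka.
class ByteArrayKeyDemo {

    // Inserts under one array, then looks up with a different array
    // instance that has the same contents.
    static String lookupWithEqualContents() {
        Map<byte[], String> map = new HashMap<>();
        byte[] inserted = {1, 2, 3};
        map.put(inserted, "value");

        byte[] sameContents = {1, 2, 3};
        // Arrays use identity-based equals, so this lookup misses.
        return map.get(sameContents);
    }

    // Adds two distinct arrays with equal contents to a HashSet.
    static int sizeAfterAddingEqualArrays() {
        Set<byte[]> set = new HashSet<>();
        set.add(new byte[] {1, 2, 3});
        set.add(new byte[] {1, 2, 3});
        return set.size();
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        System.out.println(a.equals(b));                  // false: identity comparison
        System.out.println(Arrays.equals(a, b));          // true: content comparison
        System.out.println(lookupWithEqualContents());    // null
        System.out.println(sizeAfterAddingEqualArrays()); // 2
    }
}
```

      The same arrays compare equal under java.util.Arrays.equals, which is the content-based comparison a key type would need to delegate to.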

      Many of the Kafka Streams internal classes currently use generic HashMaps and Sets to manage caches and invalidation status. For example, RocksDBStore.cacheDirtyKeys is a HashSet<K>, and RocksDBWindowStore constructs its elements as RocksDBStore<byte[], byte[]>, so cacheDirtyKeys becomes a HashSet<byte[]>.

      Similarly, MemoryLRUCache<K, RocksDBCacheEntry> internally holds a LinkedHashMap<K, V> (map) and a HashSet<K> (keys), and both end up holding byte[] keys. Finally, user code may use any of these provided types with byte[] keys, in which case lookups with an equal-content array silently miss.

      Keys that are byte arrays should be wrapped in a type whose equals/hashCode incorporate the array contents. java.nio.ByteBuffer is one such type that could be used, but a purpose-built immutable class would likely be a better solution.
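      One way such a purpose-built wrapper could look is sketched below. BytesKey is a hypothetical name chosen for this example and is not necessarily the class the fix introduced. The sketch assumes defensive copies are acceptable; a production version might skip them for performance. Immutability matters here because ByteBuffer's equals/hashCode depend on its remaining content, so a buffer that is read or modified after insertion can no longer be found in a hash-based collection.

```java
import java.util.Arrays;

// Hypothetical immutable wrapper for byte[] keys; the name is illustrative.
final class BytesKey {
    private final byte[] bytes;
    private final int hash; // safe to cache because the contents never change

    BytesKey(byte[] bytes) {
        // Defensive copy so later changes to the caller's array
        // cannot alter this key's equality or hash code.
        this.bytes = bytes.clone();
        this.hash = Arrays.hashCode(this.bytes);
    }

    // Returns a copy so callers cannot mutate the internal state.
    byte[] toArray() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof BytesKey)) {
            return false;
        }
        // Content-based comparison instead of identity.
        return Arrays.equals(bytes, ((BytesKey) other).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}
```

      With this wrapper, a HashMap<BytesKey, V> populated via new BytesKey(rawKey) can be queried with a second BytesKey built from a different array holding the same bytes, and the lookup succeeds.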

          People

            Assignee: Guozhang Wang (guozhang)
            Reporter: josh gruenberg (joshng)
            Votes: 0
            Watchers: 7