  1. Samza
  2. SAMZA-2502

Byte array keys should be partitioned based on array contents in InMemorySystemProducer


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5
    • Component/s: None
    • Labels: None

    Description

      InMemorySystemProducer uses the hashCode of the partition key to decide which partition a message goes to. This works well when the key is an object whose hashCode method can be overridden. But when the partition key is serialized as a byte[], the message can go to any partition: a Java array does not override Object.hashCode(), so its hash code reflects the identity of the array object (commonly described as its memory address), not its contents. Therefore, two messages with the same key can be sent to different partitions once their keys are serialized into separate byte[] instances, whose hash codes are effectively arbitrary.
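The behavior described above can be reproduced outside Samza. The following is a minimal sketch, not code from the Samza repository; the class name ByteArrayKeyHashDemo, the serialize helper, and the key "user-42" are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteArrayKeyHashDemo {
    // Hypothetical stand-in for a serde: turns a string key into bytes.
    static byte[] serialize(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] first = serialize("user-42");
        byte[] second = serialize("user-42");

        // The two arrays hold the same contents.
        System.out.println(Arrays.equals(first, second)); // true

        // Arrays inherit Object.hashCode(), so the hash is identity-based.
        System.out.println(first.hashCode() == System.identityHashCode(first)); // true

        // These two values usually differ and can change from one JVM run to the next.
        System.out.println(first.hashCode() + " vs " + second.hashCode());
    }
}
```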

       

      We want to partition messages based on the contents of their partition keys. A simple fix: when the partition key is a byte array, compute its hash code with Arrays.hashCode(byte[]), which derives the hash from the array's contents.
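The proposed fix could look roughly like the sketch below. It is an illustration under assumptions, not the actual patch: ContentBasedPartitioner, keyHash, and partitionFor are hypothetical names, the key is assumed to be non-null, and the use of Math.floorMod to map the hash to a partition index is a choice made here rather than something the issue specifies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ContentBasedPartitioner {
    // Hypothetical helper: hash byte[] keys by content, everything else by hashCode().
    static int keyHash(Object partitionKey) {
        if (partitionKey instanceof byte[]) {
            return Arrays.hashCode((byte[]) partitionKey);
        }
        return partitionKey.hashCode();
    }

    // Hypothetical helper: map a key to a partition index in [0, partitionCount).
    static int partitionFor(Object partitionKey, int partitionCount) {
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(keyHash(partitionKey), partitionCount);
    }

    public static void main(String[] args) {
        byte[] first = "user-42".getBytes(StandardCharsets.UTF_8);
        byte[] second = "user-42".getBytes(StandardCharsets.UTF_8);

        // Equal contents now always land on the same partition.
        System.out.println(partitionFor(first, 4) == partitionFor(second, 4)); // true
    }
}
```

With this approach, keys that are not byte arrays keep their existing behavior, so only the serialized-key case changes.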

            People

              Assignee: Yixing Zhang (YixingZhang)
              Reporter: Yixing Zhang (YixingZhang)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m