Affects Version/s: 0.9.1
Fix Version/s: None
Samza's KafkaSystemProducer class computes the target partition from the partition key using:
abs(envelope.getPartitionKey.hashCode()) % numPartitions
However, Kafka's default producer computes the partition from the serialized key this way:
Utils.abs(Utils.murmur2(record.key())) % numPartitions
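To make the mismatch concrete, here is a minimal sketch in plain Java that applies both formulas to the same keys. It has no Samza or Kafka dependency: `murmur2` is a transcription of the 32-bit MurmurHash2 routine from Kafka's `Utils` class and `kafkaAbs` mirrors `Utils.abs`, both reproduced here as assumptions that should be checked against the Kafka version in use. The class name, the String keys, and the UTF-8 key serialization are also hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class PartitionMismatch {
    // Samza-style: hash the key object itself with hashCode().
    public static int samzaPartition(Object key, int numPartitions) {
        return Math.abs(key.hashCode()) % numPartitions;
    }

    // Assumed to mirror Kafka's Utils.abs: maps Integer.MIN_VALUE to 0.
    public static int kafkaAbs(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    // Kafka-style: hash the serialized key bytes with murmur2.
    public static int kafkaPartition(byte[] keyBytes, int numPartitions) {
        return kafkaAbs(murmur2(keyBytes)) % numPartitions;
    }

    // Transcription of the murmur2 routine in Kafka's Utils (an assumption; verify).
    public static int murmur2(byte[] data) {
        int length = data.length;
        int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;
        int h = seed ^ length;
        int length4 = length / 4;
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff) + ((data[i4 + 1] & 0xff) << 8)
                  + ((data[i4 + 2] & 0xff) << 16) + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }
        // Mix in the trailing 1-3 bytes; the cases fall through on purpose.
        switch (length % 4) {
            case 3:
                h ^= (data[(length & ~3) + 2] & 0xff) << 16;
            case 2:
                h ^= (data[(length & ~3) + 1] & 0xff) << 8;
            case 1:
                h ^= data[length & ~3] & 0xff;
                h *= m;
        }
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Counts keys "user-0".."user-(numKeys-1)" that the two formulas place differently.
    public static int countMismatches(int numKeys, int numPartitions) {
        int mismatches = 0;
        for (int i = 0; i < numKeys; i++) {
            String key = "user-" + i;
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            if (samzaPartition(key, numPartitions) != kafkaPartition(bytes, numPartitions)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int numPartitions = 8;
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            System.out.printf("%s: samza=%d kafka=%d%n", key,
                samzaPartition(key, numPartitions), kafkaPartition(bytes, numPartitions));
        }
        System.out.println("mismatches in 1000 keys: " + countMismatches(1000, numPartitions));
    }
}
```

Because the two hash functions are unrelated, most keys should land on different partition numbers, which is why a partition-local join across the two topics does not line up.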
This makes it difficult for me to join two data sources on a common key when one source is produced by Samza and the other by a default Kafka producer, because the same key can end up in different partitions of the two topics.
As a workaround, I have to modify the upstream job (the one that uses the default Kafka producer) to write with an explicit partition computed with Samza's hashing logic.
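A minimal sketch of that workaround, assuming the upstream job can hash the same key object that Samza would see: the helper below reproduces the Samza formula quoted above, and its result would be passed as the explicit partition when building the record (for example through the `ProducerRecord(topic, partition, key, value)` constructor). The class and method names are hypothetical.

```java
public class SamzaCompatiblePartitioner {
    // Mirrors abs(partitionKey.hashCode()) % numPartitions from the Samza side.
    // Note: Math.abs(Integer.MIN_VALUE) is still negative, so a key with that
    // exact hash code would yield a negative result, just as the formula does.
    public static int partitionFor(Object partitionKey, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return Math.abs(partitionKey.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        // Hypothetical key and partition count, for illustration only.
        System.out.println(partitionFor("user-1", 8));
    }
}
```

This only lines up with Samza if both sides hash an equal object of the same type, since `hashCode()` is taken on the key object rather than on its serialized bytes.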