A cached PCollection/PTable built with Avros on a SparkPipeline seems to reuse Avro objects. Here is a test that shows this behavior:
I would expect the output of this program to be a Pair with the same key and value. Instead, it produces a Pair whose key and value differ. I have tested this with a text file input source, and it works as expected. Removing cache() also produces the expected result.
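
For illustration only, here is a plain-Java sketch (no Crunch or Spark involved) of one way a mismatch like this could arise. It assumes the reader hands back the same mutable record instance for every element and the cache stores references rather than copies. That is a guess at the cause, not something confirmed here. The names ObjectReuseSketch, Rec, reusingReader, cachePairs and mismatches are all made up for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class ObjectReuseSketch {

    /** Hypothetical mutable record standing in for an Avro object. */
    static final class Rec {
        String id;

        Rec copy() {
            Rec r = new Rec();
            r.id = id;
            return r;
        }
    }

    /** Reader that refills one shared instance for every element (assumed behavior). */
    static Iterator<Rec> reusingReader(final String[] ids) {
        final Rec shared = new Rec();
        return new Iterator<Rec>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < ids.length;
            }

            @Override
            public Rec next() {
                shared.id = ids[i++];
                return shared;
            }
        };
    }

    /** Builds (id, record) pairs and keeps them in memory, optionally copying each record. */
    static List<Map.Entry<String, Rec>> cachePairs(Iterator<Rec> it, boolean copy) {
        List<Map.Entry<String, Rec>> cached = new ArrayList<>();
        while (it.hasNext()) {
            Rec r = it.next();
            // The key is an immutable String; the value may be the shared instance.
            cached.add(new SimpleEntry<>(r.id, copy ? r.copy() : r));
        }
        return cached;
    }

    /** Counts pairs whose key no longer matches the id held by the value. */
    static int mismatches(List<Map.Entry<String, Rec>> pairs) {
        int n = 0;
        for (Map.Entry<String, Rec> p : pairs) {
            if (!p.getKey().equals(p.getValue().id)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String[] ids = {"a", "b", "c"};
        System.out.println("without copy: " + mismatches(cachePairs(reusingReader(ids), false)));
        System.out.println("with copy:    " + mismatches(cachePairs(reusingReader(ids), true)));
    }
}
```

In this sketch, caching without a copy leaves every cached value pointing at the last record read, so earlier keys stop matching their values. Copying each record before it is stored avoids that. Whether the same thing happens inside SparkPipeline's cache() is exactly what this report is asking about.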