Details
Bug
Status: Resolved
Trivial
Resolution: Won't Fix
2.3.2, 3.0.0
None
None
Description
@Override
public Writable serialize(Object o, ObjectInspector objectInspector) throws SerDeException {
  if (badSchema) {
    throw new BadSchemaException();
  }
  return getSerializer().serialize(o, objectInspector, columnNames, columnTypes, schema);
}

@Override
public Object deserialize(Writable writable) throws SerDeException {
  if (badSchema) {
    throw new BadSchemaException();
  }
  return getDeserializer().deserialize(columnNames, columnTypes, writable, schema);
}

...

private AvroDeserializer getDeserializer() {
  if (avroDeserializer == null) {
    avroDeserializer = new AvroDeserializer();
  }
  return avroDeserializer;
}

private AvroSerializer getSerializer() {
  if (avroSerializer == null) {
    avroSerializer = new AvroSerializer();
  }
  return avroSerializer;
}
The getDeserializer and getSerializer methods are not thread-safe: each does an unsynchronized null check followed by an assignment. As a result, deserialize and serialize are not thread-safe either. This probably did not matter with MapReduce, but now that we have Spark and Tez, it may be an issue.
Consider a scenario where three threads enter getSerializer at the same time. All three see that avroSerializer is null, so each creates its own instance, and they then race to assign their new object to the avroSerializer field.
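
One possible way to close this race is to make the lazy getter synchronized, so the null check and the assignment happen atomically. The following is only a minimal, self-contained sketch under stated assumptions, not a patch against Hive: LazySerializerDemo, FakeSerializer, and distinctInstances are hypothetical names invented for illustration, and FakeSerializer merely stands in for AvroSerializer. It starts several threads that call the getter at the same time and counts how many distinct instances they observed.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

public class LazySerializerDemo {
    // Hypothetical stand-in for AvroSerializer; not a Hive class.
    static final class FakeSerializer {}

    private FakeSerializer serializer;

    // synchronized makes the null check and the assignment one atomic step,
    // so at most one instance is ever created per LazySerializerDemo.
    private synchronized FakeSerializer getSerializer() {
        if (serializer == null) {
            serializer = new FakeSerializer();
        }
        return serializer;
    }

    // Starts threadCount threads that call getSerializer() concurrently and
    // returns the number of distinct instances (by identity) they observed.
    static int distinctInstances(int threadCount) {
        LazySerializerDemo holder = new LazySerializerDemo();
        Set<FakeSerializer> identitySet =
            Collections.newSetFromMap(new IdentityHashMap<FakeSerializer, Boolean>());
        Set<FakeSerializer> seen = Collections.synchronizedSet(identitySet);
        CountDownLatch start = new CountDownLatch(1);

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // hold every thread until all are started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(holder.getSerializer());
            });
            threads[i].start();
        }

        start.countDown(); // release all threads at once
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstances(8));
    }
}
```

With the synchronized keyword in place, every thread should observe the same instance. Removing it reintroduces the race described above, although the duplicate creation may not show up on every run. Note that this only addresses the lazy initialization; it says nothing about whether a single shared serializer or deserializer instance is itself safe to use from several threads.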