Hive / HIVE-18956

AvroSerDe Race Condition

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Won't Fix
    • Affects Version/s: 2.3.2, 3.0.0
    • None
    • None

    Description

        @Override
        public Writable serialize(Object o, ObjectInspector objectInspector) throws SerDeException {
          if(badSchema) {
            throw new BadSchemaException();
          }
          return getSerializer().serialize(o, objectInspector, columnNames, columnTypes, schema);
        }
      
        @Override
        public Object deserialize(Writable writable) throws SerDeException {
          if(badSchema) {
            throw new BadSchemaException();
          }
          return getDeserializer().deserialize(columnNames, columnTypes, writable, schema);
        }
      
      ...
      
        private AvroDeserializer getDeserializer() {
          if(avroDeserializer == null) {
            avroDeserializer = new AvroDeserializer();
          }
      
          return avroDeserializer;
        }
      
        private AvroSerializer getSerializer() {
          if(avroSerializer == null) {
            avroSerializer = new AvroSerializer();
          }
      
          return avroSerializer;
        }
      

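      The getDeserializer and getSerializer methods lazily initialize their fields without any synchronization, so they are not thread safe, and therefore neither are the deserialize and serialize methods. This probably did not matter with MapReduce, but it may be an issue now that Hive also runs on Spark and Tez.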

      For example, three threads could enter getSerializer at the same time, each see that avroSerializer is null, and each create its own instance. The threads would then race to assign their new object to the avroSerializer field.
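
      The scenario above can be reproduced deterministically with a small sketch. The code below is not taken from Hive: FakeSerializer and LazyInitRaceSketch are hypothetical stand-ins, and the CyclicBarrier is test scaffolding that holds each thread just after the null check so that all three are guaranteed to have seen null before any of them assigns. The synchronized getter is shown only as one possible remedy for comparison; it is an assumption for illustration, not a change made in Hive (this issue was resolved as Won't Fix).

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for AvroSerializer; it only counts constructions.
class FakeSerializer {
  FakeSerializer(AtomicInteger created) {
    created.incrementAndGet();
  }
}

class LazyInitRaceSketch {
  private final AtomicInteger created = new AtomicInteger();
  private FakeSerializer unsafeSerializer;
  private FakeSerializer safeSerializer;

  // Same check-then-act shape as getSerializer() in the description.
  // The barrier (scaffolding, not in the original) parks every thread after
  // the null check, forcing the interleaving where all of them saw null.
  FakeSerializer getUnsafe(CyclicBarrier afterNullCheck) throws Exception {
    if (unsafeSerializer == null) {
      afterNullCheck.await();
      unsafeSerializer = new FakeSerializer(created);
    }
    return unsafeSerializer;
  }

  // One possible remedy: the check and the assignment run under one lock,
  // so only the first caller constructs an instance.
  synchronized FakeSerializer getSafe() {
    if (safeSerializer == null) {
      safeSerializer = new FakeSerializer(created);
    }
    return safeSerializer;
  }

  // Runs `threads` concurrent callers and reports how many instances were built.
  static int instancesCreated(boolean useSafeGetter, int threads) {
    LazyInitRaceSketch serde = new LazyInitRaceSketch();
    CyclicBarrier barrier = new CyclicBarrier(threads);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          if (useSafeGetter) {
            serde.getSafe();
          } else {
            serde.getUnsafe(barrier);
          }
        } catch (Exception e) {
          throw new IllegalStateException(e);
        }
      });
      workers[i].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return serde.created.get();
  }

  public static void main(String[] args) {
    System.out.println("unsafe getter created: " + instancesCreated(false, 3));      // 3
    System.out.println("synchronized getter created: " + instancesCreated(true, 3)); // 1
  }
}
```

      With the forced interleaving, the unsynchronized getter builds three instances, while the synchronized getter builds one. In real code the interleaving is not forced, so the duplicate construction would only show up intermittently.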

          People

            Assignee: Unassigned
            Reporter: David Mollitor (belugabehr)