Uploaded image for project: 'Avro'
  1. Avro
  2. AVRO-930

Memory leak in resolved-writer.c in Avro C

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.6.0
    • Component/s: c
    • Labels:
      None
    • Environment:

      All.

      Description

      I was able to isolate a memory leak in the new schema resolver
      (resolved-writer.c) in Avro C on the avro-trunk (v1.6.0), in which the
      code attempts to access and free memory that has already been freed.

      There is a simple one-line fix to this issue, using the reference
      counting mechanism provided in Avro, attached in the patch file.

      This patch is different from the one I originally sent to the Avro Dev
      mailing list, and I think my new patch is better.

      Longer Description:

      I created a small test program to test schema resolution in C. The
      program contains a writer schema with two integer elements and a reader
      schema with three integer elements. Since the reader schema has elements
      that the writer doesn't contain, and default values are not yet
      supported in C, the function try_record() goes into the "error" section
      (by design) and tries to clean up after itself. This cleanup process has
      a memory bug in it.

      Specifically, if the reader schema contains two elements of the same
      type (in my case two ints), try_record() uses the same resolver for
      both elements. This common resolver is used because the function
      avro_resolved_writer_new_memoized() tests to see if the schema's of
      the two elements are the same, and if they are matched, it returns the
      resolver, and does not create a new resolver. It also does not
      increment the reference count of this existing resolver. So, multiple
      elements of the array of field_resolvers[] can contain pointers to the
      same resolver, and these reference are not accounted for in the
      reference count.

      During the cleanup process, there is a loop that checks if the pointer
      field_resolvers[i] is non-NULL, and tries to free all non-NULL
      resolvers. Since we have two (or more) elements of the
      field_resolvers[] array pointing to the same resolver, and the
      reference count of the resolver doesn't count these additional
      references, the code tries to call avro_value_iface_decref() on an
      already freed resolver.

      This can be simply fixed as follows: Increment the reference count
      each time you assign a resolver to the field_resolvers[] array, even
      when you are re-assigning the same resolver to a new element in the
      array.

      I verified this patch using valgrind. Multiple valgrind errors were
      seen in try_record() and avro_refcount_dec() before applying the
      patch, and these errors went away after applying the patch.

      In an earlier email to the avro dev mailing list, I provided a patch
      which didn't use the reference count mechanism. Instead it checked the
      field_resolvers[] array for duplicate entries, and set the duplicates
      to NULL. This patch used more code, but had a more localized impact –
      in that it was exercised only when an error condition was
      encountered. I can provide this patch if anyone wants it.

        Attachments

        1. avro-930-test.c
          3 kB
          Vivek Nadkarni
        2. AVRO-930.patch
          0.4 kB
          Vivek Nadkarni

          Activity

            People

            • Assignee:
              dcreager Douglas Creager
              Reporter:
              vnadkarni Vivek Nadkarni
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 24h
                24h
                Remaining:
                Remaining Estimate - 24h
                24h
                Logged:
                Time Spent - Not Specified
                Not Specified