AVRO-685

Certain recursive schemas can prevent Schema.RecordSchema.hashCode() and .equals() from returning

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.5.0
    • Component/s: java
    • Labels: None
    • Environment: all
    • Hadoop Flags: Reviewed

      Description

      I am creating a protocol in memory by building up Schema objects, then writing the avpr file to disk and running SpecificCompiler against it to generate Java sources. My protocol file causes SpecificCompiler to hang. In the debugger, I can see a long stack trace originating from SpecificCompiler.enqueue() (see the debugger stack trace at the end of this text). What appears to be happening is that Schema.RecordSchema.hashCode() removes itself from the SEEN_HASHCODE map prematurely: schemas with circular references in multiple fields are repeatedly added to and removed from SEEN_HASHCODE, causing the code to bounce around between fields without ever unwinding to the root object.

      Attached is a patch that fixes the problem. If this patch is accepted, I'd like to request an incremental release as this is a showstopper for us. I've also attached a sample avpr file that reproduces the issue.

      Debugger stack trace referenced above:
      org.apache.avro.specific.SpecificCompiler at localhost:3273
      Thread [main] (Suspended)
      System.identityHashCode(Object) line: not available [native method]
      IdentityHashMap<K,V>.hash(Object, int) line: 284
      IdentityHashMap<K,V>.put(K, V) line: 412
      Schema$RecordSchema.hashCode() line: 601
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$ArraySchema.hashCode() line: 703
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$UnionSchema.hashCode() line: 781
      Schema$ArraySchema.hashCode() line: 703
      Schema$Field.hashCode() line: 421
      Schema$LockableArrayList<E>(AbstractList<E>).hashCode() line: 527
      Schema$RecordSchema.hashCode() line: 602
      HashMap<K,V>.getEntry(Object) line: 344
      HashMap<K,V>.containsKey(Object) line: 335
      HashSet<E>.contains(Object) line: 184
      SpecificCompiler.enqueue(Schema) line: 134
      SpecificCompiler.<init>(Protocol) line: 70
      SpecificCompiler.compileProtocol(File, File) line: 114
      SpecificCompiler.main(String[]) line: 399

      1. test.avpr (56 kB, Richard Ahrens)
      2. Schema.patch (1 kB, Richard Ahrens)
      3. AVRO-685.patch (3 kB, Doug Cutting)

        Activity

        Doug Cutting added a comment -

        Thanks for finding this!

        I can't yet see how this can happen, yet it does with the protocol you provide.

        I'm trying to create a minimal example. Does anyone have an intuition?

        Doug Cutting added a comment -

        My current theory is that this is not actually a loop but an exponential explosion.

        Doug Cutting added a comment -

        Yes, it was an exponential blowup. I've added a test that illustrates this.

        If there are no objections, I'll commit this tomorrow.
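
The difference between a loop and an exponential blowup can be reproduced outside Avro with a toy model. The sketch below is not Avro's code and not the attached patch: `Node`, `hashRemoveOnExit`, `hashClearAtRoot`, and `fullyConnected` are hypothetical names chosen for illustration. It assumes a guard that, like the behavior described in the report, removes each record from the seen map as soon as that record's hash is computed, and compares it with one possible alternative that keeps records in the map until the outermost call returns. Both terminate, but the first re-expands every record once per path that reaches it.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class RecursiveHashSketch {

    /** Hypothetical stand-in for a record schema: a name plus fields that reference other records. */
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Number of hash calls made since the last reset in countVisits. */
    static long visits;

    /** Guard A: every record removes itself from the seen map as soon as its own hash is done. */
    static int hashRemoveOnExit(Node node, Map<Node, Node> seen) {
        visits++;
        if (seen.containsKey(node)) return 0;   // already on the current path: cut the cycle
        seen.put(node, node);
        try {
            int h = node.name.hashCode();
            for (Node field : node.fields) h = 31 * h + hashRemoveOnExit(field, seen);
            return h;
        } finally {
            seen.remove(node);                  // sibling fields will expand this record again
        }
    }

    /** Guard B: records stay in the seen map until the outermost call finishes. */
    static int hashClearAtRoot(Node node, Map<Node, Node> seen) {
        visits++;
        if (seen.containsKey(node)) return 0;   // already expanded during this top-level call
        boolean outermost = seen.isEmpty();
        seen.put(node, node);
        try {
            int h = node.name.hashCode();
            for (Node field : node.fields) h = 31 * h + hashClearAtRoot(field, seen);
            return h;
        } finally {
            if (outermost) seen.clear();        // reset only when unwinding to the root
        }
    }

    /** Builds 'size' records in which every record has one field for each other record. */
    static Node fullyConnected(int size) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < size; i++) nodes.add(new Node("R" + i));
        for (Node a : nodes) {
            for (Node b : nodes) {
                if (a != b) a.fields.add(b);
            }
        }
        return nodes.get(0);
    }

    /** Hashes the root of a fresh graph and returns how many hash calls that took. */
    static long countVisits(boolean removeOnExit, int size) {
        Node root = fullyConnected(size);
        Map<Node, Node> seen = new IdentityHashMap<>();
        visits = 0;
        if (removeOnExit) {
            hashRemoveOnExit(root, seen);
        } else {
            hashClearAtRoot(root, seen);
        }
        return visits;
    }

    public static void main(String[] args) {
        for (int size = 2; size <= 8; size++) {
            System.out.println(size + " records: remove-on-exit=" + countVisits(true, size)
                + " calls, clear-at-root=" + countVisits(false, size) + " calls");
        }
    }
}
```

With four mutually referencing records, the remove-on-exit guard makes 49 hash calls while the clear-at-root guard makes 13, and the gap widens rapidly as records are added. On a large protocol this is indistinguishable from a hang even though the recursion would eventually finish.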

        Scott Carey added a comment -

        Patch looks good and passes tests. I haven't completely thought through this problem yet but the change looks safe.

        Doug Cutting added a comment -

        I just committed this. Thanks, Richard!

        Richard Ahrens added a comment -

        Terrific! Much appreciated, Doug.


          People

          • Assignee: Richard Ahrens
          • Reporter: Richard Ahrens