AVRO-607: SpecificData.getSchema not thread-safe

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.3.3
    • Fix Version/s: None
    • Component/s: java
    • Labels: None

      Description

      SpecificData.getSchema uses a WeakHashMap to cache schemas, but WeakHashMap is not thread-safe, and the method itself is not synchronized. Seems like this could lead to the data structure getting corrupted.
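
      For illustration, here is a minimal sketch of the unsafe pattern being described. Only SpecificData.getSchema and the WeakHashMap cache come from the report; the class name, field name, and the deriveSchema helper are assumptions added for the example.

        import java.lang.reflect.Type;
        import java.util.Map;
        import java.util.WeakHashMap;
        import org.apache.avro.Schema;

        public abstract class UnsafeSchemaCacheSketch {
          // Shared, unsynchronized WeakHashMap: concurrent get/put calls from
          // several threads can corrupt its internal hash chains.
          private final Map<Type, Schema> schemaCache = new WeakHashMap<Type, Schema>();

          public Schema getSchema(Type type) {
            Schema schema = schemaCache.get(type);   // unsynchronized read
            if (schema == null) {
              schema = deriveSchema(type);           // hypothetical stand-in for schema induction
              schemaCache.put(type, schema);         // unsynchronized write
            }
            return schema;
          }

          protected abstract Schema deriveSchema(Type type);
        }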

      Attachments

      • AVRO-607.patch (0.9 kB, Doug Cutting)

          Activity

          Ryan Blue added a comment -

          Gwen Shapira or Tom White, do you have time to take a look?

          Ryan Blue added a comment -

          Dave Latham's comment is a good excuse to start this thread again. From the discussion above, it sounds like building the WeakConcurrentMap is a problem because its equals method doesn't override the default and two WeakReferences to the same object aren't equal. But, WeakIdentityHashMap creates a subclass of WeakReference that overrides equals to avoid that problem and it would be appropriate to use an IdentityHashMap for this cache. So I created a version of the WeakIdentityHashMap that is backed by a ConcurrentHashMap instead of a regular HashMap. Even if the GC removes the weakly-referenced object while the ConcurrentHashMap is doing some operation, it either removes all weak references or none, so key equality is preserved.
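
          A sketch of the kind of reference class described here, following this comment rather than quoting the actual patch (details beyond the IdentityWeakReference name used in the snippet below are assumptions):

            import java.lang.ref.ReferenceQueue;
            import java.lang.ref.WeakReference;

            // A WeakReference whose equals/hashCode use the referent's identity, so two
            // live references to the same object are equal keys in the backing map.
            final class IdentityWeakReference<T> extends WeakReference<T> {
              private final int hash;  // cache the identity hash so it survives GC of the referent

              IdentityWeakReference(T referent, ReferenceQueue<? super T> queue) {
                super(referent, queue);
                this.hash = System.identityHashCode(referent);
              }

              @Override
              public int hashCode() {
                return hash;
              }

              @Override
              public boolean equals(Object other) {
                if (this == other) return true;                        // same reference object
                if (!(other instanceof IdentityWeakReference)) return false;
                Object mine = get();
                Object theirs = ((IdentityWeakReference<?>) other).get();
                return mine != null && mine == theirs;                 // identity comparison of referents
              }
            }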

          I also caught a couple of places where I think the current implementation violates its contract. For example, keySet gets all of the identity weak references and accumulates a set of the real values. As soon as each real value is referenced, we know that the weak reference won't disappear. But I think there's a small opportunity between reaping the current reference set and adding all the referenced objects to the key set for GC to run and remove values:

            public Set<K> keySet() {
              reap();
              Set<K> ret = new HashSet<K>();
              for (IdentityWeakReference<K> ref : backingStore.keySet()) {
                // half-way through the loop, GC could run and remove the next referenced object
                ret.add(ref.get());
              }
              return Collections.unmodifiableSet(ret);
            }
          

          I've also added checks in keySet and entrySet: each WeakReference is resolved once (which keeps the value from being lost) and checked for null before it is added to the output collection.
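
          A sketch of the null-checked variant described here, assuming the same backingStore, reap(), and IdentityWeakReference as in the snippet quoted above:

            public Set<K> keySet() {
              reap();
              Set<K> ret = new HashSet<K>();
              for (IdentityWeakReference<K> ref : backingStore.keySet()) {
                K key = ref.get();   // resolve once; this strong reference keeps the key alive from here on
                if (key != null) {   // skip entries whose referent was collected mid-iteration
                  ret.add(key);
                }
              }
              return Collections.unmodifiableSet(ret);
            }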

          I've opened PR #30 on GitHub, and I can upload a patch if that's preferred.

          ASF GitHub Bot added a comment -

          GitHub user rdblue opened a pull request:

          https://github.com/apache/avro/pull/30

          AVRO-607. Make SpecificData#getSchema thread-safe.

          This adds a version of WeakIdentityHashMap that is thread-safe by
          replacing the backing Map implementation with ConcurrentHashMap. Both
          WeakIdentityHashMap and the new WeakIdentityConcurrentMap are subclasses
          of the (new) WeakIdentityMap, which allows subclasses to supply a
          backing map implementation.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/rdblue/avro AVRO-607-concurrent-schema-cache

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/avro/pull/30.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #30


          commit b287db7560928d88b1ef1dd0cc79d1ad3e7cfa1e
          Author: Ryan Blue <blue@apache.org>
          Date: 2015-03-31T00:31:36Z

          AVRO-607. Make SpecificData#getSchema thread-safe.

          This adds a version of WeakIdentityHashMap that is thread-safe by
          replacing the backing Map implementation with ConcurrentHashMap. Both
          WeakIdentityHashMap and the new WeakIdentityConcurrentMap are subclasses
          of the (new) WeakIdentityMap, which allows subclasses to supply a
          backing map implementation.
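
          For readers, a sketch of the class structure that description implies. Only the names WeakIdentityHashMap, WeakIdentityConcurrentMap, and WeakIdentityMap come from the PR; the method shown here is an assumption about how a subclass might supply its backing map.

            import java.util.HashMap;
            import java.util.Map;
            import java.util.concurrent.ConcurrentHashMap;

            // The base class would hold the weak-identity key handling; subclasses choose the backing Map.
            abstract class WeakIdentityMapSketch<K, V> {
              protected abstract Map<Object, V> newBackingStore();
            }

            class WeakIdentityHashMapSketch<K, V> extends WeakIdentityMapSketch<K, V> {
              @Override
              protected Map<Object, V> newBackingStore() {
                return new HashMap<Object, V>();           // original, non-thread-safe backing map
              }
            }

            class WeakIdentityConcurrentMapSketch<K, V> extends WeakIdentityMapSketch<K, V> {
              @Override
              protected Map<Object, V> newBackingStore() {
                return new ConcurrentHashMap<Object, V>(); // thread-safe backing map
              }
            }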


          Dave Latham added a comment -

          This is affecting us in production; the symptom is hundreds of threads stuck in a loop at:

          	at java.util.WeakHashMap.get(WeakHashMap.java:470)
          	at org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:138)
          	at org.apache.avro.specific.SpecificDatumWriter.<init>(SpecificDatumWriter.java:33)
          

          until the process is restarted.

          Scott Carey added a comment -

          If I recall, that turns out to be very hard due to how the equals contract works with weak references. There is already a Java WeakHashMap, so making one with identity semantics wasn't too hard.
          We may need thousands of lines of code and might have to implement our own concurrent map implementation. I think I'd rather spend my efforts figuring out how to extract Google's implementation into another namespace in the build with shade, jarjar, or similar.

          Doug Cutting added a comment -

          > In Avro code, a weak concurrent hash map is in high demand.

          Might we add an org.apache.avro.util.WeakConcurrentMap<T> class that's a ConcurrentHashMap<WeakReference<T>>? We already have our own WeakIdentityHashMap...
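
          A sketch of that suggestion as stated (the method bodies and the key/value type parameters are assumptions). As noted elsewhere in this thread, java.lang.ref.WeakReference's default equals compares reference objects by identity, so a wrapper like this only works if the WeakReference keys override equals, for example with the IdentityWeakReference sketched above.

            import java.lang.ref.WeakReference;
            import java.util.concurrent.ConcurrentHashMap;
            import java.util.concurrent.ConcurrentMap;

            public class WeakConcurrentMap<K, V> {
              // Keys are held weakly; the backing ConcurrentMap provides thread safety.
              private final ConcurrentMap<WeakReference<K>, V> store =
                  new ConcurrentHashMap<WeakReference<K>, V>();

              public V put(K key, V value) {
                return store.put(new WeakReference<K>(key), value);
              }

              public V get(K key) {
                // With WeakReference's default identity equals, this lookup misses unless the
                // exact same WeakReference instance is reused, which is why an
                // equals-overriding subclass is needed.
                return store.get(new WeakReference<K>(key));
              }
            }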

          Scott Carey added a comment -

          As an alternative to this patch, we could synchronize the method or use a ThreadLocal<WeakHashMap<Type, Schema>>.

          A static cache keyed by Type or Class, where the keys are not held weakly, can lead to classloader leaks.

          In Avro code, a weak concurrent hash map is in high demand.
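
          A sketch of the ThreadLocal alternative mentioned above (class and helper names are illustrative; a per-thread cache avoids locking at the cost of re-deriving each schema once per thread):

            import java.lang.reflect.Type;
            import java.util.Map;
            import java.util.WeakHashMap;
            import org.apache.avro.Schema;

            public abstract class ThreadLocalSchemaCacheSketch {
              // Each thread gets its own WeakHashMap, so no synchronization is needed.
              private final ThreadLocal<Map<Type, Schema>> cache =
                  new ThreadLocal<Map<Type, Schema>>() {
                    @Override
                    protected Map<Type, Schema> initialValue() {
                      return new WeakHashMap<Type, Schema>();
                    }
                  };

              public Schema getSchema(Type type) {
                Map<Type, Schema> local = cache.get();  // this thread's private cache
                Schema schema = local.get(type);
                if (schema == null) {
                  schema = deriveSchema(type);          // hypothetical stand-in for schema induction
                  local.put(type, schema);
                }
                return schema;
              }

              protected abstract Schema deriveSchema(Type type);
            }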

          Scott Carey added a comment -

          I quite like Guava. Having a concurrent weak hash map is great, the immutable collections are very useful, and several other collection types (Multiset, Multimap, and BiMap) are massive time savers.

          However, items in Guava get deprecated and disappear within two years, so we would have to avoid the newest APIs and move quickly off deprecated ones to avoid conflicts with users who also depend on it. It is manageable, but it is a dependency that is very likely to be used by our users, and if we are on version 11 while a user is on 13, we could be in a position where neither version works for both of us simultaneously. I also worry about our place as a library far down the stack for some users.

          We could complicate our build to shade in only the classes we use under a different namespace to avoid such problems (this may be useful for other dependencies as well).

          Doug Cutting added a comment -

          Here's a simple patch. I've converted it to use ConcurrentHashMap instead. We already have static ConcurrentHashMaps that contain instances of Class, and most instances of Type are Class. So this will prevent classes and types from getting GC'd, but neither needs to be GC'd much.

          A better fix might be to replace all uses of static ConcurrentHashMaps with 'new com.google.common.collect.MapMaker().weakKeys().makeMap()'.

          http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html

          Should we add a dependency on Google Collections?
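
          For reference, a minimal sketch of the MapMaker approach mentioned above, assuming the Google Collections / Guava dependency under discussion were added:

            import com.google.common.collect.MapMaker;
            import java.lang.reflect.Type;
            import java.util.concurrent.ConcurrentMap;
            import org.apache.avro.Schema;

            public class MapMakerCacheSketch {
              // weakKeys() yields a concurrent map whose keys are weakly referenced and
              // compared by identity, so Class/Type keys can still be garbage collected.
              private final ConcurrentMap<Type, Schema> schemaCache =
                  new MapMaker().weakKeys().makeMap();
            }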


            People

            • Assignee: Unassigned
            • Reporter: Stephen Tu
            • Votes: 1
            • Watchers: 6
