Uploaded image for project: 'Groovy'
  1. Groovy
  2. GROOVY-7535

Groovy category throwing MissingMethodException and MissingPropertyException when using multiple threads

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.11, 2.4.4
    • Fix Version/s: 2.4.12
    • Component/s: None
    • Labels:
      None
    • Environment:
      I have been able to reproduce this issue on a Cent O/S version 6.4 with Java 64 Bit JDK 1.8 and groovy 2.4.4.
    • Flags:
      Important

      Description

      When using groovy use block, we randomly get groovy.lang.MissingPropertyException when trying to access a property off a category. (Attached is an example)

      index 76
      Exception in thread "Thread-77" groovy.lang.MissingPropertyException: No such property: millisecond for class: java.lang.Integer
      at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:51)
      at org.codehaus.groovy.runtime.callsite.PojoMetaClassGetPropertySite.callGetProperty(PojoMetaClassGetPropertySite.java:43)
      at TimeCategoryTest$__spock_initializeFields_closure1$_closure4$_closure5.doCall(TimeCategoryTest.groovy:23)
      at TimeCategoryTest$__spock_initializeFields_closure1$_closure4$_closure5.doCall(TimeCategoryTest.groovy)
      at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:497)
      at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
      at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
      at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:292)
      at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1016)
      at groovy.lang.Closure.call(Closure.java:423)
      at groovy.lang.Closure.call(Closure.java:417)
      at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:109)
      at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.access$400(GroovyCategorySupport.java:65)
      at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:249)
      at org.codehaus.groovy.runtime.DefaultGroovyMethods.use(DefaultGroovyMethods.java:403)

      or groovy.lang.MissingMethodException when trying to access a method from a category.

      index 82
      Exception in thread "Thread-207" groovy.lang.MissingMethodException: No signature of method: java.lang.String.test() is applicable for argument types: (java.lang.String) values: [ bar]
      Possible solutions: next(), toSet(), getAt(java.lang.String), wait(). trim(), toList()
      at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:56)
      at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:46)
      at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:122)
      at CategoryTest$__spock_initializeFields_closure1$_closure5.doCall(CategoryTest.groovy:24)
      at CategoryTest$__spock_initializeFields_closure1$_closure5.doCall(CategoryTest.groovy)
      at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:497)
      at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:90)
      at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:324)
      at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:292)
      at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1016)
      at groovy.lang.Closure.call(Closure.java:423)
      at groovy.lang.Closure.call(Closure.java:417)
      at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:109)
      at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.access$400(GroovyCategorySupport.java:65)
      at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:249)
      at org.codehaus.groovy.runtime.DefaultGroovyMethods.use(DefaultGroovyMethods.java:403)

      On the server, either one of these exceptions to appear every few days or weeks.

      I found a similar issue GROOVY-2105 which contains a test that reproduces this problem on Linux. Looking back the fix that was implemented for this ticket was to add synchronized to 3 methods. These changes were late removed in a refactor of GroovyCategorySupport.

      I have attached 3 test cases that reproduce the threading issue.

      NOTE: problem does not happen 100% of the time as it is a threading and timing issue, so you may have to play with the number of threads to generate the exception. What I have attached generates the exception on my linux box most of the time.

      1. 0001-GROOVY-7535-fix-race-condition-between-the-modificat.patch
        2 kB
        Jochen Kemnade
      2. CategoryTest.groovy
        1 kB
        James Oswald
      3. exceptionForCategoryTest.txt
        2 kB
        James Oswald
      4. exceptionForTimeCategoryTest.txt
        2 kB
        James Oswald
      5. locking.png
        117 kB
        Jochen Kemnade
      6. Test.groovy
        0.3 kB
        James Oswald
      7. TimeCategoryTest.groovy
        1 kB
        James Oswald

        Issue Links

          Activity

          Hide
          joswald James Oswald added a comment -

          From GROOVY-2105:
          The methods in question are org.codehaus.groovy.runtime.GroovyCategorySupport#newScope, org.codehaus.groovy.runtime.GroovyCategorySupport#endScope and org.codehaus.groovy.runtime.GroovyCategorySupport#hasCategoryInAnyThread

          https://github.com/groovy/groovy-core/blob/master/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java#L82

          Show
          joswald James Oswald added a comment - From GROOVY-2105 : The methods in question are org.codehaus.groovy.runtime.GroovyCategorySupport#newScope, org.codehaus.groovy.runtime.GroovyCategorySupport#endScope and org.codehaus.groovy.runtime.GroovyCategorySupport#hasCategoryInAnyThread https://github.com/groovy/groovy-core/blob/master/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java#L82
          Hide
          pascalschumacher Pascal Schumacher added a comment -

          Hi James,

          thanks for reporting this.

          Sadly I could not replicate the problem with groovy 2.4.5 (on windows) and your testcase (tried different times and different amounts of threads).

          I guess that's just caused by platform differences as I do not remember any changes in 2.4.5 related to this problem and GroovyCategorySupport.java looks like it could have concurrency issues.

          Show
          pascalschumacher Pascal Schumacher added a comment - Hi James, thanks for reporting this. Sadly I could not replicate the problem with groovy 2.4.5 (on windows) and your testcase (tried different times and different amounts of threads). I guess that's just caused by platform differences as I do not remember any changes in 2.4.5 related to this problem and GroovyCategorySupport.java looks like it could have concurrency issues.
          Hide
          blackdrag Jochen Theodorou added a comment -

          sorry, first had answered on dev, because I thought it was a normal thread What kind of cpu are you using James?

          As for the change, the explicit synchronization was removed by a logic using an AtomicInteger and an int to allow unsychronized checks for category usage. The idea is the following: A category is only visible in the same thread that it is used, so updates to categoriesInUse from the same thread are visible and hasCategoryInCurrentThread will return true. For the actual counting of the number of active categories overall we use the AtomicInteger atomicCategoryUsageCounter, which will also be used to update categoriesInUse. hasCategoryInAnyThread should always return true if any category is active. The actual building of the category methods list happens in a thread local and is read in a thread local. I would love to learn where this all fails.

          Show
          blackdrag Jochen Theodorou added a comment - sorry, first had answered on dev, because I thought it was a normal thread What kind of cpu are you using James? As for the change, the explicit synchronization was removed by a logic using an AtomicInteger and an int to allow unsychronized checks for category usage. The idea is the following: A category is only visible in the same thread that it is used, so updates to categoriesInUse from the same thread are visible and hasCategoryInCurrentThread will return true. For the actual counting of the number of active categories overall we use the AtomicInteger atomicCategoryUsageCounter, which will also be used to update categoriesInUse. hasCategoryInAnyThread should always return true if any category is active. The actual building of the category methods list happens in a thread local and is read in a thread local. I would love to learn where this all fails.
          Hide
          joswald James Oswald added a comment - - edited

          My dev box has 4 Xeon(R) CPU E5-1607 0 @ 3.00GHz with 16Gigs of ram.
          (Not sure if its cores or separate CPUs. Linux reports them to me as 4 CPUs.)

          The servers I've seen this on were VMs with 8 Cores and 16 Gigs of Ram.

          Both systems were running Cent O/S.

          Show
          joswald James Oswald added a comment - - edited My dev box has 4 Xeon(R) CPU E5-1607 0 @ 3.00GHz with 16Gigs of ram. (Not sure if its cores or separate CPUs. Linux reports them to me as 4 CPUs.) The servers I've seen this on were VMs with 8 Cores and 16 Gigs of Ram. Both systems were running Cent O/S.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user ksuderman opened a pull request:

          https://github.com/apache/incubator-groovy/pull/169

          Non-atomic use of atomicCategoryUsageCounter

          The atomicCategoryUsageCounter was Incremented/decremented on one line and and then read on the next line leaving a small window for race conditions to happen. This is a possible fix for https://issues.apache.org/jira/browse/GROOVY-7535

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/ksuderman/incubator-groovy master

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/incubator-groovy/pull/169.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #169


          commit 919e90a15bd2ec6de2e015eee8e2e77a8f5f1bd5
          Author: Keith Suderman <suderman@cs.vassar.edu>
          Date: 2015-10-31T19:11:33Z

          Fixed non-atomic use of atomicCategoryUsageCounter.

          commit 3025ed2cd7a80cf49403fe1205f22cb815f5f78b
          Author: Keith Suderman <suderman@cs.vassar.edu>
          Date: 2015-10-31T19:14:16Z

          Merge branch 'master' of https://github.com/apache/incubator-groovy


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user ksuderman opened a pull request: https://github.com/apache/incubator-groovy/pull/169 Non-atomic use of atomicCategoryUsageCounter The atomicCategoryUsageCounter was Incremented/decremented on one line and and then read on the next line leaving a small window for race conditions to happen. This is a possible fix for https://issues.apache.org/jira/browse/GROOVY-7535 You can merge this pull request into a Git repository by running: $ git pull https://github.com/ksuderman/incubator-groovy master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-groovy/pull/169.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #169 commit 919e90a15bd2ec6de2e015eee8e2e77a8f5f1bd5 Author: Keith Suderman <suderman@cs.vassar.edu> Date: 2015-10-31T19:11:33Z Fixed non-atomic use of atomicCategoryUsageCounter. commit 3025ed2cd7a80cf49403fe1205f22cb815f5f78b Author: Keith Suderman <suderman@cs.vassar.edu> Date: 2015-10-31T19:14:16Z Merge branch 'master' of https://github.com/apache/incubator-groovy
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/incubator-groovy/pull/169

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/incubator-groovy/pull/169
          Hide
          pascalschumacher Pascal Schumacher added a comment -

          James Oswald I merged the pull request. It would be nice if you could use a snapshoot version (e.g. http://ci.groovy-lang.org/repository/download/Groovy_Jdk7Build/29066:id/target/distributions/apache-groovy-binary-2.5.0-SNAPSHOT.zip) to test if this fixes the issue. Thanks!

          Show
          pascalschumacher Pascal Schumacher added a comment - James Oswald I merged the pull request. It would be nice if you could use a snapshoot version (e.g. http://ci.groovy-lang.org/repository/download/Groovy_Jdk7Build/29066:id/target/distributions/apache-groovy-binary-2.5.0-SNAPSHOT.zip ) to test if this fixes the issue. Thanks!
          Hide
          pascalschumacher Pascal Schumacher added a comment -

          James Oswald I'm resolving this for now. Please reopen if the fix does not solve the issue completely. Thanks!

          Show
          pascalschumacher Pascal Schumacher added a comment - James Oswald I'm resolving this for now. Please reopen if the fix does not solve the issue completely. Thanks!
          Hide
          jkemnade Jochen Kemnade added a comment -

          I can actually reproduce this issue with Groovy 2.4.11 on Debian with OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-1-b11). I'm using

          class TestCategory {
              def static String test(String self, String foo) {
                  return self + foo
              }
          }
          
          def test = {
              1000000.times {
                  sleep(10)
                  use(TestCategory) {
                      assert "foo".test(' bar') == 'foo bar'
                      assert "foo".test(' bar') == 'foo bar'
                      assert "foo".test(' bar') == 'foo bar'
                      assert "foo".test(' bar') == 'foo bar'
                  }
                  sleep(50)
                  
              }
          }
          
          
          1000.times {
              Thread.start test
          }
          
          Show
          jkemnade Jochen Kemnade added a comment - I can actually reproduce this issue with Groovy 2.4.11 on Debian with OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-1-b11). I'm using class TestCategory { def static String test( String self, String foo) { return self + foo } } def test = { 1000000.times { sleep(10) use(TestCategory) { assert "foo" .test(' bar') == 'foo bar' assert "foo" .test(' bar') == 'foo bar' assert "foo" .test(' bar') == 'foo bar' assert "foo" .test(' bar') == 'foo bar' } sleep(50) } } 1000.times { Thread .start test }
          Hide
          jkemnade Jochen Kemnade added a comment -

          I can also reproduce it with 2.5.0-alpha-1. Can we please reopen this?

          Show
          jkemnade Jochen Kemnade added a comment - I can also reproduce it with 2.5.0-alpha-1. Can we please reopen this?
          Hide
          paulk Paul King added a comment -

          I haven't had time to explore this issue, but just a meta-comment. Do we believe the applied PR fixed something or nothing? If nothing, I'd re-open and remove the fix version; otherwise, I'd leave this one as is so that the fix version for whatever was fixed is tracked and I'd clone the issue and describe what remains to be fixed (if not exactly as per the current description).

          Show
          paulk Paul King added a comment - I haven't had time to explore this issue, but just a meta-comment. Do we believe the applied PR fixed something or nothing? If nothing, I'd re-open and remove the fix version; otherwise, I'd leave this one as is so that the fix version for whatever was fixed is tracked and I'd clone the issue and describe what remains to be fixed (if not exactly as per the current description).
          Hide
          jkemnade Jochen Kemnade added a comment - - edited

          The first part of the PR should not have changed anything. categoriesInUse is merely used as a flag, so if another thread increments the value to n+1, it will still be !=0 in hasCategoryInCurrentThread.
          I'm unsure about the second part, but I think it didn't have any effect either for about the same reason as above. The error happens when categoriesInUse is 0. This can only happen if there is only 1 thread running, so there is no race condition. I might be overlooking something here though.
          However, I think the race condition happens somewhere else:
          DefaultMetaClassInfo.setCategoryUsed(false) is called in endScope if atomicCategoryUsageCounter was 0 further above. But another thread might have entered newScope in between, incrementing atomicCategoryUsageCounter again. So I think, we need some synchronization around atomicCategoryUsageCounter.get and DefaultMetaClassInfo.setCategoryUsed(false).

          Show
          jkemnade Jochen Kemnade added a comment - - edited The first part of the PR should not have changed anything. categoriesInUse is merely used as a flag, so if another thread increments the value to n+1, it will still be !=0 in hasCategoryInCurrentThread . I'm unsure about the second part, but I think it didn't have any effect either for about the same reason as above. The error happens when categoriesInUse is 0 . This can only happen if there is only 1 thread running, so there is no race condition. I might be overlooking something here though. However, I think the race condition happens somewhere else: DefaultMetaClassInfo.setCategoryUsed(false) is called in endScope if atomicCategoryUsageCounter was 0 further above. But another thread might have entered newScope in between, incrementing atomicCategoryUsageCounter again. So I think, we need some synchronization around atomicCategoryUsageCounter.get and DefaultMetaClassInfo.setCategoryUsed(false) .
          Hide
          jkemnade Jochen Kemnade added a comment -

          Patch attached, I'm currently testing it.

          Show
          jkemnade Jochen Kemnade added a comment - Patch attached, I'm currently testing it.
          Hide
          jkemnade Jochen Kemnade added a comment -

          Unfortunately, the patch doesn't fix it.

          Show
          jkemnade Jochen Kemnade added a comment - Unfortunately, the patch doesn't fix it.
          Hide
          blackdrag Jochen Theodorou added a comment -

          Jochen Kemnade your change synchronizes the categoriesInUse == 0 case, but with what? you need to also synchronize the increment place

          Show
          blackdrag Jochen Theodorou added a comment - Jochen Kemnade your change synchronizes the categoriesInUse == 0 case, but with what? you need to also synchronize the increment place
          Hide
          jkemnade Jochen Kemnade added a comment -

          I don't think so. I'm not synchronizing the decrement but the call of the DefaultMetaClassInfo.setCategoryUsed(false). I thought that the bug (or MME) is caused by the state in DefaultMetaClassInfo, but apparently that isn't the case. Even if i remove DefaultMetaClassInfo.setCategoryUsed(false) altogether, the issue persists.

          Show
          jkemnade Jochen Kemnade added a comment - I don't think so. I'm not synchronizing the decrement but the call of the DefaultMetaClassInfo.setCategoryUsed(false) . I thought that the bug (or MME) is caused by the state in DefaultMetaClassInfo , but apparently that isn't the case. Even if i remove DefaultMetaClassInfo.setCategoryUsed(false) altogether, the issue persists.
          Hide
          blackdrag Jochen Theodorou added a comment -

          Since the removal of setting it to false did not do it, it is probably a different issue, yes.

          Show
          blackdrag Jochen Theodorou added a comment - Since the removal of setting it to false did not do it, it is probably a different issue, yes.
          Hide
          jkemnade Jochen Kemnade added a comment - - edited

          I can reproduce the error with 1.8.0 but not with 1.7.x (so far). I tried to do a bisect earlier, but those old versions are pretty hard to build.

          Show
          jkemnade Jochen Kemnade added a comment - - edited I can reproduce the error with 1.8.0 but not with 1.7.x (so far). I tried to do a bisect earlier, but those old versions are pretty hard to build.
          Hide
          jkemnade Jochen Kemnade added a comment -

          Alright, I can reproduce it with 1.8.0-beta-3 but not with 1.8.0-beta-2. Could it be https://github.com/apache/groovy/commit/8e489ce9da4ebf21b6717b656e2ffff225244ca6?

          Show
          jkemnade Jochen Kemnade added a comment - Alright, I can reproduce it with 1.8.0-beta-3 but not with 1.8.0-beta-2. Could it be https://github.com/apache/groovy/commit/8e489ce9da4ebf21b6717b656e2ffff225244ca6?
          Hide
          jkemnade Jochen Kemnade added a comment -

          A quick git bisect session also leads to that commit.

          Show
          jkemnade Jochen Kemnade added a comment - A quick git bisect session also leads to that commit.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user jwagenleitner opened a pull request:

          https://github.com/apache/groovy/pull/540

          GROOVY-7535: Groovy category throwing MissingMethodException and Miss…

          …ingPropertyException when using multiple threads

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/jwagenleitner/groovy groovy7535

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/groovy/pull/540.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #540


          commit 1c2ca41491fd919ddc1c543ec92a8ab7ca3b860b
          Author: John Wagenleitner <jwagenleitner@apache.org>
          Date: 2017-05-14T01:51:47Z

          GROOVY-7535: Groovy category throwing MissingMethodException and MissingPropertyException when using multiple threads


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user jwagenleitner opened a pull request: https://github.com/apache/groovy/pull/540 GROOVY-7535 : Groovy category throwing MissingMethodException and Miss… …ingPropertyException when using multiple threads You can merge this pull request into a Git repository by running: $ git pull https://github.com/jwagenleitner/groovy groovy7535 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/groovy/pull/540.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #540 commit 1c2ca41491fd919ddc1c543ec92a8ab7ca3b860b Author: John Wagenleitner <jwagenleitner@apache.org> Date: 2017-05-14T01:51:47Z GROOVY-7535 : Groovy category throwing MissingMethodException and MissingPropertyException when using multiple threads
          Hide
          jkemnade Jochen Kemnade added a comment -

          Oh, the race condition occurs between the modification of the AtomicInteger and the assignment of the return value to the int field.
          I created a minimalistic patch that fixes it. But the PR LGTM too.

          Show
          jkemnade Jochen Kemnade added a comment - Oh, the race condition occurs between the modification of the AtomicInteger and the assignment of the return value to the int field. I created a minimalistic patch that fixes it. But the PR LGTM too.
          Hide
          jkemnade Jochen Kemnade added a comment -

          I think that this issue should be reopened as it seems to affect all versions since 1.8.0.

          Show
          jkemnade Jochen Kemnade added a comment - I think that this issue should be reopened as it seems to affect all versions since 1.8.0.
          Hide
          jwagenleitner John Wagenleitner added a comment -

          The updated patch looks good, though it does seem to me that if the counters are going to be synchronized that the DefaultMetaClassInfo.setCategoryUsed calls should be included in the block as well. Otherwise it could be called with false just after another thread got done calling with true.

          The updated attached patch keeps the two counters, it seems like the int counter was kept for performance for the method hasCategoryInCurrentThread since that's called frequently from the callsite logic.

          Show
          jwagenleitner John Wagenleitner added a comment - The updated patch looks good, though it does seem to me that if the counters are going to be synchronized that the DefaultMetaClassInfo.setCategoryUsed calls should be included in the block as well. Otherwise it could be called with false just after another thread got done calling with true . The updated attached patch keeps the two counters, it seems like the int counter was kept for performance for the method hasCategoryInCurrentThread since that's called frequently from the callsite logic.
          Hide
          blackdrag Jochen Theodorou added a comment -

          I don´t get the patch... isn´t the sole purpose of AtomicInteger to make those operations like incrementAndGet and decrementAndGet atomic enough to share it between threads? Isn´t it likely the race condition is actually something else and if this fixes the problem it does so by chance (maybe piggybacking synchronization)?

          Show
          blackdrag Jochen Theodorou added a comment - I don´t get the patch... isn´t the sole purpose of AtomicInteger to make those operations like incrementAndGet and decrementAndGet atomic enough to share it between threads? Isn´t it likely the race condition is actually something else and if this fixes the problem it does so by chance (maybe piggybacking synchronization)?
          Hide
          jwagenleitner John Wagenleitner added a comment -

          Jochen Theodorou - as Jochen Kemnade pointed out, there is a race condition between threads when writing to the private static int categoriesInUse counter as there is no guarantee that the values returned from calls to the Atomic will be written in the same order. The scenario is:

          1. T1 in endScope calls atomicCategoryUsageCounter.decrementAndGet() and receives 0
          2. T2 in newScope calls atomicCategoryUsageCounter.incrementAndGet() and receives 1
          3. T2 assigns categoriesInUse a value of 1
          4. T1 assigns categoriesInUse a value of 0
          5. Callsite called from T2 calls hasCategoryInCurrentThread and return is false

          The patch and PR both fix this by introducing ordering to the atomicCategoryUsageCounter return value and the write to categoriesInUse. Another way to fix without synchronizing would be to remove the int counter and just use the AtomicInteger. However, my hunch is that the int counter was retained because hasCategoryInCurrentThread is probably called frequently so there might have been some performance concern over requiring a method call and a volatile read that the AtomicInteger would incur in order to implement the quick exit check for that method if (categoriesInUse == 0) return false.

          There is also a race between scopes in calls to DefaultMetaClassInfo.setCategoryUsed, so I think it would also be good to add those calls to the synchronized blocks if they are added.

          Show
          jwagenleitner John Wagenleitner added a comment - Jochen Theodorou - as Jochen Kemnade pointed out, there is a race condition between threads when writing to the private static int categoriesInUse counter as there is no guarantee that the values returned from calls to the Atomic will be written in the same order. The scenario is: 1. T1 in endScope calls atomicCategoryUsageCounter.decrementAndGet() and receives 0 2. T2 in newScope calls atomicCategoryUsageCounter.incrementAndGet() and receives 1 3. T2 assigns categoriesInUse a value of 1 4. T1 assigns categoriesInUse a value of 0 5. Callsite called from T2 calls hasCategoryInCurrentThread and return is false The patch and PR both fix this by introducing ordering to the atomicCategoryUsageCounter return value and the write to categoriesInUse . Another way to fix without synchronizing would be to remove the int counter and just use the AtomicInteger. However, my hunch is that the int counter was retained because hasCategoryInCurrentThread is probably called frequently so there might have been some performance concern over requiring a method call and a volatile read that the AtomicInteger would incur in order to implement the quick exit check for that method if (categoriesInUse == 0) return false . There is also a race between scopes in calls to DefaultMetaClassInfo.setCategoryUsed , so I think it would also be good to add those calls to the synchronized blocks if they are added.
          Hide
          jkemnade Jochen Kemnade added a comment -

          Great explanation John. It took me a while to figure that out.
          By the way, since the PR synchronizes the modification of categoriesInUse, the AtomicInteger could be removed altogether.

          Show
          jkemnade Jochen Kemnade added a comment - Great explanation John. It took me a while to figure that out. By the way, since the PR synchronizes the modification of categoriesInUse , the AtomicInteger could be removed altogether.
          Hide
          blackdrag Jochen Theodorou added a comment - - edited

          John Wagenleitner and Jochen Kemnade maybe I am now looking stupid, but I still do not get it. If categoriesInUse is the problem, then why do you not have to synchronize all reads and writes to it? I am especially thinking here about hasCategoryInCurrentThread. Even if endScope and newScope are synchronized, since hasCategoryInCurrentThread is not, it can lead to T1 and T2 still having different values for it.

          The reason categoriesInUse even exists is for performance. reaplace it by directly using the AtomicInteger and see what happens. Or by synchronizing the read in hasCategoryInCurrentThread. And do not forget, that is a cost we basically pay on each method invocation during callsite caching. Indy is using the dEfaultMetaClassInfo logic, but I assume it has the same synchronizationn problem

          Show
          blackdrag Jochen Theodorou added a comment - - edited John Wagenleitner and Jochen Kemnade maybe I am now looking stupid, but I still do not get it. If categoriesInUse is the problem, then why do you not have to synchronize all reads and writes to it? I am especially thinking here about hasCategoryInCurrentThread. Even if endScope and newScope are synchronized, since hasCategoryInCurrentThread is not, it can lead to T1 and T2 still having different values for it. The reason categoriesInUse even exists is for performance. reaplace it by directly using the AtomicInteger and see what happens. Or by synchronizing the read in hasCategoryInCurrentThread. And do not forget, that is a cost we basically pay on each method invocation during callsite caching. Indy is using the dEfaultMetaClassInfo logic, but I assume it has the same synchronizationn problem
          Hide
          jkemnade Jochen Kemnade added a comment -

          With the PR, all writes are synchronized. Here's why I think that the reads are fine without synchronization/volatile (correct me if I'm wrong):
          Even if each thread uses its working copy of the categoriesInUse field , the copy may have the wrong value, but it can not be "significantly wrong".
          The other thread may have incremented the value, but not from 0 to 1 as the current thread will already have set it to a value >0, so it won't change the value of categoriesInUse == 0, which is the only thing that matters for the early exit in hasCategoryInCurrentThread.
          The other thread may also have decremented the value, but it cannot have set it to 0 if the current thread incremented it in newScope, so it would not have effected categoriesInUse == 0 either.

          Show
          jkemnade Jochen Kemnade added a comment - With the PR, all writes are synchronized. Here's why I think that the reads are fine without synchronization/ volatile (correct me if I'm wrong): Even if each thread uses its working copy of the categoriesInUse field , the copy may have the wrong value, but it can not be "significantly wrong". The other thread may have incremented the value, but not from 0 to 1 as the current thread will already have set it to a value >0 , so it won't change the value of categoriesInUse == 0 , which is the only thing that matters for the early exit in hasCategoryInCurrentThread . The other thread may also have decremented the value, but it cannot have set it to 0 if the current thread incremented it in newScope , so it would not have effected categoriesInUse == 0 either.
          Hide
          jkemnade Jochen Kemnade added a comment -

          Why don't we just use a ThreadLocal<Boolean> for hasCategoryInCurrentThread?

                  private void newScope () {
                      synchronized (lock) {
                          categoriesInUse++;
                          DefaultMetaClassInfo.setCategoryUsed(true);
                      }
                      hasCategoryInCurrentThread.set(Boolean.TRUE);
                      VMPluginFactory.getPlugin().invalidateCallSites();
                      level++;
                  }
          

          and

                  private void endScope () {
                      ...
                      synchronized (lock) {
                          categoriesInUse--;
                          if (categoriesInUse == 0) {
                              DefaultMetaClassInfo.setCategoryUsed(false);
                          }
                          hasCategoryInCurrentThread.remove();
                      }
                     ...
                  }
          

          and

              public static boolean hasCategoryInCurrentThread() {
                  if (hasCategoryInCurrentThread.get() != Boolean.TRUE) return false;
                  ThreadCategoryInfo infoNullable = THREAD_INFO.getInfoNullable();
                  return infoNullable != null && infoNullable.level != 0;
              }
          
          Show
          jkemnade Jochen Kemnade added a comment - Why don't we just use a ThreadLocal<Boolean> for hasCategoryInCurrentThread ? private void newScope () { synchronized (lock) { categoriesInUse++; DefaultMetaClassInfo.setCategoryUsed( true ); } hasCategoryInCurrentThread.set( Boolean .TRUE); VMPluginFactory.getPlugin().invalidateCallSites(); level++; } and private void endScope () { ... synchronized (lock) { categoriesInUse--; if (categoriesInUse == 0) { DefaultMetaClassInfo.setCategoryUsed( false ); } hasCategoryInCurrentThread.remove(); } ... } and public static boolean hasCategoryInCurrentThread() { if (hasCategoryInCurrentThread.get() != Boolean .TRUE) return false ; ThreadCategoryInfo infoNullable = THREAD_INFO.getInfoNullable(); return infoNullable != null && infoNullable.level != 0; }
          Hide
          blackdrag Jochen Theodorou added a comment -

          You said "The other thread may have incremented the value, but not from 0 to 1 as the current thread will already have set it to a value >0". Why is that not the case for the AtomicInteger alone and without synchronized? For me that should have been the same. Explain me this difference and I may be able to understand why it is a solution

          As for The ThreadLocal... well.. try it out. The speed was horrible last time I tried

          Show
          blackdrag Jochen Theodorou added a comment - You said "The other thread may have incremented the value, but not from 0 to 1 as the current thread will already have set it to a value >0". Why is that not the case for the AtomicInteger alone and without synchronized? For me that should have been the same. Explain me this difference and I may be able to understand why it is a solution As for The ThreadLocal... well.. try it out. The speed was horrible last time I tried
          Hide
          jwagenleitner John Wagenleitner added a comment -

          I agree that synchronizing the read in hasCategoryInCurrentThread() is not required because of the nature of how it's used. At worst stale non-zero values will cause it to fall back to checking the ThreadLocal with will result in the correct return value. And when in a Category the same thread would have incremented the counter so whatever version of the value it sees should be non-zero.

          By the way, since the PR synchronizes the modification of categoriesInUse, the AtomicInteger could be removed altogether.

          If that were done then hasCategoryInAnyThread() would need to synchronize on the same lock when reading the value of the counter.

          Why don't we just use a ThreadLocal<Boolean> for hasCategoryInCurrentThread

          I think the THREAD_INFO.getInfoNullable() retrieves a ThreadLocal value if within a category and so that alone should be enough (i.e., no reason to add another ThreadLocal). The int counter is a performance optimization to quicky exit since this method is called from each callsite.

          Show
          jwagenleitner John Wagenleitner added a comment - I agree that synchronizing the read in hasCategoryInCurrentThread() is not required because of the nature of how it's used. At worst stale non-zero values will cause it to fall back to checking the ThreadLocal with will result in the correct return value. And when in a Category the same thread would have incremented the counter so whatever version of the value it sees should be non-zero. By the way, since the PR synchronizes the modification of categoriesInUse, the AtomicInteger could be removed altogether. If that were done then hasCategoryInAnyThread() would need to synchronize on the same lock when reading the value of the counter. Why don't we just use a ThreadLocal<Boolean> for hasCategoryInCurrentThread I think the THREAD_INFO.getInfoNullable() retrieves a ThreadLocal value if within a category and so that alone should be enough (i.e., no reason to add another ThreadLocal). The int counter is a performance optimization to quicky exit since this method is called from each callsite.
          Hide
          jkemnade Jochen Kemnade added a comment - - edited

          It's the other case that makes the difference and that leads to the error, the one where the other thread sets the value to 0 even though the current thread is still using a category.
          I think John's explanation shows the problem pretty well, so I'll rather show you another possible way to fix the issue:

          diff --git a/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java b/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java
          index 4c36935df..42533a5f7 100644
          --- a/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java
          +++ b/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java
          @@ -72,7 +72,9 @@ public class GroovyCategorySupport {
                   private Map<String, String> propertySetterMap;
           
                   private void newScope () {
          +            synchronized(atomicCategoryUsageCounter){
                         categoriesInUse = atomicCategoryUsageCounter.incrementAndGet();
          +            }
                       DefaultMetaClassInfo.setCategoryUsed(true);
                       VMPluginFactory.getPlugin().invalidateCallSites();
                       level++;
          @@ -95,7 +97,9 @@ public class GroovyCategorySupport {
                           }
                       }
                       level--;
          +            synchronized(atomicCategoryUsageCounter){
                         categoriesInUse = atomicCategoryUsageCounter.decrementAndGet();
          +            }
                       VMPluginFactory.getPlugin().invalidateCallSites();
                       if (categoriesInUse==0) DefaultMetaClassInfo.setCategoryUsed(false);
                       if (level == 0) {
          

          This would also fix the problem that the modification of the AtomicInteger and the assignment of its current value to the int field are two separate steps.
          The problem is not that the other thread will have changed the AtomicInteger incorrectly but that it will have assigned an outdated value to the int field.

          As for The ThreadLocal... well.. try it out. The speed was horrible last time I tried

          yes, I guessed so. Well, if we remove the AtomicInteger from the PR version, the code will effectively have the same semantics as if we were using a ThreadLocal.

          Show
          jkemnade Jochen Kemnade added a comment - - edited It's the other case that makes the difference and that leads to the error, the one where the other thread sets the value to 0 even though the current thread is still using a category. I think John's explanation shows the problem pretty well, so I'll rather show you another possible way to fix the issue: diff --git a/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java b/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java index 4c36935df..42533a5f7 100644 --- a/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java +++ b/src/main/org/codehaus/groovy/runtime/GroovyCategorySupport.java @@ -72,7 +72,9 @@ public class GroovyCategorySupport { private Map< String , String > propertySetterMap; private void newScope () { + synchronized (atomicCategoryUsageCounter){ categoriesInUse = atomicCategoryUsageCounter.incrementAndGet(); + } DefaultMetaClassInfo.setCategoryUsed( true ); VMPluginFactory.getPlugin().invalidateCallSites(); level++; @@ -95,7 +97,9 @@ public class GroovyCategorySupport { } } level--; + synchronized (atomicCategoryUsageCounter){ categoriesInUse = atomicCategoryUsageCounter.decrementAndGet(); + } VMPluginFactory.getPlugin().invalidateCallSites(); if (categoriesInUse==0) DefaultMetaClassInfo.setCategoryUsed( false ); if (level == 0) { This would also fix the problem that the modification of the AtomicInteger and the assignment of its current value to the int field are two separate steps. The problem is not that the other thread will have changed the AtomicInteger incorrectly but that it will have assigned an outdated value to the int field. As for The ThreadLocal... well.. try it out. The speed was horrible last time I tried yes, I guessed so. Well, if we remove the AtomicInteger from the PR version, the code will effectively have the same semantics as if we were using a ThreadLocal .
          Hide
          jkemnade Jochen Kemnade added a comment -

          If that were done then hasCategoryInAnyThread() would need to synchronize on the same lock when reading the value of the counter.

          I think we can do that without causing much trouble, hasCategoryInAnyThread() is unused anyway.

          Show
          jkemnade Jochen Kemnade added a comment - If that were done then hasCategoryInAnyThread() would need to synchronize on the same lock when reading the value of the counter. I think we can do that without causing much trouble, hasCategoryInAnyThread() is unused anyway.
          Hide
          jkemnade Jochen Kemnade added a comment -

          At worst stale non-zero values will cause it to fall back to checking the ThreadLocal with will result in the correct return value.

          I don't think that we will get relevant stale values. That's what I meant with "significantly wrong". A stale value would only lead to different behavior if it was 0 where the real value was != 0 or vice versa. And AFAICT, neither of that can happen.

          Show
          jkemnade Jochen Kemnade added a comment - At worst stale non-zero values will cause it to fall back to checking the ThreadLocal with will result in the correct return value. I don't think that we will get relevant stale values. That's what I meant with "significantly wrong". A stale value would only lead to different behavior if it was 0 where the real value was != 0 or vice versa. And AFAICT, neither of that can happen.
          Hide
          jkemnade Jochen Kemnade added a comment -

          The updated PR looks good to me.

          Show
          jkemnade Jochen Kemnade added a comment - The updated PR looks good to me.
          Hide
          blackdrag Jochen Theodorou added a comment -

          ok, let me propose an alternative fix... remove categoriesInUse and use atomicCategoryUsageCounter directly instead

          Show
          blackdrag Jochen Theodorou added a comment - ok, let me propose an alternative fix... remove categoriesInUse and use atomicCategoryUsageCounter directly instead
          Hide
          jwagenleitner John Wagenleitner added a comment -

          I updated the PR, my assumption still being that it is important that DefaultMetaClassInfo.setCategoryUsed(true) is set any time a thread is in a Category so that is why there is still a need to synchronize the atomic increment/decrement and the calls to DefaultMetaClassInfo.setCategoryUsed.

          Show
          jwagenleitner John Wagenleitner added a comment - I updated the PR, my assumption still being that it is important that DefaultMetaClassInfo.setCategoryUsed(true) is set any time a thread is in a Category so that is why there is still a need to synchronize the atomic increment/decrement and the calls to DefaultMetaClassInfo.setCategoryUsed .
          Hide
          blackdrag Jochen Theodorou added a comment -

          John, I thought this information is not used in the test

          Show
          blackdrag Jochen Theodorou added a comment - John, I thought this information is not used in the test
          Hide
          jwagenleitner John Wagenleitner added a comment -

          That is true, I have not been able to find a problem that is caused by the race condition if the flags set by setCategoryUsed called with false when another thread enters a Category. But without the sync it's very possible for one thread to set it to false just after another thread set it to true.

          Show
          jwagenleitner John Wagenleitner added a comment - That is true, I have not been able to find a problem that is caused by the race condition if the flags set by setCategoryUsed called with false when another thread enters a Category. But without the sync it's very possible for one thread to set it to false just after another thread set it to true.
          Hide
          blackdrag Jochen Theodorou added a comment -

          I think I am ok with the change in total now.

          Show
          blackdrag Jochen Theodorou added a comment - I think I am ok with the change in total now.
          Hide
          jkemnade Jochen Kemnade added a comment -

          I actually liked the previous version better. Synchronizing the write access to the AtomicInteger feels awkward. It shouldn't have a negative effect on the performance though, because the lock inside the AtomicInteger should always be available right away.
          But using an AtomicInteger instead of an int in hasCategoryInCurrentThread will add the potential for a lock congestion where no locking is actually required.
          I'll try to perform some benchmarks and try out another version that uses a ReentrantReadWriteLock.

          Show
          jkemnade Jochen Kemnade added a comment - I actually liked the previous version better. Synchronizing the write access to the AtomicInteger feels awkward. It shouldn't have a negative effect on the performance though, because the lock inside the AtomicInteger should always be available right away. But using an AtomicInteger instead of an int in hasCategoryInCurrentThread will add the potential for a lock congestion where no locking is actually required. I'll try to perform some benchmarks and try out another version that uses a ReentrantReadWriteLock.
          Hide
          jkemnade Jochen Kemnade added a comment -

          It doesn't make much of a difference. Pretty much all of the locking happens in IndyInterface.invalidateSwitchPoints.

          Show
          jkemnade Jochen Kemnade added a comment - It doesn't make much of a difference. Pretty much all of the locking happens in IndyInterface.invalidateSwitchPoints .
          Hide
          jkemnade Jochen Kemnade added a comment -

          Oh, that's GROOVY-7811

          Show
          jkemnade Jochen Kemnade added a comment - Oh, that's GROOVY-7811
          Hide
          jwagenleitner John Wagenleitner added a comment -

          The AtomicInteger#get shouldn't lock but would be a volatile read (and other methods use CAS and not locks). While I don't think having that volatile read in hasCategoryInCurrentThread will cause too much of a performance issue, I do agree that it's awkward to be using the AtomicInteger in the synchronized blocks. Given that there is only a need to observe 0 or non-Zero for categoriesInUse and a precise number does not matter, except when decrementing which is synchronized. I also think the int counter version is better as long as there's a some comment to explain why there is no synchronized read in hasCategoryInCurrentThread. In either case, I think one additional change should be to deprecate the hasCategoryInAnyThread method.

          I think the original fix that was put into 2.4.6 did not fix the issue and may have just reduced the number of instructions that the threads could interleave on in order to reproduce. I'd like to see what others think, if in agreement then as Paul King mentioned this should be re-opened.

          Show
          jwagenleitner John Wagenleitner added a comment - The AtomicInteger#get shouldn't lock but would be a volatile read (and other methods use CAS and not locks). While I don't think having that volatile read in hasCategoryInCurrentThread will cause too much of a performance issue, I do agree that it's awkward to be using the AtomicInteger in the synchronized blocks. Given that there is only a need to observe 0 or non-Zero for categoriesInUse and a precise number does not matter, except when decrementing which is synchronized. I also think the int counter version is better as long as there's a some comment to explain why there is no synchronized read in hasCategoryInCurrentThread . In either case, I think one additional change should be to deprecate the hasCategoryInAnyThread method. I think the original fix that was put into 2.4.6 did not fix the issue and may have just reduced the number of instructions that the threads could interleave on in order to reproduce. I'd like to see what others think, if in agreement then as Paul King mentioned this should be re-opened.
          Hide
          blackdrag Jochen Theodorou added a comment -

          In that case my suggestion would be to do the synchronization as is, but use int instead of AtomicInteger

          Show
          blackdrag Jochen Theodorou added a comment - In that case my suggestion would be to do the synchronization as is, but use int instead of AtomicInteger
          Hide
          jkemnade Jochen Kemnade added a comment -

          Agreed. The current PR looks good to me again/still.

          I think the original fix that was put into 2.4.6 did not fix the issue.

          Also agreed.

          Show
          jkemnade Jochen Kemnade added a comment - Agreed. The current PR looks good to me again/still. I think the original fix that was put into 2.4.6 did not fix the issue. Also agreed.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/groovy/pull/540

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/groovy/pull/540
          Hide
          jwagenleitner John Wagenleitner added a comment -

          Since there is agreement that the original fix PR 169 did not resolve the issue, re-opening in order to update the fix version to 2.4.12.

          Show
          jwagenleitner John Wagenleitner added a comment - Since there is agreement that the original fix PR 169 did not resolve the issue, re-opening in order to update the fix version to 2.4.12.
          Hide
          jwagenleitner John Wagenleitner added a comment -

          Thanks for reporting the issue and thanks to Jochen Kemnade for investigating the fix and patch.

          Show
          jwagenleitner John Wagenleitner added a comment - Thanks for reporting the issue and thanks to Jochen Kemnade for investigating the fix and patch.
          Hide
          paulk Paul King added a comment -

          Nice work everyone!

          Show
          paulk Paul King added a comment - Nice work everyone!
          Hide
          jkemnade Jochen Kemnade added a comment -

          Thanks guys, I learned some interesting things from this.

          Show
          jkemnade Jochen Kemnade added a comment - Thanks guys, I learned some interesting things from this.

            People

            • Assignee:
              jwagenleitner John Wagenleitner
              Reporter:
              joswald James Oswald
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development