Geode / GEODE-9121

Regression Introduced Through GEODE-8905


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.15.0
    • Fix Version/s: None
    • Component/s: client/server
    • Labels: None

    Description

      The new implementation of the JarDeploymentService appears to delete resources when a member is gracefully shut down, which in turn creates a race condition if functions are executing on that member at the time.
      In previous versions, a client application would simply retry the operation, and no exception or loss of availability would be seen. Now the following exception is thrown on the client instead:

      Exception in thread "main" org.apache.geode.cache.execute.FunctionException: org.apache.geode.cache.client.ServerOperationException: remote server on 192.168.0.73(3985:loner):49836:c9f57ea7: The function, XXXXXXXX, has not been registered
              at org.apache.geode.internal.cache.execute.ServerRegionFunctionExecutor.executeOnServer(ServerRegionFunctionExecutor.java:237)
              at org.apache.geode.internal.cache.execute.ServerRegionFunctionExecutor.executeFunction(ServerRegionFunctionExecutor.java:184)
              at org.apache.geode.internal.cache.execute.ServerRegionFunctionExecutor.execute(ServerRegionFunctionExecutor.java:388)
              at org.apache.geode.internal.cache.execute.ServerRegionFunctionExecutor.execute(ServerRegionFunctionExecutor.java:351)
              at test.TestClient.main(TestClient.java:20)
      Caused by: org.apache.geode.cache.client.ServerOperationException: remote server on 192.168.0.73(3985:loner):49836:c9f57ea7: The function, XXXXXXXX, has not been registered
              at org.apache.geode.cache.client.internal.ExecuteRegionFunctionSingleHopOp$ExecuteRegionFunctionSingleHopOpImpl.processResponse(ExecuteRegionFunctionSingleHopOp.java:370)
              at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:224)
              at org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:197)
              at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:384)
              at org.apache.geode.cache.client.internal.AbstractOpWithTimeout.attempt(AbstractOpWithTimeout.java:45)
              at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:284)
              at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:355)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:756)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:335)
              at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:304)
              at org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:840)
              at org.apache.geode.cache.client.internal.SingleHopOperationCallable.call(SingleHopOperationCallable.java:49)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      

      This seems to be a regression introduced through GEODE-8905. I've tested the same scenario with version 1.13.2 (released), branch support/1.14 and commit b80094ec5e with no problems at all. When testing using commit 6f764a7046, on the other hand, the problem is easily reproducible.

      How to reproduce the issue:

      1. Download and extract workspace.zip.
      2. Execute the reproduce.sh script and follow the instructions on screen.

      The version of Geode to use on server side can be changed through the GEMFIRE variable within the reproduce.sh script.
      The version of Geode to use on client side can be changed through the GEODE_VERSION variable within the launch_client.sh script.

      The client application simply executes the TestFunction forever. When running the scenario using a version of Geode that doesn't include commit 6f764a7046, the client simply retries under the hood and no exception is thrown. When using the current develop branch, however, an exception is thrown and the client application terminates as soon as a server is restarted.
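
      As a possible client-side stopgap, the retry that older versions performed under the hood could be approximated in application code. The following is only a minimal sketch under stated assumptions, not Geode's internal retry logic: the class and method names are hypothetical, the transient failure is recognised purely by the "has not been registered" text seen in the stack trace above, and a plain Supplier stands in for the real function call so the example has no Geode dependency.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Geode: retries a call when the failure
// looks like the "function has not been registered" error from this ticket.
final class RetryOnUnregisteredFunction {

    // Assumption: the message fragment from the stack trace identifies the failure.
    private static final String MARKER = "has not been registered";

    private RetryOnUnregisteredFunction() {}

    /** Returns true if the error or any of its causes carries the marker text. */
    public static boolean isUnregisteredFunctionFailure(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String message = t.getMessage();
            if (message != null && message.contains(MARKER)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the call up to maxAttempts times, sleeping backoffMillis between
     * attempts. Any failure that does not match the marker is rethrown at once.
     */
    public static <T> T executeWithRetry(Supplier<T> call, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (!isUnregisteredFunctionFailure(e)) {
                    throw e; // unrelated failure: do not mask it
                }
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis);
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt(); // preserve interrupt status
                        throw e;
                    }
                }
            }
        }
        throw last; // every attempt hit the transient failure
    }
}
```

      In the test client, the Supplier would wrap the existing function execution (for example, a call along the lines of FunctionService.onRegion(region).execute(...).getResult()). Matching on the message text is brittle and is used here only because the ticket does not identify a more specific exception type. Retrying is also only safe if the function is idempotent.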

      ukohlmeyer, pjohnson: I'm tagging you both since you both worked on this feature. Feel free to assign the ticket to whoever you consider appropriate.

      Attachments

        1. workspace.zip
          63 kB
          Juan Ramos


          People

            Assignee: Udo Kohlmeyer (ukohlmeyer)
            Reporter: Juan Ramos (jjramos)
            Votes: 0
            Watchers: 2
