HBase / HBASE-10724

TestMultiParallel#testNonceCollision occasionally fails with OperationConflictException

    Details

    • Type: Test
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      From https://builds.apache.org/job/HBase-0.98/220/testReport/junit/org.apache.hadoop.hbase.client/TestMultiParallel/testNonceCollision/ :

      org.apache.hadoop.hbase.exceptions.OperationConflictException: org.apache.hadoop.hbase.exceptions.OperationConflictException: The operation with nonce {-1778587827371821880, 5283077739350761367} on row [xxx] may have already completed
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.startNonceOperation(HRegionServer.java:4159)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.increment(HRegionServer.java:4123)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(HRegionServer.java:2888)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28452)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
      	at java.lang.Thread.run(Thread.java:662)
      
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
      	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
      	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:284)
      	at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:1053)
      	at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:1043)
      	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
      	at org.apache.hadoop.hbase.client.HTable.increment(HTable.java:1057)
      	at org.apache.hadoop.hbase.client.TestMultiParallel.testNonceCollision(TestMultiParallel.java:516)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
      Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OperationConflictException): org.apache.hadoop.hbase.exceptions.OperationConflictException: The operation with nonce {-1778587827371821880, 5283077739350761367} on row [xxx] may have already completed
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.startNonceOperation(HRegionServer.java:4159)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.increment(HRegionServer.java:4123)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(HRegionServer.java:2888)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:28452)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
      	at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
      	at java.lang.Thread.run(Thread.java:662)
      
      	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1450)
      	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
      	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
      	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.mutate(ClientProtos.java:28845)
      	at org.apache.hadoop.hbase.client.HTable$7.call(HTable.java:1050)
      

      The exception came from the first call to table.increment():

          try {
            Increment inc = new Increment(ONE_ROW);
            inc.addColumn(BYTES_FAMILY, QUALIFIER, 1L);
            table.increment(inc);
      

      There seems to be a race between the first and second table.increment() calls.
      In the case above, it was the first call that received the OperationConflictException.
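
      The kind of race suspected above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: ToyServer, ConflictException, and raceTwoCalls are hypothetical names, the two calls are assumed to carry the same nonce, and nothing here reproduces HBase's actual nonce handling. It shows only that when two calls with one nonce reach a nonce-tracking server concurrently, exactly one is rejected, and which one is not determined by the order in which the client issued them.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class NonceRaceSketch {
    /** Hypothetical stand-in for OperationConflictException (not the HBase class). */
    static class ConflictException extends Exception {
        ConflictException(String msg) { super(msg); }
    }

    /** Toy server (an assumption for illustration): remembers every nonce it has seen. */
    static class ToyServer {
        private final ConcurrentHashMap<Long, Boolean> seen = new ConcurrentHashMap<>();
        private final AtomicInteger counter = new AtomicInteger();

        int increment(long nonce) throws ConflictException {
            // putIfAbsent is atomic: only the first arrival registers the nonce.
            if (seen.putIfAbsent(nonce, Boolean.TRUE) != null) {
                throw new ConflictException(
                    "The operation with nonce " + nonce + " may have already completed");
            }
            return counter.incrementAndGet();
        }

        int value() { return counter.get(); }
    }

    /** Races two increments sharing one nonce; returns {successes, conflicts, finalValue}. */
    static int[] raceTwoCalls(long nonce) throws InterruptedException {
        ToyServer server = new ToyServer();
        AtomicInteger ok = new AtomicInteger();
        AtomicInteger conflicts = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Runnable call = () -> {
            try {
                start.await();              // release both threads together
                server.increment(nonce);
                ok.incrementAndGet();
            } catch (ConflictException e) {
                conflicts.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread first = new Thread(call, "first");
        Thread second = new Thread(call, "second");
        first.start();
        second.start();
        start.countDown();
        first.join();
        second.join();
        return new int[] { ok.get(), conflicts.get(), server.value() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = raceTwoCalls(42L);
        System.out.println("successes=" + r[0] + " conflicts=" + r[1] + " value=" + r[2]);
    }
}
```

      In this toy model the totals are always one success and one conflict, but either thread can be the one that is rejected. A test that expects the conflict only on the second call would fail intermittently under such an interleaving.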

        Activity

        Ulli Berthold added a comment -

        I hit this problem in a real application, during a mass insert that used several hundred Python processes to load statistical back data:

        IOError(_message='org.apache.hadoop.hbase.exceptions.OperationConflictException: The operation with nonce {-2298743813494479781, -5039431054343659404} on row [0_741_2014081720_b325aea5092ebac621ab1c7d0144cf9c] may have already completed
        at org.apache.hadoop.hbase.regionserver.HRegionServer.startNonceOperation(HRegionServer.java:4199)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.increment(HRegionServer.java:4163)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(HRegionServer.java:2890)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29495)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
        at java.lang.Thread.run(Thread.java:745)
        ')

        stack added a comment -

        Patch?


          People

          • Assignee:
            Unassigned
            Reporter:
            Ted Yu
          • Votes:
            1
            Watchers:
            6
