Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-5623

Race condition when rolling the HLog and hlogFlush

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 0.94.0
    • 0.94.0
    • wal
    • None
    • Reviewed

    Description

      When doing a ycsb test with a large number of handlers (regionserver.handler.count=60), I get the following exceptions:

      Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
      	at org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
      	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
      	at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
      	at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
      	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
      	at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
      	at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
      	at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      	at java.lang.reflect.Method.invoke(Method.java:597)
      	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
      	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
      
      	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
      	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
      	at $Proxy1.multi(Unknown Source)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
      	at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
      

      and

      	java.lang.NullPointerException
      		at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
      		at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
      		at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
      		at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
      		at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
      		at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
      		at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
      		at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
      		at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
      		at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
      		at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
      		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      		at java.lang.reflect.Method.invoke(Method.java:597)
      		at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
      		at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
      

      It seems the root cause of the issue is that we open a new log writer and close the old one at HLog#rollWriter() holding the updateLock, but the other threads doing syncer() calls

       
      logSyncerThread.hlogFlush(this.writer);
      

      without holding the updateLock. LogSyncer only synchronizes against concurrent appends and flush(), but not on the passed writer, which can be closed already by rollWriter(). In this case, since SequenceFile#Writer.close() sets it's out field as null, we get the NPE.

      Attachments

        1. 5623.txt
          2 kB
          Lars Hofhansl
        2. 5623-suggestion.txt
          8 kB
          Lars Hofhansl
        3. 5623v2.txt
          6 kB
          Michael Stack
        4. 5623-v7.txt
          10 kB
          Lars Hofhansl
        5. 5623-v8.txt
          10 kB
          Lars Hofhansl
        6. HBASE-5623_v0.patch
          1 kB
          Enis Soztutar
        7. HBASE-5623_v4.patch
          11 kB
          Enis Soztutar
        8. HBASE-5623_v5.patch
          16 kB
          Enis Soztutar
        9. HBASE-5623_v6-alt.patch
          13 kB
          Ted Yu
        10. HBASE-5623_v6-alt.patch
          13 kB
          Enis Soztutar

        Issue Links

          Activity

            People

              enis Enis Soztutar
              enis Enis Soztutar
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: