HBase / HBASE-12949

Scanner can be stuck in infinite loop if the HFile is corrupted

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.94.3, 0.98.10
    • Fix Version/s: 1.4.0, 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      We've encountered a problem where compaction hangs and never completes.
      After looking into it further, we found that the compaction scanner was stuck in an infinite loop. See the stack trace below.

      org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:296)
      org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:257)
      org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:697)
      org.apache.hadoop.hbase.regionserver.StoreScanner.seekToNextRow(StoreScanner.java:672)
      org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:529)
      org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:223)
      

      We identified the HFile that appears to be corrupted. Running the HFile tool on it shows the following:

      [biadmin@hdtest009 bin]$ hbase org.apache.hadoop.hbase.io.hfile.HFile -v -k -m -f /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
      15/01/23 11:53:17 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
      15/01/23 11:53:18 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
      15/01/23 11:53:18 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
      15/01/23 11:53:18 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
      Scanning -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
      WARNING, previous row is greater then current row
              filename -> /user/biadmin/CUMMINS_INSITE_V1/7106432d294dd844be15996ccbf2ba84/attributes/f1a7e3113c2c4047ac1fc8fbcb41d8b7
              previous -> \x00/20110203-094231205-79442793-1410161293068203000\x0Aattributes16794406\x00\x00\x01\x00\x00\x00\x00\x00\x00
              current  ->
      Exception in thread "main" java.nio.BufferUnderflowException
              at java.nio.Buffer.nextGetIndex(Buffer.java:489)
              at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:347)
              at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:856)
              at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:768)
              at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:362)
              at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:262)
              at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:220)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
              at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.main(HFilePrettyPrinter.java:539)
              at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:802)
      

      Enabling Java assertions shows the following:

      Exception in thread "main" java.lang.AssertionError: Key 20110203-094231205-79442793-1410161293068203000/attributes:16794406/1099511627776/Minimum/vlen=15/mvcc=0 followed by a smaller key //0/Minimum/vlen=0/mvcc=0 in cf attributes
              at org.apache.hadoop.hbase.regionserver.StoreScanner.checkScanOrder(StoreScanner.java:672)
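
      To illustrate the kind of ordering check that the assertion above performs, here is a minimal sketch. It is not the HBase implementation: the class ScanOrderCheck and its method are hypothetical, and it compares only raw row keys as unsigned bytes, whereas HBase compares full cell keys (row, family, qualifier, timestamp, type). It assumes Java 9 or later for Arrays.compareUnsigned.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper (not part of HBase): verifies that row keys never go backwards. */
final class ScanOrderCheck {
    private ScanOrderCheck() {}

    /**
     * Throws if {@code current} sorts before {@code previous} under unsigned
     * lexicographic byte order. An empty key sorts before any non-empty key,
     * which is the situation shown in the assertion message above.
     * {@code previous} may be null for the first key; {@code current} must not be null.
     */
    static void checkOrder(byte[] previous, byte[] current) {
        if (previous != null && Arrays.compareUnsigned(previous, current) > 0) {
            throw new IllegalStateException(
                "Key '" + printable(previous) + "' followed by a smaller key '"
                    + printable(current) + "'");
        }
    }

    /** Maps each byte to one char so arbitrary binary keys can be shown in a message. */
    private static String printable(byte[] key) {
        return new String(key, StandardCharsets.ISO_8859_1);
    }
}
```

      Unlike a Java assert, a check like this fails the scan even when assertions are disabled, at the cost of one extra comparison per key.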
      

      This shows that the HFile appears to be corrupted: the keys do not look right.
      However, the scanner does not report a meaningful error. Instead, it gets stuck in an infinite loop here:

      KeyValueHeap.generalizedSeek()
      while ((scanner = heap.poll()) != null) {
      }
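
      One way a reader could fail fast on this kind of corruption is to sanity-check the key and value lengths before using them. The sketch below is only an illustration and is not taken from the attached patches: the class CellLengthCheck and its method are hypothetical. It assumes the KeyValue layout of a 4-byte key length followed by a 4-byte value length, and it assumes a valid key is never empty.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not part of HBase): validates cell length fields in a block buffer. */
final class CellLengthCheck {
    /** Size of the two int length fields that precede the key and value bytes. */
    static final int LENGTH_FIELDS_SIZE = 2 * Integer.BYTES;

    private CellLengthCheck() {}

    /**
     * Reads the key and value lengths at the buffer's current position without
     * moving it, and throws a descriptive exception if they cannot be valid.
     *
     * @return a two-element array: {keyLength, valueLength}
     */
    static int[] readAndCheckLengths(ByteBuffer block) {
        int pos = block.position();
        if (block.remaining() < LENGTH_FIELDS_SIZE) {
            throw new IllegalStateException(
                "Truncated cell at position " + pos + ": only " + block.remaining()
                    + " bytes remain, need " + LENGTH_FIELDS_SIZE + " for the length fields");
        }
        // Absolute reads leave the buffer position untouched.
        int keyLen = block.getInt(pos);
        int valueLen = block.getInt(pos + Integer.BYTES);
        long available = (long) block.remaining() - LENGTH_FIELDS_SIZE;
        // Use long arithmetic so that huge corrupted lengths cannot overflow.
        if (keyLen <= 0 || valueLen < 0 || (long) keyLen + valueLen > available) {
            throw new IllegalStateException(
                "Invalid keyLen " + keyLen + " or valueLen " + valueLen
                    + " at position " + pos + " (" + available + " bytes available)");
        }
        return new int[] {keyLen, valueLen};
    }
}
```

      With a check like this, a corrupted cell surfaces as an exception that names the offending position, rather than as a BufferUnderflowException or a scanner that never advances.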
      

      Attachments

        1. HBASE-12949-branch-1-v3.patch
          2 kB
          Jerry He
        2. HBASE-12949-master.patch
          6 kB
          Jerry He
        3. HBASE-12949-master-v2.patch
          2 kB
          Michael Stack
        4. HBASE-12949-master-v2.patch
          2 kB
          Michael Stack
        5. HBASE-12949-master-v2.patch
          2 kB
          Jerry He
        6. HBASE-12949-master-v2 (1).patch
          2 kB
          Michael Stack
        7. HBASE-12949-master-v3.patch
          3 kB
          Jerry He

            People

              Assignee: Jerry He (jinghe)
              Reporter: Jerry He (jinghe)
              Votes: 0
              Watchers: 16