Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-26938

Compaction failures after StoreFileTracker integration

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 2.5.0, 3.0.0-alpha-2, 2.6.0
    • 2.5.0, 3.0.0-alpha-3
    • Compaction
    • None
    • Reviewed

    Description

      [ Currently this has only been tested with branch-2.5 and branch-2. Testing with master next, will update afterward. ]

      Test cluster of 10 regionservers is configured each RS with 5 flush threads, 5 large compaction threads, and 10 small compaction threads.

      Hadoop is 3.3.2. Java is 11. HFiles are on HDFS.

      All the StoreFileTracker implementations, DEFAULT or FILE, exhibit compaction time store writer errors in an ingest heavy use case. Unit tests don't seem to cover whatever this is. Most compactions succeed, but some do not. Those that do not are failing with state or sanity check assertions. Below errors are all from DEFAULT. They seem related... store writer instance usage/close/locking issues during compactions.

      Warnings like "writer exists when it should not":

      2022-04-07T23:13:11,351 WARN  [regionserver/ip-172-31-63-83:8120-shortCompactions-8]
      compactions.Compactor: Writer exists when it should not: {
        hdfs://ip-172-31-58-47.us-west-2.compute.internal:8020/hbase/data/default/IntegrationTestLoadCommonCrawl/b518f72941d4427e7e1923407643df67/.tmp/c/29d7b88c4c214ddcbba4f747514a2cf5 }
      

      Errors like:

      IllegalStateException thrown from HFileBlockIndex$BlockIndexWriter.shouldWriteBlock:

      2022-04-07T23:13:11,508 ERROR [regionserver/ip-172-31-63-83:8120-shortCompactions-6] regionserver.CompactSplit: Compaction failed region=IntegrationTestLoadCommonCrawl,,1649373172576.b518f72941d4427e7e1923407643df67., storeName=b518f72941d4427e7e1923407643df67/c, priority=10, startTime=1649373185476
      java.lang.IllegalStateException: curInlineChunk is null; has shouldWriteBlock been called with closing=true and then called again?
              at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.shouldWriteBlock(HFileBlockIndex.java:1258) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.writeInlineBlocks(HFileWriterImpl.java:523) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:608) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:377) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.abortWriter(DefaultCompactor.java:84) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.abortWriter(DefaultCompactor.java:76) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:384) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      

      and IllegalStateException thrown from HFileBlock$Writer.expectState:

      2022-04-07T23:13:11,559 ERROR [regionserver/ip-172-31-63-83:8120-shortCompactions-8] regionserver.CompactSplit: Compaction failed region=IntegrationTestLoadCommonCrawl,,1649373172576.b518f72941d4427e7e1923407643df67., storeName=b518f72941d4427e7e1923407643df67/c, priority=0, startTime=1649373191325
      java.lang.IllegalStateException: Expected state: BLOCK_READY, actual state: WRITING
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.expectState(HFileBlock.java:1190) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.getOnDiskSizeWithHeader(HFileBlock.java:1106) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishBlock(HFileWriterImpl.java:346) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.checkBlockBoundary(HFileWriterImpl.java:327) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.append(HFileWriterImpl.java:739) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.append(StoreFileWriter.java:301) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:456) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:364) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1138) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2392) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:656) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:702) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      

      and the dreaded "added a key not lexically larger than previous":

      2022-04-07T23:13:14,715 ERROR [regionserver/ip-172-31-63-83:8120-shortCompactions-8] regionserver.CompactSplit: Compaction failed region=IntegrationTestLoadCommonCrawl,de.bao,1649373172576.083eb1ede8bf8c82174f614b06d4741e., storeName=083eb1ede8bf8c82174f614b06d4741e/c, priority=-1, startTime=1649373194643
      java.io.IOException: Added a key not lexically larger than previous. 
      ...
              at org.apache.hadoop.hbase.util.BloomContext.sanityCheck(BloomContext.java:63) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.util.BloomContext.writeBloom(BloomContext.java:54) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.appendGeneralBloomfilter(StoreFileWriter.java:280) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.append(StoreFileWriter.java:299) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:456) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:364) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1138) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2392) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:656) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:702) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      

      and another bloomfilter addition sanity check failure:

      2022-04-07T23:23:44,001 ERROR [regionserver/ip-172-31-49-8:8120-longCompactions-4] regionserver.CompactSplit: Compaction failed region=IntegrationTestLoadCommonCrawl,com.fc2.blog.idp,1649373775847.9d960338d85e2bb9c68669fad6a89f73., storeName=9d960338d85e2bb9c68669fad6a89f73/c, priority=-1, startTime=1649373823967
      java.lang.IllegalStateException: First key in chunk already set: com.filmestipo|/film/1335-algures-hoje-a-noite|1649373441720
              at org.apache.hadoop.hbase.io.hfile.CompoundBloomFilterWriter.append(CompoundBloomFilterWriter.java:174) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.util.BloomContext.writeBloom(BloomContext.java:55) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.appendGeneralBloomfilter(StoreFileWriter.java:280) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.append(StoreFileWriter.java:299) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:456) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:364) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1138) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2392) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:656) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:702) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      

      and a NPE in HFileBlock$Writer.getEncodingState:

      2022-04-07T23:17:26,373 ERROR [regionserver/ip-172-31-63-83:8120-shortCompactions-6] regionserver.CompactSplit: Compaction failed region=IntegrationTestLoadCommonCrawl,com.fc2.blog.idp,1649373315303.2bd16fac0c088fa135a855f0165f1dba., storeName=2bd16fac0c088fa135a855f0165f1dba/c, priority=10, startTime=1649373438988
      java.lang.NullPointerException: null
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.getEncodingState(HFileBlock.java:837) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.beforeShipped(HFileBlock.java:831) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.beforeShipped(HFileWriterImpl.java:770) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.StoreFileWriter.beforeShipped(StoreFileWriter.java:309) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:515) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.Compactor.compact(Compactor.java:364) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:62) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1138) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2392) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:656) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
              at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:702) ~[hbase-server-2.5.0-SNAPSHOT.jar:2.5.0-SNAPSHOT]
      

      Attachments

        Issue Links

          Activity

            People

              zhangduo Duo Zhang
              apurtell Andrew Kyle Purtell
              Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: