HBASE-6679

RegionServer aborts due to race between compaction and split

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      In our nightlies, we have seen RS aborts due to compaction and split racing. The original parent file gets deleted after the compaction, and hence the daughters don't find the parent data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS logs.

      Attachments

      1. rs-crash-parallel-compact-split.log
        376 kB
        Devaraj Das
      2. 6679-1.patch
        0.7 kB
        Devaraj Das
      3. 6679-1.094.patch
        0.6 kB
        stack

        Activity

        stack added a comment -

        Fix up after bulk move overwrote some 0.94.2 fix versions w/ 0.95.0 (Noticed by Lars Hofhansl)

        Hudson added a comment -

        Integrated in HBase-0.92-security #143 (See https://builds.apache.org/job/HBase-0.92-security/143/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390787)

        Result = FAILURE
        stack :
        Files :

        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        Hudson added a comment -

        Integrated in HBase-0.94-security-on-Hadoop-23 #8 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/8/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390783)

        Result = FAILURE
        stack :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        Hudson added a comment -

        Integrated in HBase-0.94-security #58 (See https://builds.apache.org/job/HBase-0.94-security/58/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390783)

        Result = SUCCESS
        stack :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #195 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/195/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390781)

        Result = FAILURE
        stack :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
        Hudson added a comment -

        Integrated in HBase-TRUNK #3382 (See https://builds.apache.org/job/HBase-TRUNK/3382/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390781)

        Result = FAILURE
        stack :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
        Hudson added a comment -

        Integrated in HBase-0.94 #491 (See https://builds.apache.org/job/HBase-0.94/491/)
        HBASE-6679 RegionServer aborts due to race between compaction and split (Revision 1390783)

        Result = FAILURE
        stack :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
        stack added a comment -

        Committed to 0.92, 0.94 and trunk. Thanks for the patch Devaraj.

        stack added a comment -

        0.94 version

        Lars Hofhansl added a comment -

        I assume 0.94 and trunk have the issue, right?

        Devaraj Das added a comment -

        Patch as per the previous comment.

        Devaraj Das added a comment -

        I suppose we could make the reference volatile so all threads catch the update.

        Yeah, stack, I have the same opinion - we should close the issue with a fix that makes the reference volatile (and that'd justify my hours of debugging <smile>).

        But you can't see how the two threads can run concurrently? (not to say it's not possible)

        At least from the regionserver logs it is evident that this didn't happen. From the code, compactions and splits happen in executors; the split runs in an executor with a thread pool of at most one thread. Once a compaction completes, the executor fires off a split request (which may or may not proceed, based on checks done within the request handler). The compaction executor doesn't wait for the split to complete, so technically a split and a compaction could be running in parallel. But at a finer granularity, locks are taken at different points in split/compact (the important places are protected by HRegion.lock), and there are checks on things like HRegion.writeState that are set and checked at various places in compaction/split.
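
        For illustration, a minimal sketch of that wiring (hypothetical class and method names, not the actual CompactSplitThread code):

        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        public class CompactSplitSketch {
          // several compaction threads, but at most one split runs at a time
          private final ExecutorService compactionPool = Executors.newFixedThreadPool(4);
          private final ExecutorService splitPool = Executors.newFixedThreadPool(1);

          void requestCompaction(final String region) {
            compactionPool.submit(new Runnable() {
              public void run() {
                compact(region);                               // runs on a compaction thread
                splitPool.submit(new Runnable() {              // fire-and-forget: the compaction thread
                  public void run() { splitIfNeeded(region); } // does not wait for the split to finish
                });
              }
            });
          }

          private void compact(String region) { /* rewrite store files, swap in the new file list */ }
          private void splitIfNeeded(String region) { /* check, then create daughter references */ }
        }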

        So IMHO things are wired together okay (but yeah, usual disclaimer - I may have missed something).

        Good on you Deva.

        You too

        stack added a comment -

        Yes.. but that still could be exposed to the problems of memory inconsistencies when multiple threads are accessing the object in unsynchronized/non-volatile ways, no?

        We swap in the new list when compaction completes. I suppose we could make the reference volatile so all threads catch the update.

        But you can't see how the two threads can run concurrently? (not to say it's not possible)

        I'd say close this issue if you don't have the logs to disprove it being a case of dbl-assign, a more plausible explanation to my mind?

        Good on you Deva.

        Devaraj Das added a comment -

        For sure the region was not doubly-assigned? The split happened for the region on one server, but on another server the same region was being compacted? You'd need the master logs to figure it a dbl-assign.

        Unfortunately, I didn't save the master logs when the failure happened.

        Can you figure a place where we'd be running compactions on a region concurrent w/ our splitting it? Compacting, we take out the write lock. Doesn't look like any locks are held while SplitTransaction is running (closing the parent, it'll need the write lock... that's after the daughters open though).

        I can't figure out a place where this could happen in the natural execution of the regionserver.

        Storefiles are an ImmutableList.

        Yes.. but that still could be exposed to the problems of memory inconsistencies when multiple threads are accessing the object in unsynchronized/non-volatile ways, no?

        @Deva

        After a long time, someone addressed me by that name

        So before this itself the region got closed. I feel the store file list should have been updated by then. No?

        Can't say for sure, Ram. There is no guarantee unless the accesses (read/write) are synchronized or the field is declared volatile.

        ramkrishna.s.vasudevan added a comment -

        @Deva
        I am not able to tell clearly what the problem is. I too went through those logs and found that the region 5689a8785bbc9a8aa8e526cd7ef1542a has completed the compaction.

        2012-08-28 06:15:34,107 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/000007_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a., storeName=f1, fileCount=3, fileSize=27.3m, priority=3, time=14360293782301; duration=4sec
        
        

        and later the split started for the region (after 2 ms)

        2012-08-28 06:15:34,109 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/000007_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
        

        The offlining of the region is done here

        2012-08-28 06:15:34,788 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/000007_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a. in META
        

        So before this itself the region got closed. I feel the store file list should have been updated by then. No?

        stack added a comment -

        For sure the region was not doubly-assigned? The split happened for the region on one server, but on another server the same region was being compacted? You'd need the master logs to figure it a dbl-assign.

        Storefiles are an ImmutableList.

        Can you figure a place where we'd be running compactions on a region concurrent w/ our splitting it? Compacting, we take out the write lock. Doesn't look like any locks are held while SplitTransaction is running (closing the parent, it'll need the write lock... that's after the daughters open though).

        Devaraj Das added a comment -

        Okay, I did some digging into the logs (that were attached to the jira earlier) and the code. It doesn't seem like a race between compaction and split (apologies for the confusion I might have created). The two are sequential (at the end of a compaction, a split is requested). But I'll note that the split happens in a separate thread.

        The problem is that the daughter tries to open a reader to a file that doesn't exist.

        java.io.IOException: Failed ip-10-4-197-133.ec2.internal,60020,1346119706203-daughterOpener=4efb1c92918bbf3c54d0ead3345bb735
        	at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:368)
        	at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:456)
        	at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        	at java.lang.Thread.run(Thread.java:662)
        Caused by: java.io.FileNotFoundException: File does not exist: /apps/hbase/data/TestLoadAndVerify_1346120615716/5689a8785bbc9a8aa8e526cd7ef1542a/f1/5a55df83829f401993d95ecf2e539ba1
        

        The method SplitTransaction.createDaughters creates the reference files (via a call to the method SplitTransaction.splitStoreFiles) that the daughter then tries to open. The list of files to create references to is the set of entries in the storeFiles field in Store.java (obtained via the call to this.parent.close in createDaughters). The storeFiles field is last updated (in the thread doing the compaction) in the method Store.completeCompaction.

        My suspicion is that the problem is due to the fact that accesses to storeFiles are not synchronized, and the field is not volatile either. This leads to memory inconsistencies between the compaction thread and the split thread, and the split thread doesn't see the last updated value of the field.

        If the above theory is right (and it is only a theory), then the solution could be to make the storeFiles field volatile.
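
        A minimal sketch of what that change would look like (a simplified, hypothetical Store-like class; the real field lives in Store.java and holds StoreFile objects):

        import com.google.common.collect.ImmutableList;

        class StoreSketch {
          // declared volatile so that the swap done by the compaction thread in
          // completeCompaction() is guaranteed to be visible to the split thread
          private volatile ImmutableList<String> storeFiles = ImmutableList.of();

          void completeCompaction(ImmutableList<String> compacted) {
            storeFiles = compacted;   // volatile write by the compaction thread
          }

          ImmutableList<String> getStorefiles() {
            return storeFiles;        // volatile read by the split thread sees the latest list
          }
        }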

        Thoughts?

        Devaraj Das added a comment -

        I'm looking, Ram. I will post an update soon.

        ramkrishna.s.vasudevan added a comment -

        Any updates on this issue?

        stack added a comment -

        Did this happen w/ a non-standard config? Or is it just a consequence of the split/compaction refactoring, and should we do as Andrew suggests?

        Andrew Purtell added a comment -

        Then why not move split handling back to the compaction threads and do away with the separate split handler threads? Can someone articulate a compelling benefit to the contrary?

        Ted Yu added a comment -

        Looks like we need some locking mechanism to make compacting and splitting the same region mutually exclusive.
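
        For example, a per-region lock shared by the two paths would do it (a sketch only, with hypothetical names; not code from HBase):

        import java.util.concurrent.locks.ReentrantLock;

        class RegionOpsSketch {
          // one lock per region; holding it makes compaction and split mutually exclusive
          private final ReentrantLock compactSplitLock = new ReentrantLock();

          void compact() {
            compactSplitLock.lock();
            try {
              // rewrite and swap the store files
            } finally {
              compactSplitLock.unlock();
            }
          }

          void split() {
            compactSplitLock.lock();
            try {
              // read the store file list and create daughter references
            } finally {
              compactSplitLock.unlock();
            }
          }
        }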

        Devaraj Das added a comment -

        RS logs


          People

          • Assignee:
            Devaraj Das
            Reporter:
            Devaraj Das
          • Votes:
            0
            Watchers:
            10
