HBASE-5689

Skipping RecoveredEdits may cause data loss

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.94.0
    • Fix Version/s: 0.94.0
    • Component/s: regionserver
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Consider the following scenario:
      1. The region is on server A.
      2. Put KV (r1->v1) to the region.
      3. Move the region from server A to server B.
      4. Put KV (r2->v2) to the region.
      5. Move the region from server B to server A.
      6. Put KV (r3->v3) to the region.
      7. kill -9 server B and restart it.
      8. kill -9 server A and restart it.
      9. Scan the region: we only get two KVs (r1->v1, r2->v2); the third KV (r3->v3) is lost.

      Let's analyse the above scenario in the code:
      1. The edit logs of KV (r1->v1) and KV (r3->v3) are both recorded in the same hlog file on server A.
      2. When we split server B's hlog file in ServerShutdownHandler, we create one RecoveredEdits file f1 for the region.
      3. When we split server A's hlog file in ServerShutdownHandler, we create another RecoveredEdits file f2 for the region.
      4. However, RecoveredEdits file f2 will be skipped when the region is initialized, in HRegion#replayRecoveredEditsIfAny:

      for (Path edits: files) {
        if (edits == null || !this.fs.exists(edits)) {
          LOG.warn("Null or non-existent edits file: " + edits);
          continue;
        }
        if (isZeroLengthThenDelete(this.fs, edits)) continue;

        if (checkSafeToSkip) {
          Path higher = files.higher(edits);
          long maxSeqId = Long.MAX_VALUE;
          if (higher != null) {
            // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
            String fileName = higher.getName();
            maxSeqId = Math.abs(Long.parseLong(fileName));
          }
          if (maxSeqId <= minSeqId) {
            String msg = "Maximum possible sequenceid for this log is " + maxSeqId
                + ", skipped the whole file, path=" + edits;
            LOG.debug(msg);
            continue;
          } else {
            checkSafeToSkip = false;
          }
        }
        // ... (replay of this edits file follows)
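To make the failure mode concrete, here is a small standalone sketch (not HBase code — the class and method names `SkipCheckDemo`, `replayedOldLogic`, and `replayedFixed` are hypothetical, and a RecoveredEdits "file" is modeled only by the seqid in its name) of why naming a file by its first seqid makes the skip check unsound, and why naming it by its last seqid is safe:

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical model of the skip decision in HRegion#replayRecoveredEditsIfAny.
public class SkipCheckDemo {

    // Old logic: the NEXT file's name (the first seqid of the next file) is
    // taken as the maximum possible seqid of the CURRENT file.
    static TreeSet<Long> replayedOldLogic(TreeSet<Long> fileNames, long minSeqId) {
        TreeSet<Long> toReplay = new TreeSet<>();
        boolean checkSafeToSkip = true;
        for (Long name : fileNames) {
            if (checkSafeToSkip) {
                Long higher = fileNames.higher(name);
                long maxSeqId = (higher != null) ? higher : Long.MAX_VALUE;
                if (maxSeqId <= minSeqId) {
                    continue; // whole file skipped
                }
                checkSafeToSkip = false;
            }
            toReplay.add(name);
        }
        return toReplay;
    }

    // Fixed naming (files renamed to their LAST seqid): a file can be skipped
    // iff its own name, i.e. its max seqid, is <= the region's current seqid.
    static TreeSet<Long> replayedFixed(TreeSet<Long> fileNames, long minSeqId) {
        TreeSet<Long> toReplay = new TreeSet<>();
        for (Long maxSeqId : fileNames) {
            if (maxSeqId > minSeqId) {
                toReplay.add(maxSeqId);
            }
        }
        return toReplay;
    }

    public static void main(String[] args) {
        long minSeqId = 2; // region already flushed edits up to seqid 2

        // First-seqid naming: A's file (edits 1 and 3) is named "1",
        // B's file (edit 2) is named "2". File "1" is skipped because the
        // next name (2) <= minSeqId, so edit 3 inside it is lost.
        TreeSet<Long> byFirstSeqId = new TreeSet<>(Arrays.asList(1L, 2L));
        System.out.println("old naming replays: " + replayedOldLogic(byFirstSeqId, minSeqId));

        // Last-seqid naming (the fix): A's file is named "3", B's "2".
        // File "3" (containing edit 3) is replayed; file "2" is safely skipped.
        TreeSet<Long> byLastSeqId = new TreeSet<>(Arrays.asList(2L, 3L));
        System.out.println("new naming replays: " + replayedFixed(byLastSeqId, minSeqId));
    }
}
```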
      
      Attachments

      1. 5689-testcase.patch
        5 kB
        chunhui shen
      2. HBASE-5689.patch
        10 kB
        chunhui shen
      3. HBASE-5689.patch
        10 kB
        Lars Hofhansl
      4. HBASE-5689v2.patch
        10 kB
        chunhui shen
      5. HBASE-5689v3.patch
        12 kB
        chunhui shen
      6. 5689-v4.txt
        12 kB
        Ted Yu
      7. HBASE-5689v3.patch
        12 kB
        Ted Yu

        Activity

        Hudson added a comment -

        Integrated in HBase-0.94-security #9 (See https://builds.apache.org/job/HBase-0.94-security/9/)
        HBASE-5689 Skipping RecoveredEdits may cause data loss (Chunhui) (Revision 1310787)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-security #162 (See https://builds.apache.org/job/HBase-TRUNK-security/162/)
        HBASE-5689 Skipping RecoveredEdits may cause data loss (Chunhui) (Revision 1310788)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
        Hudson added a comment -

        Integrated in HBase-0.94 #97 (See https://builds.apache.org/job/HBase-0.94/97/)
        HBASE-5689 Skipping RecoveredEdits may cause data loss (Chunhui) (Revision 1310787)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
        Hudson added a comment -

        Integrated in HBase-TRUNK #2726 (See https://builds.apache.org/job/HBase-TRUNK/2726/)
        HBASE-5689 Skipping RecoveredEdits may cause data loss (Chunhui) (Revision 1310788)

        Result = SUCCESS
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
        Ted Yu added a comment -

        Integrated patch v3 to 0.94 and trunk.

        Thanks for the patch Chunhui.

        Thanks for the review, Ram, Lars, Stack and Ming.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521820/HBASE-5689v3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1445//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1445//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1445//console

        This message is automatically generated.

        Ted Yu added a comment -

        Re-attaching patch v3 for hadoop QA.

        stack added a comment -

        Looking at the patch, it looks good to me.

        On the Math.abs, it's probably not needed any more given current file formats, but it's not doing any harm. I tried to go back through source history to see how we used to name the recovered edits files – they used to be called things like oldlogfile.log, etc. – and they have been named for the first seqid in the file since 0.90, so it's unlikely this will ever go negative or that this code will ever encounter a negatively-named file.

        I also wondered whether this patch should be trunk-only, because what about the case where we have existing files that are still named by the first seqid in the file? But this should be fine: we'll over-replay rather than under-replay edits in that case.

        +1 on commit, adding back the Math.abs on commit (though by the reasoning above we don't need it, I think we should add it back just in case).

        stack added a comment -

        @Ming (Hey, long time no hear from you.)

        On 1., yes, sounds good. I do not think 2. holds. It is more robust to replay any file that is > 'current max seqid of the region', at least until we do more thinking on this topic. The WAL split can be messy; just as likely, we may get inventive about how we split WALs and how we write the files to replay.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521642/5689-v4.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1425//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1425//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1425//console

        This message is automatically generated.

        Ted Yu added a comment -

        Consider my comment above and the following method in HLogSplitter:

          static String formatRecoveredEditsFileName(final long seqid) {
            return String.format("%019d", seqid);
          }
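As a side note, the %019d zero-padding matters: it makes lexicographic file-name order agree with numeric seqid order, which the sorted walk over edit files relies on. A quick sketch (hypothetical class name, not HBase code):

```java
// Sketch: zero-padding a seqid to 19 digits (enough for any non-negative
// long) keeps lexicographic ordering of file names consistent with the
// numeric ordering of the seqids they encode.
public class SeqIdNameDemo {
    static String formatRecoveredEditsFileName(long seqid) {
        return String.format("%019d", seqid);
    }

    public static void main(String[] args) {
        String n99 = formatRecoveredEditsFileName(99);
        String n100 = formatRecoveredEditsFileName(100);
        System.out.println(n99);
        System.out.println(n100);
        // Unpadded, "99" sorts AFTER "100" lexicographically; padded names
        // compare in the same order as the underlying seqids.
        System.out.println("99".compareTo("100") > 0);   // true
        System.out.println(n99.compareTo(n100) < 0);     // true
    }
}
```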
        
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521635/HBASE-5689v3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1422//console

        This message is automatically generated.

        Lars Hofhansl added a comment -

        How do you know it's safe to remove Math.abs()?

        Ted Yu added a comment -

        Patch v4 removes Math.abs() call.

        Ted Yu added a comment -

        Please take a look at the updated javadoc for getCompletedRecoveredEditsFilePath():

        +   * Get the completed recovered edits file path, renaming it to be by last edit
        +   * in the file from its first edit. Then we could use the name to skip
        +   * recovered edits when doing {@link HRegion#replayRecoveredEditsIfAny}.
        
        Lars Hofhansl added a comment -

        Thanks Chunhui. Patch looks good to me. +1
        Re: Math.abs(), the comment in the code says that the file name pattern is -?[0-9]+, so negative is possible?
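For what it's worth, the pattern in the code comment does admit a minus sign, so a negatively-named legacy file would parse to a negative seqid without the Math.abs() guard. A tiny sketch (hypothetical class and file name, not HBase code):

```java
// Sketch: HLog.EDITFILES_NAME_PATTERN is "-?[0-9]+", so a leading minus is
// legal in an edit-file name; Math.abs() guarded against the parsed seqid
// going negative for such a legacy name.
public class EditFileNameDemo {
    public static void main(String[] args) {
        String legacyName = "-12345"; // hypothetical negatively-named file
        System.out.println(legacyName.matches("-?[0-9]+")); // matches the pattern
        long parsed = Long.parseLong(legacyName);            // negative value
        System.out.println(Math.abs(parsed));                // guarded back to positive
    }
}
```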

        chunhui shen added a comment -

        The Math.abs() call is not needed, right?

        I think so, but I don't know why Math.abs() was needed before.

        Ted Yu added a comment -

        For patch v3:

        +      maxSeqId = Math.abs(Long.parseLong(fileName));
        

        The Math.abs() call is not needed, right ?

        chunhui shen added a comment -

        In patch v3, I added the optimization suggested by Ming Ma: replayRecoveredEditsIfAny simply walks through all the RecoveredEdits files and skips those whose "max sequence id" <= "current sequence id of the region".

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521627/HBASE-5689v2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in .

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1419//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1419//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1419//console

        This message is automatically generated.

        chunhui shen added a comment -

        @Ming
        I like the optimization of simply walking through all the RecoveredEdits files and skipping those whose "max sequence id" <= "current sequence id of the region".

        As for the second point, I think it's right, but the first optimization is enough and simple to understand.

        Ming Ma added a comment -

        Really nice finding and fix. Some comments regarding performance:

        1. replayRecoveredEditsIfAny still sorts and uses the sequence id of next recoverededit file as the max sequence id of the current recoverededit file. With this fix, is that still necessary? replayRecoveredEditsIfAny can simply walk through all the recoverededit files and skip those files that have "max sequence id" <= "current sequence id of the region".
        2. If we have multiple recoverededit files for a given region, is it safe to say that "In maximum, there could only be one recoverededit file whose max sequence id is larger than the current sequence id of the region"? If yes, replayRecoveredEditsIfAny can skip the rest of recoverededit files once it detects that special recoverededit file.

        chunhui shen added a comment -

        Sorry for the late response.

        In patch v2, I added some documentation for getCompletedRecoveredEditsFilePath() per Stack's comment. Everything else is the same as patch v1.

        Please review again.
        Thanks!

        Lars Hofhansl added a comment -

        @chunhui: Could you respond to Stack's last comment? I would like to get 0.94.0RC1 out soon.

        Lars Hofhansl added a comment -

        All tests pass except
        org.apache.hadoop.hbase.mapreduce.TestImportTsv.testMROnTable and org.apache.hadoop.hbase.mapreduce.TestImportTsv.testMROnTableWithCustomMapper, which fail with the known NumberFormatException.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12521382/HBASE-5689.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper
        org.apache.hadoop.hbase.mapreduce.TestImportTsv
        org.apache.hadoop.hbase.mapred.TestTableMapReduce
        org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
        org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1390//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1390//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1390//console

        This message is automatically generated.

        Lars Hofhansl added a comment -

        Reattaching for test run.

        Lars Hofhansl added a comment -

        +1 on patch.

        stack added a comment -

        Good one Chunhui. I think the patch is good.

        Nice reproduction of the problem in a test. Where in the test do you verify that we've lost the third edit?

        So we name the file for its first edit when we write it, and then, when we move it into place, we rename it to be by the last edit in the file? Add a comment to that effect, I'd say, else it could be confusing. Hmm... I suppose you have it in the doc for getCompletedRecoveredEditsFilePath. That's probably good enough... but there is no harm in explaining why we go from naming the file with the first edit to naming it for the last edit.

        Ted Yu added a comment -

        @Chunhui:
        Hadoop QA isn't picking up any patches from this JIRA.

        Please run through test suite and let us know the result.

        ramkrishna.s.vasudevan added a comment -

        @Chunhui
        +1 on patch Chunhui. I was a bit hesitant to change that main logic, but after checking different scenarios this change is necessary both for performance and to address this JIRA. Good one. Thanks a lot.

        Ted Yu added a comment -

        Re-attaching Chunhui's patch

        Ted Yu added a comment -

        Using a TreeMap is common practice.

        Please attach test suite result - Hadoop QA is not working.

        chunhui shen added a comment -

        Is a TreeMap needed above ? We're just remembering the mapping, right ?

        I first used a ConcurrentHashMap, but entries for the same region name mapped to different values, because byte[] keys use identity-based equals/hashCode.

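        The byte[] key problem can be seen in a small standalone sketch. The comparator below is a stand-in for HBase's Bytes.BYTES_COMPARATOR (unsigned lexicographic ordering), not the real implementation: a hash-based map keyed on byte[] uses identity equals/hashCode, so two arrays with identical contents become two separate entries, while a TreeMap with a content comparator treats them as one key.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class ByteArrayKeyDemo {
    // Stand-in for HBase's Bytes.BYTES_COMPARATOR: unsigned, byte-by-byte
    // lexicographic comparison, shorter array first on a shared prefix.
    static final Comparator<byte[]> BYTES_COMPARATOR = (a, b) -> {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    };

    // Put the same logical key twice: equal contents, distinct array objects.
    static int sizeAfterTwoPuts(Map<byte[], Long> map) {
        map.put("region-1".getBytes(), 5L);
        map.put("region-1".getBytes(), 7L);
        return map.size();
    }

    public static void main(String[] args) {
        // byte[] inherits identity equals/hashCode, so a hash map sees two keys.
        System.out.println(sizeAfterTwoPuts(new ConcurrentHashMap<>())); // 2
        // A TreeMap comparing contents sees one key, as intended.
        System.out.println(sizeAfterTwoPuts(new TreeMap<>(BYTES_COMPARATOR))); // 1
    }
}
```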
        Ted Yu added a comment -

        Some minor comments about Chunhui's patch:

        +          LOG.debug("Rename " + wap.p + " to " + dst);
        

        The log should begin with "Renamed ".

        +    private final Map<byte[], Long> regionMaximumEditLogSeqNum = Collections
        +        .synchronizedMap(new TreeMap<byte[], Long>(Bytes.BYTES_COMPARATOR));
        

        Is a TreeMap needed above ? We're just remembering the mapping, right ?

        chunhui shen added a comment -

        @Ram
        The analysis is correct.
        However, I think your solution does not use the optimization made in HBASE-4797.
        In my HBASE-5689.patch, I just change the edit log file name to the maximum edit log sequence number.
        In this issue, with the patch we will not skip the RecoveredEdits file, because we shouldn't; in other cases, however, the skip still helps. For example, if we first put a lot of data on RS1, then move the region to RS2 and put a lot of data again, and both RS1 and RS2 die, we would still skip the edit log file from RS1.

        Chunhui's patch also makes sense but any way the idea is not to skip any recovered.edits file.

        So that is not quite right: my approach keeps skipping recovered.edits files, except in cases like this issue.

        Ted Yu added a comment -

        Removing the check should be enough.

        With the attached patch, TestHRegion#testDataCorrectnessReplayingRecoveredEdits passes.

        ramkrishna.s.vasudevan added a comment -

        Making it critical as it is data loss related.

        ramkrishna.s.vasudevan added a comment -

        @Ted
        What do you feel, Ted? Chunhui's patch also makes sense, but either way the idea is not to skip any recovered.edits file, so I felt removing the check would be enough.

        Ted Yu added a comment -

        From the above analysis, it looks like this bug was caused by the optimization made in HBASE-4797.

        ramkrishna.s.vasudevan added a comment -

        I just want to understand: removing the following check

              if (maxSeqId <= minSeqId) {
                  String msg = "Maximum possible sequenceid for this log is " + maxSeqId
                      + ", skipped the whole file, path=" + edits;
                  LOG.debug(msg);
                  continue;
              }
        

        should solve the problem. Inside replayRecoveredEdits we have this code

                 // Now, figure if we should skip this edit.
                  if (key.getLogSeqNum() <= currentEditSeqId) {
                    skippedEdits++;
                    continue;
                  }
        

        which will skip unnecessary edits anyway. I tried commenting out the 'continue' in the whole-file check and the test case that you gave passed. That is my thought; your comments are welcome.

        ramkrishna.s.vasudevan added a comment -

        @Chunhui
        Thanks for the patch.
        Let me share my analysis.
        -> Suppose the current sequence id for RS1 is 4.
        When the first KV(r1->v1) is inserted, the current seq becomes 5.
        On moving the region to RS2, the store gets flushed, and when RS2 opens the region the next seq it can use is 6.
        Now we make the next entry KV(r2->v2), so the current seq for this entry is 6. When the region is moved back to RS1, another store file is created by RS2.
        Now when RS1 opens the region, the seq number it can use is 7.

        We now add an entry KV(r3->v3) on RS1, so the WAL will record it with seq 7.

        Kill RS2 first. This creates a recovered.edits file named 000006.

        Kill RS1. This creates a recovered.edits file named 000005.

        Now when the region is finally opened on a new RS, I will have the two store files, and the max seq id from them will be 6. The recovered.edits will also give me 6 as the highest seq, so the following check skips the whole file from RS1:

              if (maxSeqId <= minSeqId) {
                  String msg = "Maximum possible sequenceid for this log is " + maxSeqId
                      + ", skipped the whole file, path=" + edits;
                  LOG.debug(msg);
                  continue;
              }
        

        Correct me if my analysis is wrong.

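        The scenario above can be reduced to a small standalone sketch (a simplified model of HRegion#replayRecoveredEditsIfAny, not the actual HBase code): files are keyed by the seq id in their name, the next-higher file name bounds the current file's max seq id, and the whole file is skipped if that bound is at or below the seq id already recovered from the store files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SkipCheckDemo {
    // Replay recovered.edits files modeled as name -> list of edit seq ids.
    static List<Long> replay(NavigableMap<Long, List<Long>> filesByName, long minSeqId) {
        List<Long> replayed = new ArrayList<>();
        for (Long name : filesByName.navigableKeySet()) {
            // Bound this file's max seq id by the name of the next file.
            Long higher = filesByName.navigableKeySet().higher(name);
            long maxSeqId = (higher != null) ? higher : Long.MAX_VALUE;
            if (maxSeqId <= minSeqId) continue;      // whole-file skip (the bug)
            for (long seq : filesByName.get(name)) {
                if (seq <= minSeqId) continue;       // per-edit skip (safe)
                replayed.add(seq);
            }
        }
        return replayed;
    }

    public static void main(String[] args) {
        long minSeqId = 6; // max seq id recovered from the store files

        // Pre-patch naming: file named for its FIRST edit. RS1's file {5, 7}
        // is named 5; RS2's file {6} is named 6. File 5 is skipped wholesale,
        // losing edit 7 (KV r3->v3).
        NavigableMap<Long, List<Long>> byFirst = new TreeMap<>();
        byFirst.put(5L, List.of(5L, 7L));
        byFirst.put(6L, List.of(6L));
        System.out.println(replay(byFirst, minSeqId)); // []

        // Patched naming: file named for its LAST edit, so RS1's file is
        // named 7, survives the whole-file check, and edit 7 is replayed.
        NavigableMap<Long, List<Long>> byLast = new TreeMap<>();
        byLast.put(6L, List.of(6L));
        byLast.put(7L, List.of(5L, 7L));
        System.out.println(replay(byLast, minSeqId)); // [7]
    }
}
```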
        chunhui shen added a comment -

        That is, rather than the current MinimumEditLogSeqNum as the file name.

        chunhui shen added a comment -

        In the patch,
        I use the region's MaximumEditLogSeqNum in the RecoveredEdits file as the file name.

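        The naming idea can be sketched as follows. This is a hypothetical illustration, not the actual patch code: helper names, the write-then-rename flow, and the zero-padded format are all assumptions; the real change lives around getCompletedRecoveredEditsFilePath in the log-splitting code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RecoveredEditsRenameDemo {
    // Zero-pad seq ids so lexicographic file ordering matches numeric order.
    static String editsFileName(long seqId) {
        return String.format("%019d", seqId);
    }

    // Hypothetical helper: the writer opens the file under the FIRST edit's
    // seq id, then on completion renames it to the LAST (maximum) seq id
    // written for the region, so the file name is a true upper bound.
    static Path completeRecoveredEdits(Path inProgress, long maxSeqId) throws IOException {
        Path dst = inProgress.resolveSibling(editsFileName(maxSeqId));
        return Files.move(inProgress, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        // RS1's file starts out named for its first edit, seq 5...
        Path inProgress = Files.createFile(dir.resolve(editsFileName(5L)));
        // ...but it also contains edit 7, so it is renamed on completion.
        Path done = completeRecoveredEdits(inProgress, 7L);
        System.out.println(done.getFileName()); // 0000000000000000007
    }
}
```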
        ramkrishna.s.vasudevan added a comment - edited

        @Chunhui

        Good test case. Yes, I am able to reproduce the problem. Just trying to understand more about what is happening there.

        chunhui shen added a comment -

        I have written a test case for the issue that produces a data loss scenario when RecoveredEdits are skipped.


          People

          • Assignee: chunhui shen
          • Reporter: chunhui shen
          • Votes: 0
          • Watchers: 4
