Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1910

when dfs.name.dir and dfs.name.edits.dir are same fsimage will be saved twice every time

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      when image and edits dir are configured same, the fsimage flushing from memory to disk will be done twice whenever saveNamespace is done. this may impact the performance of backupnode/snn where it does a saveNamespace during every checkpointing time.

      1. saveImageOnce-v1.1.patch
        0.7 kB
        Konstantin Shvachko
      2. saveImageOnce-v0.22.patch
        0.7 kB
        Konstantin Shvachko

        Issue Links

          Activity

          Hide
          Konstantin Shvachko added a comment -

          This is indeed a critical issue, which slows down cluster start up, because the image is written twice for every storage directory, which is of type IMAGE_AND_EDITS.
          I traced this problem from 0.22 all the way back to 0.20. On trunk it is not a problem as far as I can see.
          In 0.22 branch and below saveNamespace() iterates over all storage directories and updates either image or edits or both files depending on the directory type. That way if a failure occurs in the middle only one directory will need to be recovered, the rest will maintain consistent state.
          The trunk differs because saveNamespace() independently updates first image directories then edits directories, which is valid because of txIds. If a failure occurs in the middle of updating the matching pair of image and edits in each directory should be found based on txIds.

          Show
          Konstantin Shvachko added a comment - This is indeed a critical issue, which slows down cluster start up, because the image is written twice for every storage directory, which is of type IMAGE_AND_EDITS. I traced this problem from 0.22 all the way back to 0.20. On trunk it is not a problem as far as I can see. In 0.22 branch and below saveNamespace() iterates over all storage directories and updates either image or edits or both files depending on the directory type. That way if a failure occurs in the middle only one directory will need to be recovered, the rest will maintain consistent state. The trunk differs because saveNamespace() independently updates first image directories then edits directories, which is valid because of txIds. If a failure occurs in the middle of updating the matching pair of image and edits in each directory should be found based on txIds.
          Hide
          Matt Foley added a comment -

          Please use the "Target Versions" field, not the "Fix Version" field, to express where a bug should be fixed or where you intend to fix it. The "Fix Version" field is for saying where it has been fixed, and should only be set after committing a fix to a branch.

          Also, since 1.0.0 is already in vote, this should be targeted for 1.1.0.

          Show
          Matt Foley added a comment - Please use the "Target Versions" field, not the "Fix Version" field, to express where a bug should be fixed or where you intend to fix it. The "Fix Version" field is for saying where it has been fixed, and should only be set after committing a fix to a branch. Also, since 1.0.0 is already in vote, this should be targeted for 1.1.0.
          Hide
          Konstantin Shvachko added a comment -

          Here is a simple fix for branch 0.22.

          Show
          Konstantin Shvachko added a comment - Here is a simple fix for branch 0.22.
          Hide
          Ted Yu added a comment -

          Patch makes sense.

          Show
          Ted Yu added a comment - Patch makes sense.
          Hide
          Konstantin Boudnik added a comment -

          +1 patch looks good.

          Show
          Konstantin Boudnik added a comment - +1 patch looks good.
          Hide
          Aaron T. Myers added a comment -

          Shouldn't we include a test to ensure this issue doesn't regress?

          Show
          Aaron T. Myers added a comment - Shouldn't we include a test to ensure this issue doesn't regress?
          Hide
          Konstantin Shvachko added a comment -

          Sorry had a bad commit day here.
          I did not include a test here because this was revealed through watching log files during NN startup, which indicates to me that the condition is easily traceable. And the test would have been based either on logs analysis or some instrumentation of FSImage, which would count methods calls. Neither looks appealing to me, but feel free to file a bug if you feel strongly about it.

          Show
          Konstantin Shvachko added a comment - Sorry had a bad commit day here. I did not include a test here because this was revealed through watching log files during NN startup, which indicates to me that the condition is easily traceable. And the test would have been based either on logs analysis or some instrumentation of FSImage, which would count methods calls. Neither looks appealing to me, but feel free to file a bug if you feel strongly about it.
          Hide
          Konstantin Shvachko added a comment -

          I just committed this to 0.22 branch. Leaving open pending decision on 1.1.0

          Show
          Konstantin Shvachko added a comment - I just committed this to 0.22 branch. Leaving open pending decision on 1.1.0
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-22-branch #123 (See https://builds.apache.org/job/Hadoop-Hdfs-22-branch/123/)
          Remove erroneously added file while commit to HDFS-1910.
          HDFS-1910. NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko.
          Revert. Refers to wrong jira HDFS-1910.
          HDFS-1910. NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko.

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225342
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225337
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225336
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225333
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-22-branch #123 (See https://builds.apache.org/job/Hadoop-Hdfs-22-branch/123/ ) Remove erroneously added file while commit to HDFS-1910 . HDFS-1910 . NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko. Revert. Refers to wrong jira HDFS-1910 . HDFS-1910 . NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko. shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225342 Files : /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225337 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225336 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225333 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-22-branch #124 (See https://builds.apache.org/job/Hadoop-Hdfs-22-branch/124/)
          Remove erroneously added file while commit to HDFS-1910.
          HDFS-1910. NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko.
          Revert. Refers to wrong jira HDFS-1910.
          HDFS-1910. NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko.

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225342
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225337
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225336
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java

          shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225333
          Files :

          • /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
          • /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-22-branch #124 (See https://builds.apache.org/job/Hadoop-Hdfs-22-branch/124/ ) Remove erroneously added file while commit to HDFS-1910 . HDFS-1910 . NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko. Revert. Refers to wrong jira HDFS-1910 . HDFS-1910 . NameNdoe should not save fsimage twice. Contributed by Konstantin Shvachko. shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225342 Files : /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225337 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/bin/hadoop /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225336 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225333 Files : /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java /hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestBackupNode.java
          Hide
          Jakob Homan added a comment -

          I'm fine with this going into 1.1 without a test. The error is obvious and the fix equally so. Konstantin, do you want to commit it there?

          Show
          Jakob Homan added a comment - I'm fine with this going into 1.1 without a test. The error is obvious and the fix equally so. Konstantin, do you want to commit it there?
          Hide
          Aaron T. Myers added a comment -

          The error is obvious and the fix equally so.

          The error clearly isn't all that obvious, given that this bug existed for quite a while without being noticed/fixed.

          I think we should have a test mostly to ensure that this issue doesn't regress. This is a bad habit to get into. That said, I'm certainly not going to block this fix going in for lack of test coverage.

          Show
          Aaron T. Myers added a comment - The error is obvious and the fix equally so. The error clearly isn't all that obvious, given that this bug existed for quite a while without being noticed/fixed. I think we should have a test mostly to ensure that this issue doesn't regress. This is a bad habit to get into. That said, I'm certainly not going to block this fix going in for lack of test coverage.
          Hide
          Jakob Homan added a comment -

          The error clearly isn't all that obvious, given that this bug existed for quite a while without being noticed/fixed.

          Not really. Something being done twice is easier to overlook than something not being done. Assuming the second time overwrites the first, you end up with what you expect: one copy of the fsimage.

          Show
          Jakob Homan added a comment - The error clearly isn't all that obvious, given that this bug existed for quite a while without being noticed/fixed. Not really. Something being done twice is easier to overlook than something not being done. Assuming the second time overwrites the first, you end up with what you expect: one copy of the fsimage.
          Hide
          Aaron T. Myers added a comment -

          @Jakob - sorry, I wasn't clear. The error in the code is indeed fairly obvious. My point was that the reintroduction of this bug would be non-obvious since, as you point out, this operation is idempotent.

          Show
          Aaron T. Myers added a comment - @Jakob - sorry, I wasn't clear. The error in the code is indeed fairly obvious. My point was that the reintroduction of this bug would be non-obvious since, as you point out, this operation is idempotent.
          Hide
          Jakob Homan added a comment -

          It's up to Konstantin to commit to 1.1 if he wants; he has a committer's +1 to do so. This will be a serious pain to test in a true unit test, so rather than wait for someone to write that code, I'd rather get the signficant startup time improvement in. Yeah, it'd be great to have a test - and if it were at all possible this could cause a data loss, I'd insist on it - but people's time is limited. I'd recommend filing a JIRA to add one; it would be a good mid-level newbie issue.

          Show
          Jakob Homan added a comment - It's up to Konstantin to commit to 1.1 if he wants; he has a committer's +1 to do so. This will be a serious pain to test in a true unit test, so rather than wait for someone to write that code, I'd rather get the signficant startup time improvement in. Yeah, it'd be great to have a test - and if it were at all possible this could cause a data loss, I'd insist on it - but people's time is limited. I'd recommend filing a JIRA to add one; it would be a good mid-level newbie issue.
          Hide
          Aaron T. Myers added a comment -

          It's up to Konstantin to commit to 1.1 if he wants; he has a committer's +1 to do so.

          I know that. Like I said - I'm not going to block it.

          This will be a serious pain to test in a true unit test,

          We do these sorts of tests all the time with Mockito.spy, Mockito.verify, and Mockito.times.

          so rather than wait for someone to write that code, I'd rather get the signficant startup time improvement in. Yeah, it'd be great to have a test - and if it were at all possible this could cause a data loss, I'd insist on it - but people's time is limited.

          Sure, no objection. I just asked for a test as I do for all code I review which doesn't have test coverage. It's your prerogative if you want to ignore that request, or punt it to another JIRA.

          Show
          Aaron T. Myers added a comment - It's up to Konstantin to commit to 1.1 if he wants; he has a committer's +1 to do so. I know that. Like I said - I'm not going to block it. This will be a serious pain to test in a true unit test, We do these sorts of tests all the time with Mockito.spy, Mockito.verify, and Mockito.times. so rather than wait for someone to write that code, I'd rather get the signficant startup time improvement in. Yeah, it'd be great to have a test - and if it were at all possible this could cause a data loss, I'd insist on it - but people's time is limited. Sure, no objection. I just asked for a test as I do for all code I review which doesn't have test coverage. It's your prerogative if you want to ignore that request, or punt it to another JIRA.
          Hide
          Konstantin Shvachko added a comment -

          I will commit it to 1.1 as you guys suggested.

          Show
          Konstantin Shvachko added a comment - I will commit it to 1.1 as you guys suggested.
          Hide
          Konstantin Shvachko added a comment -

          I just committed this to branch 1.

          Show
          Konstantin Shvachko added a comment - I just committed this to branch 1.
          Hide
          Matt Foley added a comment -

          Closed upon release of Hadoop-1.1.0.

          Show
          Matt Foley added a comment - Closed upon release of Hadoop-1.1.0.

            People

            • Assignee:
              Unassigned
              Reporter:
              Gokul
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development