Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-5464

Simplify block report diff calculation

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation.

      1. h5464_20131105.patch
        12 kB
        Tsz Wo Nicholas Sze
      2. h5464_20131105b.patch
        12 kB
        Tsz Wo Nicholas Sze
      3. h5464_20131105c.patch
        11 kB
        Tsz Wo Nicholas Sze

        Activity

        Hide
        Tsz Wo Nicholas Sze added a comment -

        The simplification does not work well – it uses more memory and longer running time. Resolving as Won't Fix.

        Show
        Tsz Wo Nicholas Sze added a comment - The simplification does not work well – it uses more memory and longer running time. Resolving as Won't Fix.
        Hide
        Konstantin Shvachko added a comment -

        > you won't argue that the new code is simpler than the existing

        Agreed on simpler.

        > see if I could come up a better solution

        Sure would be interesting to see. I doubt much can be done in this respect. We need to find blocks that did not appear in the report: in one pass and with constant memory overhead. May be an interview question.

        Show
        Konstantin Shvachko added a comment - > you won't argue that the new code is simpler than the existing Agreed on simpler. > see if I could come up a better solution Sure would be interesting to see. I doubt much can be done in this respect. We need to find blocks that did not appear in the report: in one pass and with constant memory overhead. May be an interview question.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        h5464_20131105[bc].patch won't work. h5464_20131105.patch works but it uses more memory and longer running time.

        I agree with Konstantin that the original code is better although it is complicated. I will leave this for awhile and see if I could come up a better solution.

        Show
        Tsz Wo Nicholas Sze added a comment - h5464_20131105 [bc] .patch won't work. h5464_20131105.patch works but it uses more memory and longer running time. I agree with Konstantin that the original code is better although it is complicated. I will leave this for awhile and see if I could come up a better solution.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12612309/h5464_20131105b.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
        org.apache.hadoop.hdfs.TestLeaseRecovery2

        The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
        org.apache.hadoop.hdfs.server.namenode.TestFsck

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612309/h5464_20131105b.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. -1 findbugs . The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.TestLeaseRecovery2 The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks org.apache.hadoop.hdfs.server.namenode.TestFsck +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        h5464_20131105c.patch: reverts the changes causing the findbugs warning.

        Show
        Tsz Wo Nicholas Sze added a comment - h5464_20131105c.patch: reverts the changes causing the findbugs warning.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12612253/h5464_20131105.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        -1 release audit. The applied patch generated 1 release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//testReport/
        Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612253/h5464_20131105.patch against trunk revision . +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . The javadoc tool did not generate any warning messages. -1 eclipse:eclipse . The patch failed to build with eclipse:eclipse. -1 findbugs . The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. -1 release audit . The applied patch generated 1 release audit warnings. +1 core tests . The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests . The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//console This message is automatically generated.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Here is the correct file: h5464_20131105b.patch

        Show
        Tsz Wo Nicholas Sze added a comment - Here is the correct file: h5464_20131105b.patch
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Actually, it is unnecessarily to add all the blocks to the remove list. Here is a new patch.

        h5466_20131105b.patch

        Show
        Tsz Wo Nicholas Sze added a comment - Actually, it is unnecessarily to add all the blocks to the remove list. Here is a new patch. h5466_20131105b.patch
        Hide
        Tsz Wo Nicholas Sze added a comment -

        Hi Konstantin,

        You may be correct that the new code use more memory, however, I beg you won't argue that the new code is simpler than the existing code.

        I will think about how to reduce the memory usage. Thanks for the input.

        Show
        Tsz Wo Nicholas Sze added a comment - Hi Konstantin, You may be correct that the new code use more memory, however, I beg you won't argue that the new code is simpler than the existing code. I will think about how to reduce the memory usage. Thanks for the input.
        Hide
        Konstantin Shvachko added a comment -

        So in regular case you will be adding 100,000 replicas to toRemove list only in order to delete most of them later. How does it make things simpler?
        The delimiter lets you keep the calculated lists as small as possible, reducing memory consumption, avoiding frequent GCs.

        Show
        Konstantin Shvachko added a comment - So in regular case you will be adding 100,000 replicas to toRemove list only in order to delete most of them later. How does it make things simpler? The delimiter lets you keep the calculated lists as small as possible, reducing memory consumption, avoiding frequent GCs.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        h5464_20131105.patch: remove the delimiter logic.

        Show
        Tsz Wo Nicholas Sze added a comment - h5464_20131105.patch: remove the delimiter logic.

          People

          • Assignee:
            Tsz Wo Nicholas Sze
            Reporter:
            Tsz Wo Nicholas Sze
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development