Hadoop HDFS / HDFS-853

The HDFS webUI should show a metric that summarizes whether the cluster is balanced regarding disk space usage

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: namenode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      It is desirable to know how much the datanodes vary from one another in terms of space utilization, to get a sense of how well an HDFS cluster is balanced.

      1. Screen shot 2010-04-27 at 5.52.05 PM.png
        53 kB
        Dmytro Molkov
      2. HDFS-853.patch
        3 kB
        Dmytro Molkov

        Activity

        dhruba borthakur added a comment -

        We can calculate the variance of space-usage by parsing "dfsadmin -report" but it will be good to have it in the webUI. It will also be good to expose it via a Hadoop Metric.

        Konstantin Shvachko added a comment -

        How do you propose to measure the variance? Calculate the average utilization and then print, in a separate column, each node's difference from that average in percent?
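
        A minimal sketch of such a per-node column, assuming each data-node's utilization is already available as a percentage. The node names and numbers are invented for illustration, and `deviation_from_mean` is a hypothetical helper, not part of Hadoop:

```python
def deviation_from_mean(utilization):
    """Return (mean, {node: utilization - mean}) for per-node utilization percentages."""
    mean = sum(utilization.values()) / len(utilization)
    return mean, {node: pct - mean for node, pct in utilization.items()}


# Hypothetical data-nodes and their utilization percentages.
nodes = {"dn1": 40.0, "dn2": 50.0, "dn3": 60.0}
mean, deviations = deviation_from_mean(nodes)

for node, delta in sorted(deviations.items()):
    print(f"{node}: {nodes[node]:.1f}% used, {delta:+.1f} points from the {mean:.1f}% mean")
```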

        dhruba borthakur added a comment -

        I would like a cluster-wide measure that describes how well the cluster is balanced. The Cluster Summary prints total capacity, used disk space, etc., and it would be good to include another line that reflects how balanced the cluster is.

        Konstantin Shvachko added a comment -

        OK, so you want one line that reflects the cluster-wide balance rather than a deviation from the average for each data-node.
        How do you measure the balance? Do you have any particular measurement in mind?

        Andrew Ryan added a comment -

        We're currently graphing both the mean datanode utilization and the standard deviation of datanodes from that mean, using a script that parses the output of 'dfsadmin -report'. Our DFS cluster nodes all have the same amount of disk space, so you'd expect the mean of the individual datanodes to be the same as % DFS full, but it's not quite the same. Haven't yet looked into why this is so.

        To directly answer Konstantin's question, the one line we're using is standard deviation.
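
        A rough sketch of what such a script might look like. This is not the script referred to above. It assumes the report contains one `DFS Used%: <number>%` line per data-node; the sample text is invented for illustration rather than captured from a cluster, and `utilization_stats` is a hypothetical helper:

```python
import re
import statistics

# Assumed report shape: one "DFS Used%: <number>%" line per data-node.
# This sample is invented for illustration, not real command output.
SAMPLE_REPORT = """\
Name: 10.0.0.1:50010
DFS Used%: 40.00%

Name: 10.0.0.2:50010
DFS Used%: 50.00%

Name: 10.0.0.3:50010
DFS Used%: 60.00%
"""

USED_PCT = re.compile(r"^DFS Used%:\s*([0-9.]+)%", re.MULTILINE)


def utilization_stats(report_text):
    """Return (mean, population standard deviation) of the per-node used percentages."""
    values = [float(v) for v in USED_PCT.findall(report_text)]
    return statistics.mean(values), statistics.pstdev(values)


mean, stdev = utilization_stats(SAMPLE_REPORT)
print(f"mean={mean:.2f}% stdev={stdev:.2f}")
```

        A real report may also carry a cluster-wide summary line with the same label, which a script like this would have to skip before computing the statistics.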

        Konstantin Shvachko added a comment -

        Maybe we should use the mean and standard deviation of utilization rather than of raw disk space. This would work for heterogeneous clusters as well. By utilization I mean the percentage of disk space used for blocks on a data-node. We should also make sure this is consistent with the Balancer: balancing should improve the metrics.
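
        To illustrate why percentages suit heterogeneous clusters, here is a small sketch under stated assumptions: the capacities and usage figures are invented, `balance_summary` is a hypothetical helper, and the population standard deviation is one possible choice that the discussion does not fix. It is not the code from the attached patch.

```python
import statistics


def utilization_pct(used, capacity):
    """Percentage of a data-node's capacity that is in use."""
    return 100.0 * used / capacity


def balance_summary(nodes):
    """nodes: list of (used_bytes, capacity_bytes). Returns (mean %, std dev of %)."""
    pcts = [utilization_pct(used, cap) for used, cap in nodes]
    return statistics.mean(pcts), statistics.pstdev(pcts)


TB = 10**12
# Hypothetical heterogeneous cluster: capacities differ, every node is half full.
even = [(1 * TB, 2 * TB), (2 * TB, 4 * TB), (4 * TB, 8 * TB)]
# Same capacities and same total data (7 TB), but skewed toward the smallest node.
skewed = [(2 * TB, 2 * TB), (2 * TB, 4 * TB), (3 * TB, 8 * TB)]

print("even:   mean=%.1f%% stdev=%.2f" % balance_summary(even))
print("skewed: mean=%.1f%% stdev=%.2f" % balance_summary(skewed))
```

        In the evenly filled layout the standard deviation is zero even though the absolute bytes used differ by a factor of four, while the skewed layout of the same data scores much higher. Moving from the skewed layout to the even one is the kind of change that balancing would be expected to reward.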

        Dmytro Molkov added a comment -

        Attaching a patch and a screenshot of what it looks like. Please have a look at it.
        No unit test is included since this is purely a UI change, but if you can think of a way to test it, let me know.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12443028/HDFS-853.patch
        against trunk revision 937914.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/330/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/330/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/330/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/330/console

        This message is automatically generated.

        dhruba borthakur added a comment -

        Konstantin: does this patch address your concern of whether it will handle heterogeneous clusters?

        Dmytro Molkov added a comment -

        This patch should handle any cluster nicely, since it operates on the load percentage rather than on absolute values.

        dhruba borthakur added a comment -

        +1 code looks good.

        dhruba borthakur added a comment -

        I just committed this. Thanks Dmytro!


          People

          • Assignee: Dmytro Molkov
          • Reporter: dhruba borthakur
          • Votes: 0
          • Watchers: 4
