Hadoop HDFS / HDFS-1121

Allow HDFS client to measure distribution of blocks across devices for a specific DataNode

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs-client
    • Labels:
      None

      Description

      As discussed on the mailing list, it would be useful if the DFSClient could measure the distribution of blocks across devices for an individual DataNode.

          Activity

          Travis Crawford added a comment -

          Should clients care where their blocks get placed on disk? That seems like something a datanode should be solely responsible for.

          Steve Loughran added a comment -

          The DN should do placement, but we need to be able to see where blocks have been placed as a precursor to any attempt to manage placement better, ideally in some web UI that shows how balanced or unbalanced servers have become.

          Clients need to be able to do this so that when your sort of 150 TB of data takes longer than the spreadsheet says, your ops team can be confident that the problem isn't due to bad block placement on your 12-HDD servers, but to something else (networking, etc.).

          Wang Xu added a comment -

          Hi folks,

          As discussed in HDFS-1312, we need a JSP on the datanode that shows local filesystem usage information. Beyond that, do we need a centralized collection mechanism with which we can get alerts for unbalanced datanodes?

          Wang Xu added a comment -

          Is a servlet that reports df and du OK? I just added a servlet that returns XML. Each volume in it would look like this:

          <org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume MountPoint="/state/partion1" Capacity="223625277440" DfsUsed="24576" Available="40881012736"/>
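
A monitoring script could consume one such element roughly as follows. This is only a sketch, not the servlet's documented format: the helper name parse_volume and the derived fields are hypothetical, and it assumes that Capacity, DfsUsed and Available are byte counts, with the remainder of Capacity taken up by non-HDFS data. The "$" in the element name is not a legal XML name character, so a strict XML parser may reject the element; the sketch extracts the attributes with a regular expression instead.

```python
import re

# Sample element copied from the comment above.
SAMPLE = ('<org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume '
          'MountPoint="/state/partion1" Capacity="223625277440" '
          'DfsUsed="24576" Available="40881012736"/>')

# Matches name="value" attribute pairs; the tag name itself is ignored.
ATTR = re.compile(r'(\w+)="([^"]*)"')


def parse_volume(element):
    """Return the per-volume figures from one FSVolume element (hypothetical helper)."""
    attrs = dict(ATTR.findall(element))
    capacity = int(attrs["Capacity"])
    dfs_used = int(attrs["DfsUsed"])
    available = int(attrs["Available"])
    return {
        "mount_point": attrs["MountPoint"],
        "capacity": capacity,
        "dfs_used": dfs_used,
        "available": available,
        # Assumption: whatever is neither HDFS data nor free is other data.
        "other_used": capacity - dfs_used - available,
        "available_fraction": available / capacity,
    }


volume = parse_volume(SAMPLE)
print(volume["mount_point"], volume["other_used"], volume["available_fraction"])
```

Comparing available_fraction across the volumes of one datanode would give a rough picture of how evenly its disks are filled.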

          Wang Xu added a comment -

          Initial version of the servlet code. I wonder whether it implements the function required by this issue.

          Harsh J added a comment -

          Isn't monitoring http://DNHOST:50075/jmx?qry=hadoop:service=DataNode,name=DataNodeInfo individually sufficient for getting volume-info results, to measure disk-level balance for your clusters?

          I don't think we should increase the DN heartbeat payload when this is already exposed at the DN level.
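
Measuring disk-level balance from that JMX output could look like the sketch below. It rests on assumptions that are not confirmed in this thread: that the servlet returns a JSON object with a "beans" list, and that the DataNodeInfo bean carries a VolumeInfo attribute mapping each volume path to usedSpace and freeSpace byte counts (possibly encoded as a JSON string). The function names and the sample payload are hypothetical, and fetching the URL is left out.

```python
import json


def volume_usage(jmx_text):
    """Map each volume path to its used fraction (hypothetical helper)."""
    beans = json.loads(jmx_text)["beans"]
    info = next(bean for bean in beans if "VolumeInfo" in bean)
    volumes = info["VolumeInfo"]
    if isinstance(volumes, str):
        # Assumption: VolumeInfo may arrive as a JSON-encoded string.
        volumes = json.loads(volumes)
    usage = {}
    for path, stats in volumes.items():
        total = stats["usedSpace"] + stats["freeSpace"]
        usage[path] = stats["usedSpace"] / total if total else 0.0
    return usage


def spread(usage):
    """Difference between the fullest and the emptiest volume."""
    return max(usage.values()) - min(usage.values())


# Hypothetical payload in the assumed shape, with made-up numbers.
SAMPLE = json.dumps({"beans": [{
    "name": "hadoop:service=DataNode,name=DataNodeInfo",
    "VolumeInfo": json.dumps({
        "/data/1": {"usedSpace": 75, "freeSpace": 25},
        "/data/2": {"usedSpace": 25, "freeSpace": 75},
    }),
}]})

usage = volume_usage(SAMPLE)
print(usage, spread(usage))
```

A large spread on a single datanode would point at uneven block placement across its disks, which is the case the ops team wants to rule out.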


            People

            • Assignee:
              Unassigned
              Reporter:
              Jeff Hammerbacher
            • Votes:
              0
              Watchers:
              13
