  1. Hadoop HDFS
  2. HDFS-6292

Display HDFS per user and per group usage on the webUI

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      It would be nice to show HDFS usage per user and per group on the web UI.

      1. HDFS-6292.png (99 kB, Ravi Prakash)
      2. HDFS-6292.patch (356 kB, Ravi Prakash)
      3. HDFS-6292.01.patch (363 kB, Ravi Prakash)

        Activity

        raviprak Ravi Prakash added a comment -

        Since we probably don't want to place the burden of calculating this on an already busy NN, we can compute it from the fsimage on the SNN. Of course, the downside is that the numbers are refreshed only when a checkpoint happens.
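
A minimal sketch of the offline approach described above, assuming the checkpoint has already been flattened into one record per file. `FileEntry`, `sumBy`, and the sample data are hypothetical names made up for illustration; they are not part of the attached patch or of any Hadoop API, and the byte counts here ignore replication.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class CheckpointUsageSketch {
    /** Hypothetical flattened view of one file inode read from a checkpoint. */
    static final class FileEntry {
        final String owner;
        final String group;
        final long bytes; // logical file length; replication is ignored here

        FileEntry(String owner, String group, long bytes) {
            this.owner = owner;
            this.group = group;
            this.bytes = bytes;
        }
    }

    /** Sums file sizes under whatever key the caller extracts (owner or group). */
    static Map<String, Long> sumBy(List<FileEntry> entries,
                                   Function<FileEntry, String> key) {
        Map<String, Long> totals = new HashMap<>();
        for (FileEntry e : entries) {
            totals.merge(key.apply(e), e.bytes, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a parsed checkpoint.
        List<FileEntry> entries = Arrays.asList(
            new FileEntry("alice", "eng", 100L),
            new FileEntry("alice", "ops", 50L),
            new FileEntry("bob", "eng", 25L));
        System.out.println("per user:  " + sumBy(entries, e -> e.owner));
        System.out.println("per group: " + sumBy(entries, e -> e.group));
    }
}
```

Because the whole pass runs outside the NameNode, it costs nothing on the serving path, but the totals are only as fresh as the last checkpoint.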

        vinayrpet Vinayakumar B added a comment -

        Good one Ravi.

        I think calculating on the Secondary NN side is OK. But I feel that making users navigate to the SNN page just to get these statistics is not a good idea.
        How about keeping track of these on the NameNode side from startup and updating the statistics (same as other metrics) on every operation that modifies them? That would avoid recalculating the whole statistics in between, and so avoid holding the namesystem lock for longer.
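
The incremental idea above could be sketched roughly as follows. `UsageTracker` and its `onCreate` / `onDelete` / `onChown` hooks are hypothetical names used only for illustration, not NameNode code. The sketch assumes the caller already serializes updates (for example by holding the namesystem write lock), so it uses plain maps.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-user / per-group byte counters, updated on each modifying op. */
final class UsageTracker {
    private final Map<String, Long> bytesByUser = new HashMap<>();
    private final Map<String, Long> bytesByGroup = new HashMap<>();

    /** Called when a file of the given size is created. */
    void onCreate(String owner, String group, long bytes) {
        adjust(owner, group, bytes);
    }

    /** Called when a file of the given size is deleted. */
    void onDelete(String owner, String group, long bytes) {
        adjust(owner, group, -bytes);
    }

    /** Called on chown/chgrp: move the file's bytes from the old owner and group to the new ones. */
    void onChown(String oldOwner, String oldGroup,
                 String newOwner, String newGroup, long bytes) {
        adjust(oldOwner, oldGroup, -bytes);
        adjust(newOwner, newGroup, bytes);
    }

    long userUsage(String user) {
        return bytesByUser.getOrDefault(user, 0L);
    }

    long groupUsage(String group) {
        return bytesByGroup.getOrDefault(group, 0L);
    }

    // Each update is O(1), so no full recomputation pass is ever needed.
    private void adjust(String owner, String group, long delta) {
        bytesByUser.merge(owner, delta, Long::sum);
        bytesByGroup.merge(group, delta, Long::sum);
    }
}
```

Every operation that changes a file's size or ownership would need a matching hook (append, truncate, rename across owners, and so on); any missed path leaves the counters wrong until the next full recount.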

        raviprak Ravi Prakash added a comment -

        Hi Vinayakumar!
        Thanks for your feedback! I considered that option, and I wondered what the overhead might be (during startup + every modifying op). I guess we won't really know unless we have a working prototype/solution.
        I did this as a side hack, so I can try to continue hacking on this at a very slow pace, or if you / someone wants to take it over and get it done sooner, please feel free to assign it to yourself.

        raviprak Ravi Prakash added a comment -

        Ok! Here's the skeleton code that has come out of my attempt to add this functionality to the NameNode. DISCLAIMER: This patch is not ready and I'm uploading it only so that you folks can see what I'm thinking so far.

        I would request feedback on the following (and whatever else you think of):
        1. Should HdfsUsageMetricsSource be thread safe? Should I just assume the FSN write lock is always held when calling into here?
        2. I understand that we need to plug into a LOT of places to correctly update the stats. I have only plugged into 2-3 places, so the usage will obviously be incorrect if you venture outside those ops (create / delete / chown of files and dirs), and even these have wrinkles I need to smooth out. I propose we do all of this as another sub-task after the framework gets committed.
        3. I still need to figure out how best to let this be configurable for any of the HDFS daemons: NameNode/Standby/SecondaryNamenode.
        4. We need a way to enable and disable this feature dynamically.
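
On question 1, one option is to make the counters safe on their own rather than rely on the FSN write lock. The sketch below is an assumption about how that could look, not the `HdfsUsageMetricsSource` from the patch; `ConcurrentUsageSketch`, `add`, `usage`, and `hammer` are hypothetical names. It uses only standard `java.util.concurrent` classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical thread-safe per-user byte counter. */
final class ConcurrentUsageSketch {
    private final ConcurrentMap<String, LongAdder> bytesByUser =
        new ConcurrentHashMap<>();

    /** Safe to call from many threads without any external lock. */
    void add(String user, long deltaBytes) {
        bytesByUser.computeIfAbsent(user, k -> new LongAdder()).add(deltaBytes);
    }

    /** Returns the current total; exact once all writers have finished. */
    long usage(String user) {
        LongAdder adder = bytesByUser.get(user);
        return adder == null ? 0L : adder.sum();
    }

    /** Small self-check: several threads each add one byte many times. */
    static long hammer(int threads, int updatesPerThread) {
        ConcurrentUsageSketch sketch = new ConcurrentUsageSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < updatesPerThread; j++) {
                    sketch.add("alice", 1L);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.usage("alice");
    }
}
```

The trade-off is that each counter is individually consistent, but a reader may see a chown half-applied (bytes removed from one user, not yet added to the other). If every update already happens under the write lock, plain maps are simpler and avoid that window.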

        szetszwo Tsz Wo Nicholas Sze added a comment -

        > I think calculating on the Secondary NN side is OK. But I feel that making users navigate to the SNN page just to get these statistics is not a good idea.

        How about adding a link to SNN in the namenode web page?


          People

          • Assignee: raviprak Ravi Prakash
          • Reporter: raviprak Ravi Prakash
          • Votes: 0
          • Watchers: 12
