Hadoop HDFS / HDFS-10285 Storage Policy Satisfier in HDFS / HDFS-13166

[SPS]: Implement caching mechanism to keep LIVE datanodes to minimize costly getLiveDatanodeStorageReport() calls

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: HDFS-10285, 3.2.0
    • Component/s: None
    • Labels: None

    Description

      Presently, #getLiveDatanodeStorageReport() is called for every file, and the computation is repeated each time. This Jira sub-task is to discuss and implement a caching mechanism that reduces the number of these calls. We could also define a configurable refresh interval and periodically refresh the DN cache by fetching the latest #getLiveDatanodeStorageReport() result at that interval.
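
      The idea of a cache with a configurable refresh interval can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class name RefreshingCache, the injected clock, and the loader are all hypothetical. In SPS, the loader would presumably wrap the costly #getLiveDatanodeStorageReport() call, and the interval would come from configuration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical time-based cache (an assumption for illustration; this is
 * not the class introduced by the HDFS-13166 patches).
 * The costly loader is invoked only when the cached value is older than
 * the refresh interval; all other callers get the cached ("fuzzy") view.
 */
final class RefreshingCache<T> {
  private final Supplier<T> loader;       // the costly call to avoid repeating
  private final long refreshIntervalMs;   // configurable refresh interval
  private final LongSupplier clockMs;     // injectable clock, eases testing

  private T cached;
  private long lastRefreshMs;
  private boolean loaded;

  RefreshingCache(Supplier<T> loader, long refreshIntervalMs,
      LongSupplier clockMs) {
    this.loader = loader;
    this.refreshIntervalMs = refreshIntervalMs;
    this.clockMs = clockMs;
  }

  /** Returns the cached value, reloading it first if it is stale. */
  synchronized T get() {
    long now = clockMs.getAsLong();
    if (!loaded || now - lastRefreshMs >= refreshIntervalMs) {
      cached = loader.get();
      lastRefreshMs = now;
      loaded = true;
    }
    return cached;
  }
}
```

      With such a wrapper, per-file processing would call get() instead of fetching a fresh report, so the expensive call runs at most once per interval. The trade-off is that callers may see a view of the live datanodes that is stale by up to one interval.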

      The following comments are taken from HDFS-10285 (Comment-7):

      Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap.

      verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap?

      Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice.

      Attachments

        1. HDFS-13166-HDFS-10285-03.patch
          57 kB
          Rakesh Radhakrishnan
        2. HDFS-13166-HDFS-10285-02.patch
          55 kB
          Rakesh Radhakrishnan
        3. HDFS-13166-HDFS-10285-01.patch
          54 kB
          Rakesh Radhakrishnan
        4. HDFS-13166-HDFS-10285-00.patch
          46 kB
          Rakesh Radhakrishnan

          People

            Assignee: Rakesh Radhakrishnan (rakeshr)
            Reporter: Rakesh Radhakrishnan (rakeshr)
            Votes: 0
            Watchers: 5