Hadoop HDFS / HDFS-8710

Always read DU value from the cached "dfsUsed" file on datanode startup

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently, the DataNode periodically caches the DU value in the "dfsUsed" file. When the DataNode starts or restarts, it reads the cached DU value from the "dfsUsed" file if that value is less than 600 seconds old; otherwise it runs the DU command, which is very time-consuming (it may take up to dozens of minutes) when the DataNode holds a huge number of blocks.

      Since a slight imprecision in dfsUsed is not critical, and the DU value is refreshed every 600 seconds (the default DU interval) once the DataNode has started, we can always read the DU value from the cached file, regardless of whether it is less than 600 seconds old, and skip the DU operation on DataNode startup. This significantly shortens the startup time.
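
      The difference between the current and the proposed startup behavior can be sketched as follows. This is an illustrative sketch only, not the attached patch and not the actual DataNode code: the class name DfsUsedCache, the method loadCachedDfsUsed, and the ignoreAge flag are hypothetical, and the sketch assumes the "dfsUsed" file holds two whitespace-separated numbers (used bytes, then the time the value was written, in milliseconds).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; names are not from Hadoop.
final class DfsUsedCache {
    // 600 seconds, the default DU interval mentioned in the description.
    static final long MAX_AGE_MS = 600_000L;

    /**
     * Reads the cached value from an assumed "<usedBytes> <writeTimeMs>" file.
     * Returns the cached used-bytes value, or -1 if the caller should fall
     * back to running DU.
     *
     * ignoreAge = false models the current behavior (value must be fresh);
     * ignoreAge = true models the proposal (always trust the cached value).
     */
    static long loadCachedDfsUsed(Path file, long nowMs, boolean ignoreAge) {
        try {
            String content =
                new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            String[] parts = content.split("\\s+");
            if (parts.length < 2) {
                return -1; // empty or malformed file
            }
            long used = Long.parseLong(parts[0]);
            long writtenAtMs = Long.parseLong(parts[1]);
            boolean fresh = writtenAtMs > 0 && nowMs - writtenAtMs < MAX_AGE_MS;
            if (used > 0 && (ignoreAge || fresh)) {
                return used;
            }
            return -1; // stale value and age is not ignored
        } catch (IOException | NumberFormatException e) {
            return -1; // missing or unreadable file: fall back to DU
        }
    }
}
```

      With ignoreAge set to true, a stale value is still returned and the expensive DU scan is skipped at startup; the periodic refresh after startup is then expected to correct any drift. A missing or malformed file still falls back to DU in both modes.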

        Attachments

        1. HDFS-8710.001.patch
          2 kB
          Xinwei Qin

              People

              • Assignee:
                xinwei Xinwei Qin
                Reporter:
                xinwei Xinwei Qin
              • Votes:
                0
                Watchers:
                2
