  1. Hadoop YARN
  2. YARN-11057

NodeManager may generate too many empty log dirs when we configure many log dirs


Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.7.7, 3.3.1
    • None
    • None

    Description

        NodeManager may generate too many empty log dirs when many log dirs are configured in NonAggregationLogHandler mode.

        For example, suppose a server has 24 disks and 512 GB of memory, the average container runs for 1 minute, and the average container size is 4 GB. The number of containers running in parallel on one server is then 512 GB / 4 GB = 128. Under the current policy every container creates more than 24 directories, so the total number of directories created in one week is at least 128 * 24 * (60 * 24 * 7) = 30,965,760, not counting the tmp directories.

        This consumes too many inodes on the server and affects the disks' I/O utilization. So many inodes take up a lot of memory in the Linux inode cache. When memory runs short, the cached inodes are evicted, which increases the frequency of disk scans and drives disk I/O utilization up.

        In fact, only one of these directories is used for the container's logs; the others are empty. So we can delete the empty directories when the job is finished, which greatly reduces the number of inodes in use.
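        To make the estimate and the proposed cleanup concrete, here is a minimal Java sketch. It is not the attached patch and not NodeManager code: the class `EmptyLogDirSketch` and both method names are hypothetical, and it assumes the caller already knows the per-disk log directories of a finished container. It uses only the standard `java.nio.file` API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration only; not part of YARN or the attached patch.
final class EmptyLogDirSketch {

    // Lower bound on directories created per week, assuming one directory
    // per configured log dir for every container that runs.
    static long weeklyLogDirs(long memoryGb, long containerGb,
                              long logDirs, long containerMinutes) {
        long concurrentContainers = memoryGb / containerGb;          // 512 / 4 = 128
        long turnoversPerWeek = (60L * 24 * 7) / containerMinutes;   // 10080 for 1-minute containers
        return concurrentContainers * logDirs * turnoversPerWeek;
    }

    // Deletes every directory in the list that has no entries and
    // returns how many were deleted. Non-empty directories are left alone.
    static int deleteEmptyDirs(List<Path> containerLogDirs) throws IOException {
        int deleted = 0;
        for (Path dir : containerLogDirs) {
            if (!Files.isDirectory(dir)) {
                continue; // missing or not a directory: nothing to clean up
            }
            boolean empty;
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                empty = !entries.iterator().hasNext();
            }
            if (empty) {
                Files.delete(dir);
                deleted++;
            }
        }
        return deleted;
    }
}
```

        With the numbers from the description, `EmptyLogDirSketch.weeklyLogDirs(512, 4, 24, 1)` evaluates to 30965760. Checking for emptiness before calling `Files.delete` keeps the one directory that actually holds the container's logs. A real implementation would also need to handle a directory being written to between the check and the delete.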

      Attachments

        1. YARN-11507.0001.patch
          4 kB
          Yao Guangdong

          People

            yaoguangdong Yao Guangdong
            yaoguangdong Yao Guangdong
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: