  1. Hadoop YARN
  2. YARN-9634

Make yarn submit dir and log aggregation dir more evenly distributed

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 3.2.0
    • None
    • None
    • None

    Description

      On a large cluster, the directory that users submit jobs to, the directory that container logs are aggregated into, and other related information can fill up an HDFS directory, because an HDFS directory has a default storage limit. This can be addressed by configuring "yarn.log-aggregation.retain-seconds". However, the FSNamesystemLock#writeLock acquisitions and RPC operations triggered by operations on these directories affect the namespace in which the directories are located. To reduce this impact, we have placed these directories in a single HDFS federation namespace, but as the cluster grows, that single namespace also degrades RPC performance. To address this, we can distribute these directories across directories in multiple namespaces, selected by a policy such as a hash policy or a round-robin policy.
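
      The two selection policies mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the proposed patch: the interface `DirSelectionPolicy`, the classes `HashPolicy` and `RoundRobinPolicy`, and the example base directories are all hypothetical names, and each base directory is assumed to be mounted on a different federation namespace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: pick one of several base dirs, each assumed to live in a different namespace. */
final class NamespaceDirPolicies {

    /** Hypothetical policy interface; not an existing YARN API. */
    interface DirSelectionPolicy {
        String choose(List<String> baseDirs, String key);
    }

    /** Hash policy: the same key (for example a user name) always maps to the same base dir. */
    static final class HashPolicy implements DirSelectionPolicy {
        @Override
        public String choose(List<String> baseDirs, String key) {
            // floorMod keeps the index non-negative even when hashCode() is negative.
            int idx = Math.floorMod(key.hashCode(), baseDirs.size());
            return baseDirs.get(idx);
        }
    }

    /** Round-robin policy: ignores the key and cycles through the base dirs in order. */
    static final class RoundRobinPolicy implements DirSelectionPolicy {
        private final AtomicInteger next = new AtomicInteger();

        @Override
        public String choose(List<String> baseDirs, String key) {
            // getAndIncrement is atomic, so concurrent callers get distinct counter values.
            int idx = Math.floorMod(next.getAndIncrement(), baseDirs.size());
            return baseDirs.get(idx);
        }
    }

    public static void main(String[] args) {
        // Example paths are assumptions for illustration only.
        List<String> dirs = Arrays.asList("/ns1/app-logs", "/ns2/app-logs", "/ns3/app-logs");

        DirSelectionPolicy hash = new HashPolicy();
        DirSelectionPolicy roundRobin = new RoundRobinPolicy();

        System.out.println("hash(alice)  -> " + hash.choose(dirs, "alice"));
        for (int i = 0; i < 4; i++) {
            System.out.println("round robin  -> " + roundRobin.choose(dirs, "ignored"));
        }
    }
}
```

      One design consideration, offered as an observation rather than a requirement of this issue: a hash policy is deterministic, so a reader can recompute the directory from the key, whereas a round-robin policy spreads load evenly but would likely need the chosen directory to be recorded somewhere so it can be found later.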

          People

            zhuqi Qi Zhu
            zhuqi Qi Zhu
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated: