  1. Hadoop YARN
  2. YARN-10025 Various improvements in YARN log servlets
  3. YARN-10284

Add lazy initialization of LogAggregationFileControllerFactory in LogServlet

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.4.0, 3.3.1
    • Component/s: log-aggregation, yarn
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      Suppose the mapred user has no access to the remote folder. Pinging the JHS every few seconds to check whether it is online will then produce the following entry in the log:

      2020-05-19 00:17:20,331 WARN org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController: Unable to determine if the filesystem supports append operation
      java.nio.file.AccessDeniedException: test-bucket: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: There is no mapped role for the group(s) associated with the authenticated user. (user: mapred)
      	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:204)
      [...]
      	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:513)
      	at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.getRollOverLogMaxSize(LogAggregationIndexedFileController.java:1157)
      	at org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initInternal(LogAggregationIndexedFileController.java:149)
      	at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileController.initialize(LogAggregationFileController.java:135)
      	at org.apache.hadoop.yarn.logaggregation.filecontroller.LogAggregationFileControllerFactory.<init>(LogAggregationFileControllerFactory.java:139)
      	at org.apache.hadoop.yarn.server.webapp.LogServlet.<init>(LogServlet.java:66)
      	at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.<init>(HsWebServices.java:99)
      	at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices$$FastClassByGuice$$1eb8d5d6.newInstance(<generated>)
      	at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
      [...]
      	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
      	at java.lang.Thread.run(Thread.java:748)
      

      We should create the LogAggregationFileControllerFactory instance only when we actually need it, not every time a LogServlet object is instantiated (so definitely not in the constructor). This avoids putting pressure on the S3A authentication side, especially if the authentication request is a costly operation.
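
      The idea can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes and is not the code from the attached patches: ExpensiveFactory and LazyServlet are hypothetical stand-ins for LogAggregationFileControllerFactory and LogServlet, and the construction counter exists only to make the timing of the initialization visible.

```java
import java.util.function.Supplier;

// Small demo: constructing the servlet is cheap, the factory appears on first use.
class LazyInitDemo {
    public static void main(String[] args) {
        LazyServlet servlet = new LazyServlet(ExpensiveFactory::new);
        System.out.println("after constructor: " + ExpensiveFactory.constructions);
        servlet.getOrCreateFactory();
        servlet.getOrCreateFactory();
        System.out.println("after two uses: " + ExpensiveFactory.constructions);
    }
}

// Hypothetical stand-in for an object that is costly to build
// (in the issue above, building it contacts the remote file system).
final class ExpensiveFactory {
    // Counts constructions so the demo can show when initialization happens.
    static int constructions = 0;

    ExpensiveFactory() {
        constructions++;
    }
}

// Hypothetical stand-in for the servlet: it keeps only a recipe for the
// factory and builds the factory the first time it is requested.
final class LazyServlet {
    private final Supplier<ExpensiveFactory> factoryCreator;
    private ExpensiveFactory factory; // stays null until first needed

    LazyServlet(Supplier<ExpensiveFactory> factoryCreator) {
        this.factoryCreator = factoryCreator; // nothing expensive happens here
    }

    // synchronized so that concurrent requests do not build the factory twice
    synchronized ExpensiveFactory getOrCreateFactory() {
        if (factory == null) {
            factory = factoryCreator.get();
        }
        return factory;
    }
}
```

      In this sketch a failed creation leaves the field unset, so the next request retries; whether that is desirable depends on how costly a failing attempt is.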

        Attachments

        1. YARN-10284.branch-3.3.001.patch
          6 kB
          Adam Antal
        2. YARN-10284.004.patch
          6 kB
          Adam Antal
        3. YARN-10284.003.patch
          6 kB
          Adam Antal
        4. YARN-10284.002.patch
          6 kB
          Adam Antal
        5. YARN-10284.001.patch
          3 kB
          Adam Antal

            People

            • Assignee: Adam Antal (adam.antal)
            • Reporter: Adam Antal (adam.antal)
            • Votes: 0
            • Watchers: 4
