  1. Hadoop Map/Reduce
  2. MAPREDUCE-4557

With default settings, log aggregation service creates aggregated log dirs with ownership not matching JH server run-as user and group


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels: None
    • Environment: CDH 4.0.1 (ZK, HDFS, YARN, HBase) managed by Cloudera Manager 4.0.3

    Description

      To read aggregated logs, the JobHistory (JH) server, which runs as mapred:hadoop by default, tries to access hdfs://<host:port>/tmp/logs/<user>/logs/<appId>/...

      The NodeManager runs as yarn:hadoop by default, but it initially creates /tmp/logs as user yarn and leaves the group unchanged. For example, if /tmp has ownership hdfs:supergroup, /tmp/logs will have ownership yarn:supergroup.

      When a job runs, LogAggregationService creates /tmp/logs/<user> as the user who submitted the job and again leaves the group unchanged, so /tmp/logs/<user> has ownership <user>:supergroup and permissions 750.

      As a result, the JH server, which runs as user and group mapred:hadoop by default, cannot access the aggregated logs: it is neither the owner nor in the directory's group, and mode 750 grants nothing to others.
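
The access failure described above can be illustrated with a small sketch. This is a simplified model of an owner/group/other permission check, not Hadoop code: it ignores the HDFS superuser and ACLs, and the names `Entry`, `can_read_dir`, and the user `alice` (standing in for <user>) are hypothetical. It also assumes, as the report implies, that mapred is not a member of supergroup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Ownership and mode of one directory (hypothetical model type)."""
    owner: str
    group: str
    mode: int  # permission bits, e.g. 0o750


def can_read_dir(user: str, groups: set, entry: Entry) -> bool:
    """Select the owner, group, or other bits, then require read and execute."""
    if user == entry.owner:
        bits = (entry.mode >> 6) & 0o7
    elif entry.group in groups:
        bits = (entry.mode >> 3) & 0o7
    else:
        bits = entry.mode & 0o7
    # Listing a directory needs both r (4) and x (1).
    return (bits & 0o5) == 0o5


# /tmp/logs/<user> as described in the report: <user>:supergroup, mode 750.
user_dir = Entry(owner="alice", group="supergroup", mode=0o750)

# The JH server runs as mapred with group hadoop (assumed: not in supergroup).
print(can_read_dir("mapred", {"hadoop"}, user_dir))  # False
```

If the same directory had group hadoop instead of supergroup, the group bits (r-x) would apply to mapred and the check would pass.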

      I'm not sure what a good fix would be.

      There does not seem to be a way to change this behavior through configuration. Run-as groups can be specified, but they do not seem to affect the created directories.

      LogAggregationService should probably use the NodeManager's run-as user AND group (which default to yarn:hadoop) to create /tmp/logs, rather than leaving the group unchanged.

      The user and app dirs, on the other hand, should still be created with the group left unchanged, so that they take the group of /tmp/logs (i.e., hadoop).
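
To make the proposal concrete, here is a toy model of how the group propagates. It relies on the HDFS rule that a new directory takes the group of its parent, and it is only a sketch: `mkdir`, `build`, and the user `alice` are hypothetical names, and the explicit group change on /tmp/logs stands in for whatever mechanism would set the NodeManager's run-as group.

```python
def mkdir(tree, path, creator):
    """Create path: the owner is the creator, the group comes from the parent."""
    parent = path.rsplit("/", 1)[0] or "/"
    tree[path] = {"owner": creator, "group": tree[parent]["group"]}


def build(fix_logs_group):
    """Replay the directory creation, optionally with the proposed group fix."""
    tree = {"/tmp": {"owner": "hdfs", "group": "supergroup"}}
    mkdir(tree, "/tmp/logs", "yarn")  # created by the NodeManager
    if fix_logs_group:
        # Proposed behavior: /tmp/logs gets the NodeManager's run-as group.
        tree["/tmp/logs"]["group"] = "hadoop"
    mkdir(tree, "/tmp/logs/alice", "alice")  # created as the submitting user
    return tree


for fixed in (False, True):
    dirs = build(fixed)
    summary = {p: e["owner"] + ":" + e["group"] for p, e in dirs.items() if p != "/tmp"}
    print("fixed" if fixed else "current", summary)
```

In this model, fixing only the group of /tmp/logs is enough: the user directory below it picks up hadoop through inheritance, which matches the suggestion to leave the group unchanged for the user and app dirs.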

          People

            Assignee: Unassigned
            Reporter: Martin Gerlach (martin.gerlach)
