Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-4557

With default settings, log aggregation service creates aggregated log dirs with ownership not matching JH server run-as user and group

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 2.0.0-alpha
    • None
    • nodemanager
    • None
    • CDH 4.0.1 (ZK, HDFS, YARN, HBase) managed by Cloudera Manager 4.0.3

    Description

      In order to read aggregated logs, JH server, running as mapred:hadoop by default, tries to access hdfs://<host:port>/tmp/logs/<user>/logs/<appId>/...

      NodeManager runs as yarn:hadoop by default, but creates /tmp/logs initially as user yarn, and group unchanged. E.g., if /tmp as ownership hdfs:supergroup, /tmp/logs will have ownership yarn:supergroup.

      Upon running a job, /tmp/logs/<username> is created by LogAggregationService as the user who submitted the job and leaves the group unchanged, e.g., /tmp/logs/<user> will have ownership <user>:supergroup, and permissions 750.

      Like this, JH server, which runs as user and group mapred:hadoop by default, cannot access the aggregated logs.

      I'm not sure what is a good way of fixing this.

      There does not seem to be a way to fix this behavior through the configuration. While run-as groups can be specified, they do not seem to affect the created directories.

      LogAggregationService should probably use the Nodemanager's run-as user AND group (which default to yarn:hadoop) to create /tmp/logs rather than leave the group unchanged.

      On the other hand, the user and app dirs should better be created with the group unchanged (i.e., hadoop).

      Attachments

        Activity

          People

            Unassigned Unassigned
            martin.gerlach Martin Gerlach
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated: