[MAPREDUCE-4557] With default settings, log aggregation service creates aggregated log dirs with ownership not matching JH server run-as user and group - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 2.0.0-alpha
Fix Version/s: None
Component/s: nodemanager
Labels: None
Environment: CDH 4.0.1 (ZK, HDFS, YARN, HBase) managed by Cloudera Manager 4.0.3

Description

To read aggregated logs, the JobHistory (JH) server, which runs as mapred:hadoop by default, tries to access hdfs://<host:port>/tmp/logs/<user>/logs/<appId>/...

NodeManager runs as yarn:hadoop by default, but it initially creates /tmp/logs as user yarn and leaves the group unchanged, i.e., as inherited from the parent directory. For example, if /tmp has ownership hdfs:supergroup, /tmp/logs will have ownership yarn:supergroup.

When a job runs, LogAggregationService creates /tmp/logs/<username> as the user who submitted the job and again leaves the group unchanged, e.g., /tmp/logs/<user> will have ownership <user>:supergroup and permissions 750.

As a result, the JH server, which runs as mapred:hadoop by default, cannot access the aggregated logs: it is not the owner, its group hadoop does not match supergroup, and permissions 750 grant nothing to other users.
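The failing access can be illustrated with a simplified model of an owner/group/other permission check. This is only a sketch: it ignores the HDFS superuser, ACLs, and parent-directory traversal, and the names `Entry`, `can_list`, and the user `alice` are hypothetical, not part of Hadoop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A directory with POSIX-style ownership and mode bits."""
    owner: str
    group: str
    mode: int  # e.g. 0o750


def can_list(entry, user, groups):
    """Return True if `user` may read and traverse the directory (needs r-x)."""
    if user == entry.owner:
        bits = (entry.mode >> 6) & 0o7   # owner class
    elif entry.group in groups:
        bits = (entry.mode >> 3) & 0o7   # group class
    else:
        bits = entry.mode & 0o7          # other class
    return bits & 0o5 == 0o5


# /tmp/logs/<user> as described in this report; "alice" stands in for <user>.
user_dir = Entry(owner="alice", group="supergroup", mode=0o750)

# JH server runs as mapred with group hadoop: it falls into the "other" class.
jh_can_read = can_list(user_dir, user="mapred", groups={"hadoop"})

# The submitting user owns the directory and can still read it.
owner_can_read = can_list(user_dir, user="alice", groups={"alice"})
```

In this model `jh_can_read` is false only because the directory's group is supergroup; with group hadoop, the same 750 mode would let the JH server in.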

I'm not sure what a good way of fixing this would be.

There does not seem to be a way to fix this behavior through the configuration. While run-as groups can be specified, they do not seem to affect the created directories.

LogAggregationService should probably use the NodeManager's run-as user AND group (which default to yarn:hadoop) to create /tmp/logs rather than leave the group unchanged.

The user and app dirs, on the other hand, are better created with the group left unchanged, so that they pick up the group of /tmp/logs (i.e., hadoop).
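The effect of this proposal can be sketched with a small model, assuming a new directory takes its group from its parent unless one is set explicitly (which matches the behavior observed in this report). The helpers `mkdir` and `jh_can_read` and the user `alice` are hypothetical and exist only for illustration; this is not NodeManager code.

```python
USER_DIR_MODE = 0o750  # permissions of /tmp/logs/<user> as stated in this report


def mkdir(parent, owner, group=None):
    """Model a new directory: the group is inherited unless given explicitly."""
    return {"owner": owner, "group": group if group is not None else parent["group"]}


def jh_can_read(user_dir, jh_groups=frozenset({"hadoop"})):
    """True if the JH server's groups match and the group class has r-x."""
    group_bits = (USER_DIR_MODE >> 3) & 0o7
    return user_dir["group"] in jh_groups and group_bits & 0o5 == 0o5


tmp = {"owner": "hdfs", "group": "supergroup"}

# Current behavior: /tmp/logs inherits supergroup, and so does /tmp/logs/<user>.
logs_now = mkdir(tmp, owner="yarn")
user_now = mkdir(logs_now, owner="alice")

# Proposed behavior: /tmp/logs is created as yarn:hadoop; children still inherit.
logs_fixed = mkdir(tmp, owner="yarn", group="hadoop")
user_fixed = mkdir(logs_fixed, owner="alice")
```

Under these assumptions only the creation of /tmp/logs needs to change; the user and app dirs then end up with group hadoop through inheritance alone.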

People

Assignee: Unassigned

Reporter: Martin Gerlach

Votes: 0

Watchers: 6

Dates

Created: 15/Aug/12 15:42

Updated: 15/Aug/12 15:43