JobHistoryServer can't delete aggregated log files if the remote app log root directory is created by the NodeManager



      If remote-app-log-dir is not created before the YARN processes start, the NodeManager creates it while initializing the AppLogAggregator service. In a custom system, the primary group of the yarn user (which starts the NM/RM daemons) is not hadoop but a more restricted group (say, yarn). When the NodeManager creates the folder, it derives the folder's group from the primary group of the login user (yarn:yarn in this case). The root log folder and all its subfolders therefore belong to the yarn group, which makes them inaccessible to other processes, e.g. the JobHistoryServer's AggregatedLogDeletionService.

      I suggest making this group configurable. If the new configuration is not set, we can keep the existing behaviour.
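
The fallback rule proposed here could look roughly like the sketch below. It uses only the Java standard library and a plain map in place of the real configuration object. The property name `remote-app-log-dir.groupname` and the `resolveGroup` helper are hypothetical, chosen for illustration only, and this is not the actual NodeManager code.

```java
import java.util.Map;

public class RemoteLogDirGroup {
    // Hypothetical property name, used for illustration only.
    static final String GROUP_KEY = "remote-app-log-dir.groupname";

    /**
     * Returns the configured group for the remote log root directory.
     * Falls back to the login user's primary group when the property
     * is unset or blank, which matches the existing behaviour.
     */
    static String resolveGroup(Map<String, String> conf, String primaryGroup) {
        String configured = conf.get(GROUP_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return primaryGroup;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        // Property not set: keep the primary group of the login user.
        System.out.println(resolveGroup(Map.of(), "yarn"));
        // Property set: use the configured group instead.
        System.out.println(resolveGroup(Map.of(GROUP_KEY, "hadoop"), "yarn"));
    }
}
```

With the property unset, the directory would keep the yarn group as it does today. With it set to hadoop, the NodeManager would assign that group when it creates the folder, so the JobHistoryServer could access it.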

      Creating the root app-log-dir manually each time this system is set up is error-prone, and an end user can easily forget it. I think the best place for this step is the LogAggregationService, which is already responsible for creating the folder.


