Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: security
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Currently the TaskTracker spawns the map/reduce tasks, resulting in them running as the user who started the TaskTracker.

      For security and accounting purposes the tasks should be run as the job-owner.
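For context, the approach the attached patches converge on (per the design doc and the comments below) is a setuid task-controller launcher that the TaskTracker invokes instead of spawning the child JVM itself. A minimal sketch of building such a launch command follows; the binary path, sub-command, and argument order are illustrative assumptions, not the actual Hadoop LinuxTaskController API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the TaskTracker delegates to a setuid "task-controller"
// wrapper that switches to the job owner before exec'ing the task. The binary
// path, sub-command, and argument order here are illustrative assumptions.
class TaskLaunchSketch {
    public static List<String> buildLaunchCommand(String taskController,
                                                  String jobOwner,
                                                  String taskScript) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(taskController); // setuid-root wrapper owned by root
        cmd.add(jobOwner);       // user the wrapper switches to before exec
        cmd.add("launch");       // hypothetical sub-command
        cmd.add(taskScript);     // script that starts the task JVM
        return cmd;
    }
}
```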

      1. hadoop-4490-design.pdf
        796 kB
        Hemanth Yamijala
      2. hadoop-4490-br20-3.patch
        43 kB
        Robert Chansler
      3. hadoop-4490-br20-3.2.patch
        25 kB
        Nigel Daley
      4. hadoop-4490-9.patch
        45 kB
        Sreekanth Ramakrishnan
      5. hadoop-4490-8.patch
        45 kB
        Sreekanth Ramakrishnan
      6. hadoop-4490-7.patch
        45 kB
        Sreekanth Ramakrishnan
      7. hadoop-4490-6.patch
        44 kB
        Sreekanth Ramakrishnan
      8. hadoop-4490-5.patch
        44 kB
        Sreekanth Ramakrishnan
      9. hadoop-4490-4.patch
        40 kB
        Sreekanth Ramakrishnan
      10. HADOOP-4490-3.patch
        40 kB
        Sreekanth Ramakrishnan
      11. HADOOP-4490-2.patch
        40 kB
        Sreekanth Ramakrishnan
      12. hadoop-4490-14.patch
        45 kB
        Sreekanth Ramakrishnan
      13. hadoop-4490-13.patch
        45 kB
        Sreekanth Ramakrishnan
      14. hadoop-4490-12.patch
        45 kB
        Sreekanth Ramakrishnan
      15. hadoop-4490-11.patch
        45 kB
        Sreekanth Ramakrishnan
      16. hadoop-4490-10.patch
        45 kB
        Sreekanth Ramakrishnan
      17. HADOOP-4490-1.patch
        39 kB
        Sreekanth Ramakrishnan
      18. HADOOP-4490-1.patch
        40 kB
        Sreekanth Ramakrishnan
      19. HADOOP-4490.patch
        18 kB
        Hemanth Yamijala
      20. HADOOP-4490.patch
        31 kB
        Hemanth Yamijala
      21. HADOOP-4490.patch
        38 kB
        Hemanth Yamijala
      22. HADOOP-4490.patch
        47 kB
        Hemanth Yamijala
      23. HADOOP-4490.patch
        52 kB
        Hemanth Yamijala
      24. HADOOP-4490.patch
        53 kB
        Hemanth Yamijala
      25. HADOOP-4490.patch
        53 kB
        Hemanth Yamijala
      26. HADOOP-4490_streaming.patch
        0.5 kB
        Mahadev konar
      27. cluster_setup.pdf
        41 kB
        Sreekanth Ramakrishnan
      28. cluster_setup.pdf
        41 kB
        Sreekanth Ramakrishnan
      29. 4490-ydist.patch
        43 kB
        Sreekanth Ramakrishnan

        Issue Links

          Activity

          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Robert Chansler made changes -
          Release Note Added the ability to run tasks as users who submitted the job.
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21. Subtask.
          Sreekanth Ramakrishnan made changes -
          Attachment 4490-ydist.patch [ 12416877 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching new Yahoo! Distribution patch.
          Amar Kamat made changes -
          Attachment 2701949.4490.9.patch [ 12414351 ]
          Amar Kamat made changes -
          Attachment 2701949.4490.9.patch [ 12414351 ]
          Amar Kamat added a comment -

          Attaching an example patch for branch 20; not to be committed.
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue relates to HADOOP-4493 [ HADOOP-4493 ]
          Nigel Daley made changes -
          Attachment hadoop-4490-br20-3.2.patch [ 12410170 ]
          Nigel Daley added a comment -

          Another patch needed if applying this to the 0.20 branch.
          Robert Chansler made changes -
          Attachment hadoop-4490-br20-3.patch [ 12409318 ]
          Robert Chansler added a comment -

          Attached example for earlier version; not to be committed.
          Hudson added a comment -

          Integrated in Hadoop-trunk #811 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/811/ )
          Hemanth Yamijala made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Release Note Added the ability to run tasks as users who submitted the job.
          Resolution Fixed [ 1 ]
          Hemanth Yamijala added a comment -

          I just committed this. Thanks, Sreekanth!
          Hemanth Yamijala added a comment -

          The release audit warning is due to a new public API in FileUtil. This is expected.

          This patch has been tested extensively by hand, and a version of it has also been deployed in a production environment for a while now. There is a plan to develop unit tests as a follow-up.

          Given these, I am committing this patch, as it is blocking some other JIRAs in M/R.
          Sreekanth Ramakrishnan added a comment -

          Attaching output of the ant test-patch:

               [exec] 
               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
               [exec] 
               [exec]     -1 release audit.  The applied patch generated 652 release audit warnings (more than the trunk's current 651 warnings).
               [exec] 
               [exec] 
          

          All core and contrib tests passed on the local machine except TestQueueCapacities, which is being handled in a separate JIRA.

          Sreekanth Ramakrishnan made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-14.patch [ 12405502 ]
          Sreekanth Ramakrishnan added a comment -

          Making the initializeTask call explicit in runChild.
          Hemanth Yamijala added a comment -

          On the check before TaskController.initializeTask() in JvmManager.getTaskForJVM() to ensure there is no double initialization of the same task:

          I think we should move the call to initializeTask into JvmRunner.runChild(). This way it is more explicit why the check is required.

          Other than this, +1. Please make this PA.
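The suggestion above — calling initializeTask from runChild while guarding against double initialization — might look roughly like the sketch below. It is an illustrative stand-in, not Hadoop's actual JvmManager/JvmRunner code:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: initializeTask is called explicitly from runChild, with a guard so
// the same task is never initialized twice (e.g. under JVM reuse).
class RunChildSketch {
    private final Set<String> initialized = new HashSet<String>();
    public int initializeCalls = 0;

    void initializeTask(String taskId) {
        initializeCalls++; // would set up dirs/permissions via the TaskController
    }

    // Initialize before launching, but at most once per task.
    public void runChild(String taskId) {
        if (initialized.add(taskId)) { // add() returns false if already present
            initializeTask(taskId);
        }
        // ... would exec the child JVM here ...
    }
}
```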
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-13.patch [ 12405494 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching patch with the following changes:

          • FileUtil now uses ShellCommandExecutor for chmod operations. It suppresses the IOException thrown while doing chmod and logs it if debug is enabled.
          • A check before TaskController.initializeTask() in JvmManager.getTaskForJVM() ensures there is no double initialization of the same task.
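The first change above — running chmod through a shell executor and suppressing (optionally debug-logging) failures — can be sketched as follows. ChmodSketch is illustrative, not Hadoop's FileUtil/ShellCommandExecutor:

```java
import java.io.IOException;

// Sketch of best-effort chmod: failures are counted and only logged when
// debugging, mirroring the suppress-and-log-if-debug behavior described above.
class ChmodSketch {
    public static boolean debug = false;
    public static int suppressedFailures = 0;

    public static void chmodQuietly(String perm, String path) {
        try {
            Process p = new ProcessBuilder("chmod", perm, path).start();
            if (p.waitFor() != 0) {
                throw new IOException("chmod exited non-zero for " + path);
            }
        } catch (IOException e) {
            suppressedFailures++;            // swallow the failure
            if (debug) {
                System.err.println("chmod failed: " + e.getMessage());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```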
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-12.patch [ 12405395 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching patch fixing an issue found while testing on large clusters when task trackers are re-inited.

          • Changed FileUtil.chmod() to use ShellCommandExecutor instead of building and executing the process directly.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-11.patch [ 12405300 ]
          Sreekanth Ramakrishnan added a comment -

          After an offline discussion with Hemanth and Vinod, the changes to LinuxTaskController for resolving HADOOP-5420 can be addressed in a different JIRA.

          Reverting to the previous version of the patch.
          Sreekanth Ramakrishnan made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Sreekanth Ramakrishnan made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sreekanth Ramakrishnan added a comment -

          Running through Hudson.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-10.patch [ 12405297 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching a new patch that modifies the kill-task part of the previous patch. It passes the configuration value mapred.tasktracker.tasks.sleeptime-before-sigkill to the task-controller binary. The current trunk version of the binary ignores this value. Once the fix for HADOOP-5420 goes in, the binary will sleep for that interval after issuing SIGTERM and then send SIGKILL to the child task.
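The kill sequence described above (SIGTERM, then the configured sleep, then SIGKILL) can be sketched as below. The method is an illustrative assumption; the real logic lives in the task-controller binary once HADOOP-5420 is in:

```java
// Sketch of the graceful-kill plan for a given
// mapred.tasktracker.tasks.sleeptime-before-sigkill value (milliseconds).
class GracefulKillSketch {
    public static String[] killPlan(long sleeptimeBeforeSigkillMs) {
        if (sleeptimeBeforeSigkillMs <= 0) {
            return new String[] { "SIGKILL" }; // no grace period configured
        }
        return new String[] {
            "SIGTERM",                                      // ask nicely first
            "sleep " + sleeptimeBeforeSigkillMs + "ms",     // grace period
            "SIGKILL"                                       // then force-kill
        };
    }
}
```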
          Sreekanth Ramakrishnan made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Sreekanth Ramakrishnan made changes -
          Link This issue blocks HADOOP-5420 [ HADOOP-5420 ]
          Sreekanth Ramakrishnan added a comment -

          The contrib test failures are not related to the patch.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12404914/hadoop-4490-9.patch
          against trunk revision 763247.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/166/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/166/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/166/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/166/console

          This message is automatically generated.

          Sreekanth Ramakrishnan added a comment -

          Attaching output of the ant test-patch:

               [exec] 
               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          
          Sreekanth Ramakrishnan made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-9.patch [ 12404914 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching new patch with documentation changes.
          Hemanth Yamijala added a comment -

          Code changes look fine to me. Looking at the documentation changes.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-8.patch [ 12404708 ]
          Sreekanth Ramakrishnan added a comment -

          • Removing an unused variable in JvmManager.
          Sreekanth Ramakrishnan made changes -
          Attachment cluster_setup.pdf [ 12404706 ]
          Sreekanth Ramakrishnan added a comment -

          Uploading the built PDF documentation for the change made by the patch, for review.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-7.patch [ 12404703 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching patch addressing review comments by Hemanth.
          Hemanth Yamijala added a comment -

          Some comments:

          • Use getLocalJobDir in LinuxTaskController.localizeJob
          • Whenever mkdir or mkdirs fails, we should continue from within the loops.
          • Changes in TaskRunner seem unnecessary.
          • Changes in DistributedCache to pass the baseDir seem unnecessary. Note that localizeCache already takes a CacheStatus object that has the baseDir.
          • This comment is not incorporated: "Modify TaskController.launchTaskJVM to set up permissions for the log dir and the pid dir associated with that task. This will remove the call to initializeTask from the JvmManager.runChild API." I think this can be done by calling setup*FileAccess from launchTaskJVM
          • localizeJob can be called initializeJob
          • In setupTaskCacheFileAccess, we are setting permissions recursively from the job directory. But this is what we do in LinuxTaskController.localizeJob. So we should be setting permissions from the taskCacheDirectory only.
          • writeCommand should ideally check for existence of file before it tries to change permissions in the finally clause.
          • JvmManagerForType.getTaskForJvm() - "Incase of JVM reuse, tasks returned previously launched" - some grammatical mistake here.
          • In the kill part, I think it will be nice to add an info-level log message when we are not doing the kill - both in the JVM manager and in LinuxTaskController.
          • TaskLog.getLogDir() - Make this getUserLogDir(), and the javadoc need not mention TaskControllers. It should be generic documentation stating that it returns the base location for the user logs.
          • mapred-defaults.xml should have the config variable for the task controller along with documentation.

          Comments on documentation:

          • I think we should first describe the use case the task controllers are trying to solve - namely, the requirement to run tasks as the job owner.
          • It would be nice to give a little description of how the LinuxTaskController works - just something like: we use a setuid executable, and the tasktracker uses this executable to launch and kill tasks.
          • We should definitely mention that until other JIRAs like H-4491 etc are fixed, we open up permissions on the intermediate, localized and log files in the Linux TaskController case.
          • Making the executable setuid is a deployment step. It is currently added as a build step.
          • The path to the taskcontroller cfg - mention that this should be the path on the cluster nodes where the taskcontroller.cfg file will be deployed.
          • We should also mention that the LinuxTaskController is currently supported only on Linux. (though it sounds obvious)
          • Should we mention the permissions required for mapred.local.dir and hadoop.log.dir (they should be 777, and the paths leading up to them 755)?
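One of the review points above — that writeCommand should check for the file's existence before changing permissions in the finally clause — can be sketched as below. This is illustrative only, not the actual Hadoop writeCommand:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Sketch: write a command script, then in the finally clause adjust its
// permissions only if the file was actually created.
class WriteCommandSketch {
    public static boolean wroteAndSecured(File script, String contents) {
        boolean secured = false;
        try (FileWriter w = new FileWriter(script)) {
            w.write(contents);
        } catch (IOException e) {
            // creation or write failed; fall through to the guarded finally
        } finally {
            if (script.exists()) { // guard: chmod-equivalent only if file exists
                secured = script.setExecutable(true, true);
            }
        }
        return secured;
    }
}
```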
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-6.patch [ 12404318 ]
          Sreekanth Ramakrishnan added a comment -

          The merge in the last patch was mixed up. Attaching a new patch which corrects the issue.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-6.patch [ 12404223 ]
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-6.patch [ 12404223 ]
          Sreekanth Ramakrishnan added a comment -

          Latest merged patch.
          Sreekanth Ramakrishnan made changes -
          Attachment cluster_setup.pdf [ 12404206 ]
          Attachment hadoop-4490-5.patch [ 12404207 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching the latest patch with the following changes:

          • Merged the patch with the trunk.
          • Added documentation to the cluster_setup page. Created a subsection under Site Configuration to list the task-controller configuration after the real-world cluster configuration. Attaching the cluster setup PDF for checking the output and text of the documentation.
          Sreekanth Ramakrishnan made changes -
          Attachment hadoop-4490-4.patch [ 12403706 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching patch modifying a minor thing: the check of the jvm env member of TaskControllerContext. The JVM env needs to be set for both controllers, so the check has been moved into JvmRunner.
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-4490-3.patch [ 12403432 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching a patch which resolves the NullPointerException raised when the TT is re-inited. The problem was a race condition in which the JvmRunner was issued a kill before it launched the task. The patch now checks whether the initial context is set (i.e., the task has been launched); if not, it merely updates the data structures.
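The race-condition guard described above — if kill() arrives before the task was launched, only update the bookkeeping and issue no external kill — can be sketched as below. Names are illustrative, not Hadoop's JvmRunner:

```java
// Sketch of the guard: with no launch context yet, kill() records state so a
// later launch is skipped, instead of shelling out to the task-controller.
class JvmRunnerKillSketch {
    public Object launchContext = null; // set when the task JVM is launched
    public boolean killed = false;

    public String kill() {
        killed = true; // launch will see this flag and skip starting the JVM
        if (launchContext == null) {
            return "state-updated-only"; // nothing launched, nothing to signal
        }
        return "killTaskJVM"; // would invoke the task-controller binary
    }
}
```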
          Vinod Kumar Vavilapalli added a comment -

          Found a problem while this patch was being tested. When a TT was being re-inited by the JT after it was lost for some time, the TT got the following NPE and crashed completely.

          2009-03-22 22:22:36,155 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
                  at org.apache.hadoop.mapred.LinuxTaskController.buildTaskCommandArgs(LinuxTaskController.java:183)
                  at org.apache.hadoop.mapred.LinuxTaskController.killTaskJVM(LinuxTaskController.java:237)
                  at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.kill(JvmManager.java:401)
                  at org.apache.hadoop.mapred.JvmManager$JvmManagerForType.stop(JvmManager.java:211)
                  at org.apache.hadoop.mapred.JvmManager.stop(JvmManager.java:61)
                  at org.apache.hadoop.mapred.TaskTracker.close(TaskTracker.java:925)
                  at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1815)
                  at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2899)
          
          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-4490-2.patch [ 12402253 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching a patch incorporating Hemanth's comment. Tested the patch; the exception is no longer thrown.

          Hemanth Yamijala added a comment -

          One more change required. I've noticed that when the JvmManager calls killTaskJVM, the current working directory for the LinuxTaskController is set to a task attempt directory. This will fail because that directory may no longer exist. While this is not causing problems yet, since well-behaved JVMs exit on their own, it is not something to rely on.

          The relevant piece of code is this:

              private ShellCommandExecutor buildTaskControllerExecutor(TaskCommands command,
                                                      String userName,
                                                      List<String> cmdArgs, JvmEnv env)
                                                        throws IOException {
          
                ...
                // env.workDir points at the task attempt's working directory,
                // which may already have been deleted when killing the JVM.
                if (env != null) {
                  shExec = new ShellCommandExecutor(taskControllerCmd,
                      env.workDir, env.env);
                } else {
                  shExec = new ShellCommandExecutor(taskControllerCmd);
                }
                ...
              }
          

          Setting env.workDir sets the working directory to that particular task attempt directory, which may no longer exist. I get the following error when this happens:

          2009-03-13 10:39:52,750 WARN org.apache.hadoop.mapred.LinuxTaskController: IOException in killing task: Cannot run program "/path/to/bin/task-controller" (in directory "/path/to/mapred-local/taskTracker/jobcache/job_200903130908_0051/attempt_200903130908_0051_m_001012_1000/work"): java.io.IOException: error=2, No such file or directory
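          A minimal sketch of the guard being discussed, assuming a hypothetical helper (the real fix belongs in buildTaskControllerExecutor): only use the recorded working directory if it still exists; otherwise fall back to null so the child process inherits the parent's working directory.

```java
import java.io.File;

public class WorkDirGuard {
    // Use the requested directory only if it still exists on disk; otherwise
    // return null so the spawned process inherits the parent's working
    // directory instead of failing with "No such file or directory".
    static File safeWorkDir(File dir) {
        return (dir != null && dir.isDirectory()) ? dir : null;
    }
}
```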

          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-4490-1.patch [ 12401611 ]
          Sreekanth Ramakrishnan added a comment -

          Fixed a findbugs warning. Also corrected a mistake I had made in localizeJob in LinuxTaskController.

          Sreekanth Ramakrishnan made changes -
          Attachment HADOOP-4490-1.patch [ 12401597 ]
          Sreekanth Ramakrishnan added a comment -

          Attaching a patch incorporating comments from Hemanth and Arun. The only changes that move away from the above are:

          • No separate permissions are set for pid directories, as they are not used in the current trunk.
          • We assume that the user who runs the task has read and execute permissions along the paths to the jobcache and log directories.
          Hemanth Yamijala added a comment -

          Mahadev, thanks for debugging this issue. The problem was that the permissions being used in LinuxTaskController were making an assumption that files submitted to the cluster, say via streaming jobs, or in distributed cache, will already be executable if they need to be executed as part of the tasks. However, there seem to be scenarios in which this is not mandated, and the framework handles this in the default case (when user tasks are run as the tasktracker itself). The right fix is to modify the permissions without this assumption. If we do that, I guess the change in RunJar is not necessary. One more concern is that the RunJar change would fix the problem only for jar files, but not for other types of archives. We are testing this (with the failed streaming test case) with the modified permissions in LinuxTaskController. If that works, I suggest we just use the new version of the LinuxTaskController itself.

          Again, thanks for debugging this issue !
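          The "modify the permissions without this assumption" idea could be sketched as follows. This is only an illustration in plain java.io (the class and method names are invented; the actual patch works through Hadoop's FileUtil/chmod machinery): recursively make every localized file world-readable and world-executable, writable by the owner only.

```java
import java.io.File;

public class CachePermissions {
    // Recursively open up read/execute for everyone (755-style) under root,
    // so scripts that arrived without the execute bit can still be run.
    // Sketch only; the real code shells out to chmod via FileUtil.
    static void setWorldReadableExecutable(File root) {
        root.setReadable(true, false);   // r for owner, group, other
        root.setExecutable(true, false); // x for owner, group, other
        root.setWritable(true, true);    // w for owner only
        File[] children = root.listFiles();
        if (children != null) {
            for (File child : children) {
                setWorldReadableExecutable(child);
            }
        }
    }
}
```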

          Mahadev konar made changes -
          Attachment HADOOP-4490_streaming.patch [ 12401466 ]
          Mahadev konar added a comment -

          We were running with this patch and had problems with streaming scripts having their permissions set to 666. This patch fixes that by making all unjarred files readable and executable by all.

          Hemanth Yamijala made changes -
          Fix Version/s 0.21.0 [ 12313563 ]
          Arun C Murthy added a comment -

          If it is felt that this should be per TaskController, then we can easily move this to the TaskController API.

          On second thoughts - I'd go there right-away, since it means that we don't need to open it up to 777 for mapred.local.dir if the DefaultTaskController is in effect.

          Introduce a TaskController.initializeJob [...]

          +1

          Modify TaskController.launchTaskJVM to set up permissions [...]

          +1

          Modify TaskController.initializeTask to set up permissions for [...]

          +1

          Modify DistributedCache.localizeCache to set up permissions for the localized files. [...]

          +1

          Hemanth Yamijala added a comment -

          Thanks for the review Arun.

          I had a discussion with Sreekanth about the changes, and we are proposing the following:

          • Introduce a TaskTracker.initializeSystemDirs. This will create $mapred.local.dir/taskTracker/jobCache/, $mapred.local.dir/taskTracker/archives, and $hadoop.log.dir/userlogs on all relevant disks.
            Currently, as per Arun's comments, we'll have this API in TaskTracker, which will be called at Tracker initialization time. If it is felt that this should be per TaskController, then we can easily move this to the TaskController API. I think this may need 777 on the $mapred.local.dir/taskTracker/jobCache/ directory currently because the files would be created both by the task and the tracker - e.g. the task could create the output directories on a new disk which has not yet been touched by the tracker.
          • Introduce a TaskController.initializeJob. This will be called from TaskTracker.localizeJob, with the jobid as parameter. This will set up the access for $mapred.local.dir/taskTracker/jobCache/jobid directories on all disks which have been touched by localization.
          • Modify TaskController.launchTaskJVM to set up permissions for the log dir and the pid dir associated with that task. This will remove the call to initializeTask from the JvmManager.runChild API
          • Modify TaskController.initializeTask to set up permissions for the log dir, pid dir, and task cache dir for the task. There is no need to set up things for the job, because it's been done in initializeJob already. We will need to repeat the permission setting for the log dir and pid dir.
          • Modify DistributedCache.localizeCache to set up permissions for the localized files. We propose to recursively set up 755 permissions (hardcoded) for all files under the $mapred.local.dir/taskTracker/archive/ directory for now. This might repeatedly set up permissions for files that are already correctly setup. However, it will keep things simple. If there's a performance issue, it is easy to address it, by setting it only for the files being localized, and by walking up its parent paths. Please let me know if this seems a bad choice.

          The above changes will mean we can remove:

          • DistributedCacheFileAccessInfo
          • FileUtil.setPermissionsForPathComponents
          • TaskController.cleanup
          • The runningJobs state maintained by LinuxTaskController.

          Arun, does this tie in with your expectations ?
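          The stateless hook API proposed above might be sketched roughly as below. Method names (initializeJob, initializeTask, launchTaskJVM) follow the discussion; the signatures and the recording subclass are invented here purely to illustrate that the controller itself holds no job state and is simply called at the right points by the TaskTracker.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the stateless TaskController hooks under discussion.
abstract class TaskControllerSketch {
    abstract void initializeJob(String jobId);        // called from localizeJob
    abstract void initializeTask(String attemptId);   // per-task setup
    abstract void launchTaskJVM(String attemptId);    // per-JVM setup + launch
}

// Trivial implementation that just records the hook invocations in order.
class RecordingTaskController extends TaskControllerSketch {
    final List<String> calls = new ArrayList<>();
    void initializeJob(String jobId)      { calls.add("initializeJob:" + jobId); }
    void initializeTask(String attemptId) { calls.add("initializeTask:" + attemptId); }
    void launchTaskJVM(String attemptId)  { calls.add("launchTaskJVM:" + attemptId); }
}
```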

          Arun C Murthy made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Arun C Murthy added a comment -

          Some comments after a discussion with Hemanth:

          1. We agree that DistributedCacheFileAccessInfo isn't necessary. We'll wait for HADOOP-4493 to fix access control to the DistributedCache, until which we will just allow requisite access to all files in the cache (755).
          2. FileUtil.setPermissionsForPathComponents and the opening of permissions on mapred.local.dir via LinuxTaskController.setConf bother me. I see 4 different hooks that we need to provide for setup/cleanup:
            • per-tracker (i.e. at TaskTracker initialization e.g. setting up mapred.local.dir)
            • per-job (job jars)
            • per-jvm (task log files)
            • per-task
              Of course we might not need all the cleanup hooks.
          3. The TaskController itself should be stateless, the above hooks should be plugged into at appropriate places e.g. TaskTracker.localizeJob should call TaskController.initializeJob rather than having LinuxTaskController maintaining state as in the patch.
          Hemanth Yamijala added a comment -

          The failed contrib test is in Chukwa and is the same as HADOOP-5172.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12399725/HADOOP-4490.patch
          against trunk revision 741776.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3812/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3812/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3812/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3812/console

          This message is automatically generated.

          Hemanth Yamijala made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hemanth Yamijala added a comment -

          The attached file fixes the streaming test failures. I'd made changes to FileUtil.java to run the chmod command using the ShellCommandExecutor. Previously this was using the Process class. There was no reason to change it, so I moved back to using Process and the tests passed locally.

          I thought ant test runs contrib tests as well, which is why I missed these on the first patch.
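          Running chmod through a plain Process, as the patch reverted to doing, can be sketched like this. The helper name is hypothetical, and this assumes a Unix system with chmod on the PATH; the actual change lives in FileUtil.java.

```java
import java.io.File;
import java.io.IOException;

public class ChmodRunner {
    // Run `chmod <mode> <target>` as an external process and return its exit
    // status (0 on success), or -1 if the process could not be run at all.
    static int chmod(String mode, File target) {
        try {
            Process p = new ProcessBuilder("chmod", mode,
                    target.getAbsolutePath()).start();
            return p.waitFor();
        } catch (IOException | InterruptedException e) {
            return -1;
        }
    }
}
```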

          Hemanth Yamijala made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12399725 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12399627/HADOOP-4490.patch
          against trunk revision 741330.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          -1 release audit. The applied patch generated 821 release audit warnings (more than the trunk's current 819 warnings).

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3806/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3806/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3806/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3806/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3806/console

          This message is automatically generated.

          Hemanth Yamijala made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hemanth Yamijala added a comment -

          ant test passes locally.

          Hemanth Yamijala added a comment -

          I also have to mention that some additional work would be required in the LinuxTaskController to sync up with changes from HADOOP-4759 that was committed yesterday, as it introduced a new task (TaskCleanup task) which would create directories like task-attempt-id-cleanup.

          However, the default path (covered by the DefaultTaskController) is not affected, and hence there will be no regression. I am proposing that apart from this, if other things are fine, we should commit HADOOP-4490 and then in a follow up JIRA fix the changes with respect to HADOOP-4759. Otherwise I am seeing that HADOOP-4490 is becoming a moving target (smile). Does this make sense ?

          Hemanth Yamijala added a comment -

          Merged with trunk. It was broken by a commit made last night. Also removed extraneous logs. Running ant test.

          test-patch gives following output:

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
          [exec]
          [exec] -1 release audit. The applied patch generated 823 release audit warnings (more than the trunk's current 820 warnings).

          The -1 on release audit is because jdiff has generated some changes to public classes and packages (DistributedCache, FileUtil and the filecache package). The release audit seems to be flagging these new jdiff changed files as warnings. I've cross checked the ASF license header is included for all new files I've put up.

          The -1 on tests is as explained above.

          There's no change in functionality to the last patch I uploaded. Only a merge.

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12399627 ]
          Hemanth Yamijala added a comment -

          Attached a hopefully final version of the patch. This one is extensively tested and fixes a couple of bugs related to incorrect assumptions about multiple mapred local directories. Thanks to Sreekanth and Amar for help in testing this. We've run randomwriter, sort with and without JVM reuse, and also streaming jobs using the distributed cache.

          The test-patch results are showing a -1 on release audit which I've written to core-dev about. I am not sure why a -1 is coming, will continue to debug that. There's also a -1 on tests. It is difficult to write unit tests for this patch since it requires support for multiple users.

          There are a lot of log statements in the patch which should be removed before commit. I am attaching this with the hope that someone can take a look at the changes. Arun ?

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12399574 ]
          Hemanth Yamijala added a comment -

          I've updated the patch to trunk, incorporating most of Arun's comments above. Arun, can you please take a look.

          We should use mapred.local.dir instead of hadoop.tmp.dir in LinuxTaskController.

          Done.

          Use Path's methods instead of String manipulation for all path-related manipulations.

          Done.

          Pass mode, user/group to DistributedCache rather than rely on the newly introduced DistributedCache.isFreshlyLoaded which is then unnecessary.

          Done. I've added a new overloaded API that passes the information to DistributedCache. Just to keep options open, I've defined a new public class DistributedCacheFileAccessInfo - a simple class that can be used to define permissions and ownership information for localized files in DistributedCache. Can you take a specific look at this, and let me know if this looks OK ?

          Move setting up of JVM-specific files e.g. task's log directory to TaskController.launchJVM

          I've not done this one. It was not very clear what information is necessary at launch time. For example, if there are some localized files under the task cache directory that need to be loaded at launch time, we'll need permissions for these also. In general, it seemed a little risky to launch the JVM without giving full access to all jars etc., even if the Task will start running only later. So, I've left this as is. I think the main concern here was about the special check I had in JvmManager where I was avoiding setting the permissions again when getting the task to launch. This seems a simple enough check, and I've documented the rationale in code. Can you verify this again, and let me know your thoughts?

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12399228 ]
          Arun C Murthy added a comment -

          Hemanth, this is looking good.

          Some comments:

          1. We should use mapred.local.dir instead of hadoop.tmp.dir in LinuxTaskController.
          2. Use Path's methods instead of String manipulation for all path-related manipulations.
          3. Pass mode, user/group to DistributedCache rather than rely on the newly introduced DistributedCache.isFreshlyLoaded which is then unnecessary.
          4. Move setting up of JVM-specific files e.g. task's log directory to TaskController.launchJVM.
          Hemanth Yamijala added a comment -

          The latest patch also adds correct permissions to files localized as part of the distributed cache. In order to do this, I introduced a new API in distributed cache to indicate whether the files were localized freshly (i.e. as part of the current task's localization), or whether they are already existing in cache. I use this API to avoid setting permissions on the same cache files repeatedly for each task. If there's a better way to do this, I would be glad to know that.

          All changes made in the previous patch still hold good, except for minor refactoring.

          This patch is now complete to the best of my knowledge. Please do offer your comments.

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12397754 ]
          Hemanth Yamijala added a comment -

          I attached a new patch that is more comprehensive. All changes from the previous patch still hold good. This one adds the correct permissions for all relevant files and directories, except distributed cache.

          The previous patch only set relevant permissions on the task and log cache directories for all users, with the intent that tasks running as any user should be able to create and use other files and directories under them. This requirement still applies. However, there are other files and directories whose access needs to be adjusted too. The new patch addresses these changes:

          • It sets permissions on the job related jar files and other directories allowing access to everyone.
          • It sets read and execute permissions on directory paths leading up to the task / job cache and log directories. For example, if a task cache directory is created under ${mapred.local.dir}/taskTracker/jobcache, all paths in this hierarchy are given read and execute (but not write) access for all users. This is required for looking up paths and locating / reading files created by the tasktracker.

          Both of the changes above will be required in future as well, except that then the permission string will be more restrictive (disallowing access to group and others).

          The previous patch was working because of a subtle behavior of setuid. On the systems where we tested, the umask was set such that read and execute permissions were granted to the group by default. So, any of the job files created by the tasktracker had read and execute access for the group to which the tasktracker user belonged. When the setuid executable switches users, it does not clear the supplementary group information of the launcher. Hence, the new process running as the job owner still had access to the groups to which the tasktracker belonged, and therefore worked. Again, in HADOOP-4491, we propose to remove all access for the group ownership as well, so this will not be an issue.

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12397174 ]
          Hemanth Yamijala made changes -
          Attachment hadoop-4490-design.pdf [ 12396649 ]
          Hemanth Yamijala added a comment -

          A lot of discussion has happened on various comments in this JIRA. The attached document collates all of them. I hope this will make it easier to follow the approach and review the changes.

          Hemanth Yamijala added a comment -

          The attached patch implements changes in the tasktracker to launch tasks using the setuid executable defined in HADOOP-4930. By doing so, it runs tasks as job owners. The CLI for the setuid exe is:

           task-controller <user-name> <command-enum-value> <job-id> <task-id> <tasktracker-root>
          
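          As a rough illustration of how the executable might validate the <command-enum-value> argument from the CLI above, here is a minimal sketch. The command names and numeric codes are assumptions for illustration only; the real enum and its values are defined by the HADOOP-4930 executable and may differ.

          ```c
          #include <stdlib.h>

          /* Hypothetical command codes; the real enum lives in the
           * HADOOP-4930 task-controller and may differ. */
          enum tt_command { RUN_TASK = 0, KILL_TASK = 1 };

          /* Parse the numeric <command-enum-value> argument from the command
           * line. Returns the command code, or -1 for unknown/bad input. */
          int parse_command(const char *arg) {
              char *end;
              long v;
              if (arg == NULL || *arg == '\0') return -1;
              v = strtol(arg, &end, 10);
              if (*end != '\0') return -1;          /* trailing junk, e.g. "1x" */
              if (v != RUN_TASK && v != KILL_TASK) return -1;
              return (int) v;
          }
          ```

          Rejecting anything outside the known command set keeps the setuid binary's surface area as small as possible, in line with the goal stated later in this JIRA.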

          As mentioned in comments above, this patch only handles launching and killing of tasks, and does not handle file and directory permissions securely. In fact, it opens up the permissions so that both the tasktracker and task can share files and directories. However, this change is only done when the feature is enabled, and does not affect the default Hadoop behavior. When HADOOP-4491 and other issues are fixed, secure permissions will be replaced.

          The changes in the patch include:

          • A TaskController class that defines abstract methods for launching and killing tasks
          • A DefaultTaskController where a little code from JvmManager has been moved
          • A LinuxTaskController which implements the methods by calling the setuid executable of HADOOP-4930.
          • A new configuration variable mapred.task.tracker.task-controller to define the specific type of TaskController to use. Defaults to DefaultTaskController.

          Tested this on a single node cluster, along with the setuid executable of HADOOP-4930. Will follow-up with testing on larger clusters.

          I request a review for the same.

          Hemanth Yamijala made changes -
          Attachment HADOOP-4490.patch [ 12396648 ]
          Hemanth Yamijala added a comment -

          It would also be good to have the task tracker root directories in a separate config file that can be owned by root.

          We are taking care of this point in the setuid executable. One question is how the location of this secure config file will be made known to the executable. Our options are:

          Option 1: Read from the environment variable HADOOP_CONF_DIR
          Option 2: Take a command line option to specify the location of the file.
          Option 3: Have it as a build time configuration parameter, and encode into the executable (like for instance, pass it as an autoconf option).

          Options 1 and 2 may allow users to launch the executable pointing to some custom path. Option 3 would completely avoid this, and make it more secure.

          For the sake of deployment, I think the setuid executable should be built using a separate ant target, as it would need to be setup as owned by root etc. So, maybe it is easy to do Option 3 in that case. If we decide to go with one of the other two options, we should mandate additional checks to make sure that the configuration file is owned by the root user, as Owen mentioned.
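          If we go with Option 1 or 2, the additional check mentioned above could look roughly like this: reject any configuration file that is not owned by root, or that is writable by group or others. This is a sketch of the policy under discussion, not committed behavior.

          ```c
          #include <sys/stat.h>

          /* Return 1 only if the config file is owned by root and not writable
           * by group or others; return 0 (reject) on any doubt, including a
           * stat failure. */
          int config_file_is_secure(const char *path) {
              struct stat st;
              if (stat(path, &st) != 0) return 0;   /* missing/unreadable: reject */
              if (st.st_uid != 0) return 0;         /* must be owned by root */
              if (st.st_mode & (S_IWGRP | S_IWOTH)) return 0; /* no group/other write */
              return 1;
          }
          ```

          Failing closed on stat errors means a mislaid config file can never silently widen what the executable is willing to do.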

          Any comments?

          Hemanth Yamijala made changes -
          Link This issue is blocked by HADOOP-4930 [ HADOOP-4930 ]
          Hemanth Yamijala made changes -
          Link This issue is blocked by HADOOP-2721 [ HADOOP-2721 ]
          Hemanth Yamijala added a comment -

          I had an offline discussion with Sameer about how to get this patch in. To make it easier to review, it may make sense to split the task up into multiple sub-tasks. At least three have been identified:

          • Launch and kill tasks (this would involve RUN_TASK and KILL_TASK commands)
          • Handle local data securely (this would involve SETUP_TASK and MOVE_TASK_OUTPUT and CLEANUP_TASK commands)
          • Handle distributed cache.

          In order to get a working launch and kill tasks patch though, the file and directory permissions will need to be opened up to allow access to all users. Each of the other patches will make it more secure.

          Please note that we have already discussed in this JIRA the approach for handling directory and file permissions (such as intermediate outputs). This proposal is only to make it simpler to get some incremental patches in. Would this work? If yes, I will use this JIRA to handle the first of the three tasks, then use HADOOP-4491 and HADOOP-4493 for the others.

          Hemanth Yamijala added a comment -

          Thanks, Owen. I will take care of that.

          Some discussion is required for handling the distributed cache (HADOOP-4493). Currently, files from the distributed cache are not localized per job. Since anything can be passed through the distributed cache, I think it should support the same level of access control as the rest of the files. That is, cache files should be localized per job and subject to the same access control mechanisms we are using for the rest of the files - like output directories etc.

          I don't think this is a big impact for users as they can't assume the cache to contain the files they want on the nodes where the task is running. However, from the system perspective, probably if a lot of users (say working on the same project that requires the same data files) want to share this but across multiple jobs, we would be copying only once per node, saving both space and time. If we modify this to be localized per job, we could lose that advantage, no ?

          Any thoughts on this trade-off?

          Owen O'Malley added a comment -

          That sounds reasonable. Please ensure that the pid file is owned by the user given on the command line and has permissions of 600. This avoids someone leaving this file writable and having someone point it at a different process.
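          Creating the pid file with mode 600 as suggested could be sketched as follows. The fchown to the user given on the command line is only noted in a comment (it needs privilege and a uid lookup), and the explicit fchmod guards against a permissive umask; the path and layout are illustrative assumptions.

          ```c
          #include <fcntl.h>
          #include <stdio.h>
          #include <sys/stat.h>
          #include <unistd.h>

          /* Create the pid file exclusively with mode 600 and write the JVM's
           * pid into it. Returns 0 on success, -1 on any failure. */
          int write_pid_file(const char *path, pid_t pid) {
              int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
              FILE *out;
              if (fd < 0) return -1;               /* already exists, or no access */
              fchmod(fd, S_IRUSR | S_IWUSR);       /* force 600 regardless of umask */
              /* In the real executable we would also fchown(fd, ...) here so
               * the file is owned by the user given on the command line. */
              out = fdopen(fd, "w");
              if (out == NULL) { close(fd); return -1; }
              fprintf(out, "%d\n", (int) pid);
              return fclose(out) == 0 ? 0 : -1;
          }
          ```

          O_EXCL makes the create fail if the file already exists, which closes exactly the "someone points it at a different process" hole Owen describes.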

          Hemanth Yamijala added a comment -

          I have a version now that runs with JVM reuse enabled as well. The main change to get this to work was to correct the way I was figuring out the current task ids in the JvmManager class.

          Also added a KILL_TASK command. This will look as follows:

          KILL_TASK user_name job_id task_attempt_id

          This will be called from JvmRunner.kill(), which in turn is called whenever a TT gets a kill task action. Since the JVM process is running as the job owner (different from the TT), we can't directly destroy the JVM process. Instead, what I've done is the following:

          • Write a (hidden) .pid file into the task directory when a task is executed. This is owned by the job owner and not readable by anyone else. The pid file contains the JVM's pid.
          • When the JVM needs to be killed, we call the taskcontroller executable with the job_id and task_id.
          • The taskcontroller drops privileges to the job owner, then reads the pid file and gets the pid of the jvm.
          • Then the taskcontroller issues a kill(pid, SIGTERM) to kill the jvm.
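          The four steps above could be sketched like this in the taskcontroller. Error handling is simplified, and the pid-file path is assumed to have been built already; this illustrates the flow rather than the committed implementation.

          ```c
          #include <signal.h>
          #include <stdio.h>
          #include <unistd.h>

          /* KILL_TASK flow: drop privileges to the job owner, read the JVM pid
           * from the task's .pid file, then send SIGTERM. Returns 0 on success. */
          int kill_task_jvm(const char *pid_file, uid_t job_owner_uid) {
              FILE *f;
              int pid;
              if (setuid(job_owner_uid) != 0) return -1;  /* drop to job owner */
              f = fopen(pid_file, "r");                   /* readable only by owner */
              if (f == NULL) return -1;
              if (fscanf(f, "%d", &pid) != 1) { fclose(f); return -1; }
              fclose(f);
              if (pid <= 0) return -1;                    /* never signal 0 or -1 */
              return kill(pid, SIGTERM);
          }
          ```

          Dropping privileges before the kill means the signal can only reach processes the job owner could signal anyway, so a corrupted pid file cannot be used against the tasktracker or other users' JVMs.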

          Any concerns with this approach?

          Currently other than distributed cache, all other aspects of the task life cycle are functioning. I'll probably upload a single writeup (as Arun had done for HADOOP-4348) that will capture all the information in comments above for easy reference.

          And of course, follow that up with the first patch. smile

          Hemanth Yamijala added a comment -

          I had an offline discussion with Devaraj regarding the implementation, and we also went over the impact this would have when clubbed with JVM reuse.

          A few comments from him that I am documenting here:

          • Task directories under the tasktracker system or root directory, to which files (such as intermediate outputs) are copied after task completion, should be on the same disk as the original user's task directories. This is to prevent cross-disk copies.
          • Regarding the problem of serving log outputs which I've mentioned here, we discussed one approach could be to have a command in the executable to read the data and return to the TaskLogServlet on demand. This would happen reasonably rarely and does not affect any other functionality. Hence it seems like the performance overhead can be ignored.
          • Another comment was to reduce the number of times the executable is launched. For example, without JVM reuse, I can set up the directories, run the task, and then move the outputs with a single launch of the executable. This is possible because all actions are per task, and there is one JVM per task. Hence the lifecycle of the task fits well with the setuid changes.

          With JVM reuse though, the last point becomes problematic. We can easily set up the directories and move the output before and after the task, but each of those needs a separate launch of the executable - three launches in total. The performance impact this would have (and whether it would offset the advantage of JVM reuse) is something to measure.

          Hemanth Yamijala added a comment -

          CREATE_TASK_DIR owen task_20080101_0001_m_000001_1

          Owen, sure this makes sense. Just one point is that I might need the job id along with the task id. But that's still within the same spirit.

          I will also make the other changes you have suggested like blocking root user.

          I've also added a CLEANUP_TASK_DIR command to the executable which is now able to cleanup the directories after task is completed. This is called from the CleanupQueue thread in task tracker.

          Allen Wittenauer added a comment -

          The user who submits the job should be the user who runs the code on the compute nodes, due to issues that surround the environment outside Hadoop. For example, it is possible to submit a job that writes junk data to the low-priv user's home dir. Without tracking who submitted that job, ops would never know who to go bonk on the head.

          ... and then there is streaming.

          I can think of instances where it might be useful to have generic accounts run stuff. In those instances, it is still much better to have that handled outside Hadoop. [Either through setuid scripts, roles, sudo, kinit a special keytab prior to job submit, whatever.] Let the OS/tool/ops team/whatever deal with the accounting in those situations.

          Doug Cutting added a comment -

          > Steve: have some low-privilege user for running work; there isn't a 1:1 mapping of grid users to user accounts
          > Owen: running as the real user is important. If I run a job, I should not be able to look at or kill your job's data or tasks

          Might it be possible to have a pool of low-privileged users, to remove the requirement that every user has an account on every machine? Or maybe that requirement's not that onerous, with PAM/LDAP?

          Owen O'Malley added a comment -

          Hemanth,
          This sounds good, but from a security standpoint, I think that it would be better to make the tasks more specific. So something like:

          CREATE_TASK_DIR owen task_20080101_0001_m_000001_1
          MOVE_TASK_OUTPUT owen task_20080101_0001_m_000001_1
          REMOVE_TASK_DIR owen task_20080101_0001_m_000001_1

          It would also be good to have the task tracker root directories in a separate config file that can be owned by root. My goal is to make this executable as limited as possible. It should also block root as the user. What I do not want to see is having this work:

          MOVE_FILES root /tmp/foo /etc/passwd
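          Blocking root as Owen asks could be as simple as refusing uid 0 up front, before any command is dispatched. A sketch, where the exact rejection policy (reject by name and by resolved uid, and reject unknown users) is an assumption:

          ```c
          #include <pwd.h>
          #include <string.h>

          /* Refuse root, both by name and by resolved uid, and refuse unknown
           * users. Intended to be called before any command is executed. */
          int user_is_allowed(const char *username) {
              struct passwd *pw;
              if (username == NULL || strcmp(username, "root") == 0) return 0;
              pw = getpwnam(username);
              if (pw == NULL) return 0;   /* unknown user: reject */
              return pw->pw_uid != 0;     /* any alias of uid 0 is also rejected */
          }
          ```

          Checking the resolved uid rather than only the literal string "root" closes the hole where an alternate account name maps to uid 0.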

          Owen O'Malley added a comment -

          From what I've heard, running under a security manager kills performance, which is pretty much a non-starter. Especially given that we need unix-level security anyways. In our environment, running as the real user is important. If I run a job, I should not be able to look at or kill your job's data or tasks, even if we are sharing a machine. Of course this feature needs to be optional, since:
          1. It requires that all cluster users have accounts on all slave nodes in the cluster
          2. It requires native code that may not work on all platforms
          3. It requires root access, which not all Hadoop admins have.

          steve_l added a comment -

          Hemanth - you are right, for streaming/pipes a second identity is needed. What some of the grid toolkits have done in the past is have some low-privilege user for running work; there isn't a 1:1 mapping of grid users to user accounts. Instead the worker is allowed access to the relevant files of a user for a while, then at the end of the job that data goes away. This eliminates some of the account management problems, though it forces you to make sure that the worker doesn't have access to any old/shared data on the same filesystem.

          Hemanth Yamijala added a comment -

          I have been able to make some progress and get a wordcount job to run as the job submitter. The design follows the basic approach mentioned above, minus the plugin abstraction, which I still need to create.

          • Created a setuid C executable.
          • This executable currently takes the following commands:
            • SETUP_DIRS <list of directories>:
              This command sets up task specific directories to be owned by the user. The general approach I followed for handling directory permissions is that the root directories, such as hadoop.tmp.dir/mapred/local/taskTracker/jobcache/jobid would be owned by the tasktracker daemon, which creates task directories under it when needed. Then the taskcontroller exe will change the ownership and permissions of the task directory and sub folders to the user.
            • RUN_TASK <path to a file containing the M/R task to execute>
              The file is a temp file created under the user's work directory itself - executable by the user
            • MOVE_FILES <source directory> <destination directory>
              This command is used to copy the intermediate output and task logs from the task directories to a system specific directory owned by the daemon. The servlets serving this data are modified to read from the system specific directory.
          • These are called from the JvmManager class at appropriate places.

          A couple of things came up when doing this:

          • Task logs: Currently task logs can be viewed when the task is still executing. Further the task logs are read by the TaskLogServlet, which is running in the daemon context. We want the task logs to be owned by the user. I still need to figure out how to achieve this. Currently, I am only able to access task logs after they are done, by executing the MOVE_FILES command. Any ideas are welcome.
          • JVM Reuse: Currently, I've only handled this with one JVM per task. Need to check the approach when JVM reuse is in the picture.
          • Still need to work on cleanup and kill actions, as well as the distributed cache.

          The code I have is very raw and needs lots of polishing even as a first draft. Will try to do so in a couple of days. Any comments on the approach so far ?
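To make the SETUP_DIRS step above concrete - handing a daemon-created task directory over to the job owner - here is a hedged C sketch. The helper names (`setup_task_dir`, `hand_over`) and the owner-only 0700 mode are my assumptions for illustration, not the actual taskcontroller code:

```c
/* Sketch of a SETUP_DIRS-style recursive ownership handover.
 * Helper names are hypothetical; the real taskcontroller may differ. */
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static uid_t target_uid;
static gid_t target_gid;

/* nftw callback: chown every entry to the job owner and restrict it. */
static int hand_over(const char *path, const struct stat *sb,
                     int typeflag, struct FTW *ftwbuf) {
    (void)sb; (void)typeflag; (void)ftwbuf;
    if (chown(path, target_uid, target_gid) != 0)
        return -1;                /* abort the tree walk on failure */
    /* Owner-only access; the exec bit keeps directories traversable
     * and the task launch script runnable by the user. */
    return chmod(path, S_IRWXU);
}

/* Returns 0 on success, -1 on any failure. */
int setup_task_dir(const char *task_dir, uid_t uid, gid_t gid) {
    target_uid = uid;
    target_gid = gid;
    /* FTW_PHYS: do not follow symlinks, so a task cannot plant a link
     * that tricks the setuid binary into chowning an unrelated file. */
    return nftw(task_dir, hand_over, 16, FTW_PHYS);
}
```

The FTW_PHYS flag matters here: following symlinks from inside a user-writable tree would let a malicious task redirect the chown at arbitrary files.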

          Hemanth Yamijala added a comment -

          Steve, I do agree that the security manager approach is simpler and portable. However, I think the requirement is also to support features like streaming. Given that, I think the security manager approach would not work. Am I right ?

          Also, I agree that authentication and authorization are a prerequisite for this. They are being handled by the other tasks under the jira HADOOP-4487. Here, I am focussing only on the mechanisms to make tasks run as the users who submitted the jobs - a small part of the larger framework.

          steve_l added a comment -

          You are about to take on one of the big problems they hit in the grid world: identity. All the grid tools (Condor, Platform, etc.) have put a lot of effort in at the OS level to create new users on target machines, manage the disk and CPU usage limits of that user, and so on. But you also need to propagate identity over the wire, which gets you into SAML and other things. Right now the JobTracker trusts you to be who you say you are, so caller authentication would be a prerequisite to doing back-end user switching.

          If you are interested in running pure Java apps under different rights, this could be done via a security manager. Every task would be started with an explicit security manager/policy that limited what it could do, file and network operations would be checked against the policy. This would be portable and easier to test. It also eliminates the need to run the TT as root, to keep the unix user database in sync with the hadoop user list, etc.

          Hemanth Yamijala made changes -
          Fix Version/s 0.20.0 [ 12313438 ]
          Hemanth Yamijala added a comment -

          Resetting the fix version, as this will not make the feature freeze.

          Andrzej Bialecki added a comment -

          What about Cygwin / Windows users?

          Hemanth Yamijala added a comment -

          Thanks for the comments, Owen.

          It should be written in C, not Java to ensure it has enough access to the platform to actually be secure. In particular, it has to clear both real and effective user ids.

          Yes, I had that in mind. Specifically, I was planning to do something like setuid(getpwnam(user_name)->pw_uid). Since this would be done by a program running as superuser (the setuid exe), it would clear both the real and effective uids.

          I'd like to see the proposed list of commands for the setuid program.

          Sure, I will work on that and post the list here. In order to be reasonably complete, I think I should have a version that's working. So, I will start prototyping on the lines I described above.
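The setuid(getpwnam(user_name)->pw_uid) call discussed above needs some care around ordering: supplementary groups and the gid have to be dropped before the uid, otherwise the process keeps root's group memberships. A hedged sketch of the full drop (the helper name `become_user` is hypothetical):

```c
/* Sketch of a complete privilege drop for the setuid taskcontroller.
 * Helper name is illustrative, not from the actual patch. */
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 0 on success, -1 on failure. Never switches to root. */
int become_user(const char *user_name) {
    struct passwd *pw = getpwnam(user_name);
    if (pw == NULL) {
        fprintf(stderr, "unknown user: %s\n", user_name);
        return -1;
    }
    if (pw->pw_uid == 0) {
        fprintf(stderr, "refusing to run tasks as root\n");
        return -1;
    }
    /* Drop gid and supplementary groups first, while still root. */
    if (setgid(pw->pw_gid) != 0) return -1;
    if (initgroups(user_name, pw->pw_gid) != 0) return -1;
    /* setuid() called as root clears real, effective and saved uids,
     * so the task cannot switch back afterwards. */
    if (setuid(pw->pw_uid) != 0) return -1;
    return 0;
}
```

As the comment above notes, because the caller is root at this point, setuid() clears all three uids at once; a seteuid()-only drop would leave the saved uid as root and be reversible.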

          Owen O'Malley added a comment -

          +1 for a setuid program.

          It should be written in C, not Java to ensure it has enough access to the platform to actually be secure. In particular, it has to clear both real and effective user ids.

          I'd like to see the proposed list of commands for the setuid program.

          No user-specified strings should be included on the command line, to avoid special character attacks.

          I agree with Sameer that we should have very tight permissions on the map output and task directories. One of the subcommands should probably be to move the outputs from somewhere like $task/output to somewhere like $tt/output/$job/$task.

          Having a plugin that lets us switch between the current pure-java implementation that doesn't change user ids and a setuid implementation sounds reasonable. We should continue to support the non-user-switch by default for clusters run by a single non-root user.

          Hemanth Yamijala added a comment -

          I had some offline discussions with Arun and Sameer and here are some initial thoughts on the approach. A lot of details still need to be fleshed out, but I am posting this to get some early feedback.

          We do want to run the daemons as non-privileged users, and yet go with a setuid based approach to run tasks as a regular user. One approach that was proposed to do this is as follows:

          • We create a setuid executable, say a taskcontroller, that will be owned by root.
          • This executable can take the following arguments - <user> <command> <command arguments>.
          • <user> will be the job owner.
          • <command> will be an action that needs to be performed, such as LAUNCH_JVM, KILL_TASK, etc.
          • <command arguments> will depend on the command. For e.g. LAUNCH_JVM would have the arguments currently used to launch a JVM via the ShellCommandExecutor.
          • The tasktracker will launch this executable with the appropriate command and arguments when needed.
          • As the executable is a setuid exe, it will run as root, and will quickly drop privileges using setuid, to run as the user.
          • Then the arguments will be used to execute the required action, for e.g. launching a VM or killing a task.
          • Before dropping privileges, if needed, the executable could set up directories with appropriate ownership, etc.
          • Naturally this would be platform specific. Hence, we can define a TaskController class that defines APIs to encapsulate these actions. For e.g., something like:
            abstract class TaskController {
            
              abstract void launchTask();
              abstract void killTask(Task t);
              // etc...
            }
            
          • This could be extended by a LinuxTaskController, which converts the generic arguments into something that can be passed to the executable - for e.g. maybe a process ID.
          • One specific point is about the directory / file permissions. Sameer was of the opinion that the permissions should be quite strict, that is, world readable rights are not allowed. There are cases where the task as well as the daemon may need to access files. To handle this, one suggestion is to first set the permissions to the user, and then change the ownership to the daemon after the task is done.

          The points above specify a broad approach. Please comment on whether this seems reasonable, reasonable in parts, or completely way off the mark :-) Based on feedback, I would start implementing a prototype to flesh out the details.
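For illustration, the <user> <command> <command arguments> parsing such a taskcontroller might perform could look like the sketch below. The enum values and helper names are illustrative, only LAUNCH_JVM/KILL_TASK from the proposal are shown, and the username character whitelist is my assumption, addressing the "no user-specified strings with special characters" concern:

```c
/* Sketch of argv validation for a setuid taskcontroller. */
#include <string.h>

enum command { CMD_UNKNOWN = -1, CMD_LAUNCH_JVM, CMD_KILL_TASK };

/* Map a command-line word onto a fixed whitelist; anything not on the
 * list is rejected outright, so no arbitrary action can be named. */
enum command parse_command(const char *word) {
    if (strcmp(word, "LAUNCH_JVM") == 0) return CMD_LAUNCH_JVM;
    if (strcmp(word, "KILL_TASK") == 0)  return CMD_KILL_TASK;
    return CMD_UNKNOWN;
}

/* Sanity-check the user argument before any getpwnam() call: reject
 * root and any name containing shell metacharacters. Returns 1 if ok. */
int user_arg_ok(const char *user) {
    if (user == NULL || *user == '\0') return 0;
    if (strcmp(user, "root") == 0) return 0;
    return strspn(user, "abcdefghijklmnopqrstuvwxyz"
                        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                        "0123456789._-") == strlen(user);
}
```

Whitelisting both the command and the username up front keeps the executable as limited as possible: the setuid binary never interprets user-controlled strings beyond an exact-match lookup.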

          Craig Macdonald added a comment -

          I think that (2) depends on how (1) is proposed to be addressed. If you assume that (1) is addressed by using seteuid() or the su command such that processes actually run on the system as the appropriate user, then (2) is extremely difficult without being run as root.

          If (1) is addressed just by setting the UGI in some way, then this has disadvantages compared to the seteuid/su approach - which facilitates secured access to non-HDFS resources (e.g. NFS in smaller environments).

          Hemanth Yamijala added a comment -

          Before beginning discussions on approach, I wanted to summarize my understanding of this task, and also start discussion on a few points that I have some questions on.

          The following are some salient points:

          1. We want to run tasks as the user who submitted the job, rather than as the user running the daemon.
          2. I think we also don't want to run the daemon as a privileged user (such as root) in order to solve this requirement. Right ?
          3. The directories and files used by the task should have appropriate permissions. Currently these directories and files are mostly created by the daemons, but used by the task. A few are used/accessed by the daemons also. Some of these directories and files are the following:
            1. mapred.local.dir/taskTracker/archive - directories containing distributed cache archives
            2. mapred.local.dir/taskTracker/jobcache/$jobid/ - Include work (which is a scratch space), jars (containing the job jars), job.xml.
            3. mapred.local.dir/taskTracker/jobcache/$jobid/$taskid - Include job.xml, output (intermediate files), work (current working dir) and temp (work/tmp) directories for the task.
            4. mapred.local.dir/taskTracker/pids/$taskid - Written by the shell launching the task, but read by the daemons.
          4. What should 'appropriate' permissions mean ? I guess read/write/execute (on directories) for the owner of the job is required. What should the permissions be for others ? If the task is the only consumer, then the permissions for others can be turned off. However, there are cases where the daemon / other processes might read the files. For instance:
            1. The distributed cache files can be shared across jobs.
            2. Jetty seems to require read permissions on the intermediate files to serve them to the reducers.
              In the above cases, can we make these world readable ?
            3. Task logs are currently generated under ${hadoop.log.dir}/userlogs/$taskid. These are served from the TaskLogServlet of the TaskTracker.

          5. Apart from launching the task itself, we may need some other actions to be performed as the job owner. For instance:
            1. Killing of a task
            2. Maybe setting up and cleaning up of the directories / files
            3. Running the debug script - mapred.map|reduce.task.debug.script

          Is there anything that I am missing ? Comments on the questions of shared directories / files - distributed cache, intermediate outputs, log files ?
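As one possible answer to the 'appropriate permissions' question above, here is a sketch of the strict-vs-shared split as concrete mode bits. The 0750-with-dedicated-group compromise for daemon-readable directories is an assumption on my part (it presumes tasks and the TaskTracker share a group), not something settled in this thread:

```c
/* Sketch: two permission classes for task-side directories.
 * The split itself is an assumption, offered for discussion. */
#include <sys/stat.h>
#include <sys/types.h>

/* Private task state: only the job owner may touch it. */
#define MODE_PRIVATE   (S_IRWXU)                      /* 0700 */
/* Directories the TaskTracker's servlets must read (intermediate
 * output, logs), with the group set to one shared by the daemon. */
#define MODE_TT_SHARED (S_IRWXU | S_IRGRP | S_IXGRP)  /* 0750 */

int make_task_dir(const char *path, int tt_must_read) {
    return mkdir(path, tt_must_read ? MODE_TT_SHARED : MODE_PRIVATE);
}
```

This would avoid making intermediate output world readable (Jetty reads it via the group bit instead), which seems closer to the strict stance discussed for the shared-file cases.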

          Hemanth Yamijala made changes -
          Assignee Hemanth Yamijala [ yhemanth ]
          Hemanth Yamijala made changes -
          Assignee Arun C Murthy [ acmurthy ]
          Hemanth Yamijala added a comment -

          I think HADOOP-4451 is either related or a duplicate of this. Correct ?

          Vinod Kumar Vavilapalli added a comment -

          Duplicate of HADOOP-4451.

          Arun C Murthy made changes -
          Field Original Value New Value
          Component/s security [ 12312526 ]
          Component/s mapred [ 12310690 ]
          Arun C Murthy created issue -

            People

            • Assignee:
              Hemanth Yamijala
              Reporter:
              Arun C Murthy
            • Votes:
              0 Vote for this issue
              Watchers:
              23 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development