Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1607

Task controller may not set permissions for a task cleanup attempt's log directory

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: task-controller
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Hide
      Fixed initialization of a task-cleanup attempt's log directory by setting correct permissions via task-controller. Added new log4j properties hadoop.tasklog.iscleanup and log4j.appender.TLA.isCleanup to conf/log4j.properties. Changed the userlogs for a task-cleanup attempt to go into its own directory instead of the original attempt directory. This is an incompatible change as old userlogs of cleanup attempt-dirs before this release will no longer be visible.
      Show
      Fixed initialization of a task-cleanup attempt's log directory by setting correct permissions via task-controller. Added new log4j properties hadoop.tasklog.iscleanup and log4j.appender.TLA.isCleanup to conf/log4j.properties. Changed the userlogs for a task-cleanup attempt to go into its own directory instead of the original attempt directory. This is an incompatible change as old userlogs of cleanup attempt-dirs before this release will no longer be visible.

      Description

      Task controller uses the INITIALIZE_TASK command to initialize task attempt and task log directories. For cleanup tasks, task attempt directories are named as task-attempt-id.cleanup. But log directories do not have the .cleanup suffix. The task controller is not aware of this distinction and tries to set permissions for log directories named task-attempt-id.cleanup. This is a NO-OP. Typically the task cleanup runs on the same node that ran the original task attempt as well. So, the task log directories are already properly initialized. However, the task cleanup can run on a node that has not run the original task attempt. In that case, the initialization would not happen and this could result in the cleanup task failing.

      1. patch-1607.txt
        52 kB
        Amareshwari Sriramadasu
      2. patch-1607-1.txt
        52 kB
        Amareshwari Sriramadasu
      3. patch-1607-2.txt
        52 kB
        Amareshwari Sriramadasu
      4. patch-1607-ydist.txt
        65 kB
        Amareshwari Sriramadasu

        Activity

        Hemanth Yamijala created issue -
        Amareshwari Sriramadasu made changes -
        Field Original Value New Value
        Assignee Amareshwari Sriramadasu [ amareshwari ]
        Hide
        Amareshwari Sriramadasu added a comment -

        Illustrative patch for Yahoo! distribution not for commit here.

        Patch does the following :
        1. It changes the directory structure of the task log directory from
        $hadoop.log.dir/userlogs/jobid/taskid to $hadoop.log.dir/userlogs/jobid/taskid[.cleanup]

        2. Changes the index file name from log.index[.cleanup] to log.index
        Did this change because log.index file need not have a suffix now, because it is going into separate directory.

        3. Changes the contents of index file for LOG_DIR from
        LOG_DIR:<attemptid> to LOG_DIR:<actual log location>
        This change is needed for the indexing into proper directory.
        We can just add LOG_DIR:<attemptid>[.cleanup], but caller has to pass jobid also to find the actual directory. I
        think that adding the actual log path will avoid any other directory structure change (like adding user) to make
        changes in all the places where log location is used. One more use case for doing this is distributing logs in all the disks, similar to "mapred.local.dir".

        4. Patch adds a regression test TestTaskTrackerLocalization.testCleanupTaskLocalization() which fails without the patch and passes with the patch.

        5. Patch adds a Junit test to test tasklogs and indices for all the tasks of job with jvm-reuse. The existing test
        TestTaskFail already tests logs of task attempts and cleanup attempts.

        Show
        Amareshwari Sriramadasu added a comment - Illustrative patch for Yahoo! distribution not for commit here. Patch does the following : 1. It changes the directory structure of the task log directory from $hadoop.log.dir/userlogs/jobid/taskid to $hadoop.log.dir/userlogs/jobid/taskid [.cleanup] 2. Changes the index file name from log.index [.cleanup] to log.index Did this change because log.index file need not have a suffix now, because it is going into separate directory. 3. Changes the contents of index file for LOG_DIR from LOG_DIR:<attemptid> to LOG_DIR:<actual log location> This change is needed for the indexing into proper directory. We can just add LOG_DIR:<attemptid> [.cleanup] , but caller has to pass jobid also to find the actual directory. I think that adding the actual log path will avoid any other directory structure change (like adding user) to make changes in all the places where log location is used. One more use case for doing this is distributing logs in all the disks, similar to "mapred.local.dir". 4. Patch adds a regression test TestTaskTrackerLocalization.testCleanupTaskLocalization() which fails without the patch and passes with the patch. 5. Patch adds a Junit test to test tasklogs and indices for all the tasks of job with jvm-reuse. The existing test TestTaskFail already tests logs of task attempts and cleanup attempts.
        Amareshwari Sriramadasu made changes -
        Attachment patch-1607-ydist.txt [ 12442538 ]
        Hide
        Amareshwari Sriramadasu added a comment -

        Patch for trunk. Patch removes tasklog truncation changes from the yahoo! patch and all other changes are almost same.

        Show
        Amareshwari Sriramadasu added a comment - Patch for trunk. Patch removes tasklog truncation changes from the yahoo! patch and all other changes are almost same.
        Amareshwari Sriramadasu made changes -
        Attachment patch-1607.txt [ 12443071 ]
        Hide
        Amareshwari Sriramadasu added a comment -

        Earlier patch had minor bug because of a copy-paste. Attached patch fixes that.

        Show
        Amareshwari Sriramadasu added a comment - Earlier patch had minor bug because of a copy-paste. Attached patch fixes that.
        Amareshwari Sriramadasu made changes -
        Attachment patch-1607-1.txt [ 12443075 ]
        Hide
        Amareshwari Sriramadasu added a comment -

        All linux task controller tests passed with the patch (both as tt user and some other user). I tested viewing of task logs and cleanup attempts logs on the single node machine.

        Show
        Amareshwari Sriramadasu added a comment - All linux task controller tests passed with the patch (both as tt user and some other user). I tested viewing of task logs and cleanup attempts logs on the single node machine.
        Amareshwari Sriramadasu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Amareshwari Sriramadasu made changes -
        Fix Version/s 0.21.0 [ 12314045 ]
        Affects Version/s 0.21.0 [ 12314045 ]
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12443075/patch-1607-1.txt
        against trunk revision 938805.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 21 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443075/patch-1607-1.txt against trunk revision 938805. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 21 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/152/console This message is automatically generated.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        I started looking at this patch.

        Show
        Vinod Kumar Vavilapalli added a comment - I started looking at this patch.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        Patch is not applying cleanly anymore to trunk: Minor merges are needed in Application.java and TestTaskTrackerLocalization

        I manually merged the patch myself to continue with the review. Mostly looks good except the following:

        • TaskRunner.java#setupLog4j(): level should not be INFO, but getLogLevel(conf).toString()
        • Tasklogservlet: The check haveTaskLog() for bypassing SYSLOG in tests will lead to missing log files from webui in case of JVMReuse. We should not do this.
        • Convert TestJvmReuse into junit 4 testcase?

        Canceling patch so as to include above comments.

        Show
        Vinod Kumar Vavilapalli added a comment - Patch is not applying cleanly anymore to trunk: Minor merges are needed in Application.java and TestTaskTrackerLocalization I manually merged the patch myself to continue with the review. Mostly looks good except the following: TaskRunner.java#setupLog4j(): level should not be INFO, but getLogLevel(conf).toString() Tasklogservlet: The check haveTaskLog() for bypassing SYSLOG in tests will lead to missing log files from webui in case of JVMReuse. We should not do this. Convert TestJvmReuse into junit 4 testcase? Canceling patch so as to include above comments.
        Vinod Kumar Vavilapalli made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Vinod Kumar Vavilapalli added a comment -

        Marking this as an incompatible change: We are changing the directory structure for cleanup-attempt log-dir and so users will not be able to look at old cleanup-attempt logs.

        Show
        Vinod Kumar Vavilapalli added a comment - Marking this as an incompatible change: We are changing the directory structure for cleanup-attempt log-dir and so users will not be able to look at old cleanup-attempt logs.
        Vinod Kumar Vavilapalli made changes -
        Hadoop Flags [Incompatible change]
        Hide
        Amareshwari Sriramadasu added a comment -

        Patch is updated to trunk and incorporates review comments.

        Show
        Amareshwari Sriramadasu added a comment - Patch is updated to trunk and incorporates review comments.
        Amareshwari Sriramadasu made changes -
        Attachment patch-1607-2.txt [ 12444103 ]
        Amareshwari Sriramadasu made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Fix Version/s 0.22.0 [ 12314184 ]
        Fix Version/s 0.21.0 [ 12314045 ]
        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12444103/patch-1607-2.txt
        against trunk revision 941564.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 21 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12444103/patch-1607-2.txt against trunk revision 941564. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 21 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/177/console This message is automatically generated.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        +1 for the patch.

        This one's good to go in. Just to be sure, can you please run a sanity manual test with cleanup tasks? And also execute the linux-task-controller tests once? Thanks!

        Show
        Vinod Kumar Vavilapalli added a comment - +1 for the patch. This one's good to go in. Just to be sure, can you please run a sanity manual test with cleanup tasks? And also execute the linux-task-controller tests once? Thanks!
        Hide
        Amareshwari Sriramadasu added a comment -

        Just to be sure, can you please run a sanity manual test with cleanup tasks? And also execute the linux-task-controller tests once?

        All linux task controller tests passed with the patch (both as tt user and some other user). I tested viewing of task logs and cleanup attempts logs on the single node machine.

        Show
        Amareshwari Sriramadasu added a comment - Just to be sure, can you please run a sanity manual test with cleanup tasks? And also execute the linux-task-controller tests once? All linux task controller tests passed with the patch (both as tt user and some other user). I tested viewing of task logs and cleanup attempts logs on the single node machine.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        I just committed this to trunk and branch 0.21(bug fix, so). Thanks Amareshwari!

        Show
        Vinod Kumar Vavilapalli added a comment - I just committed this to trunk and branch 0.21(bug fix, so). Thanks Amareshwari!
        Vinod Kumar Vavilapalli made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
        Release Note Fixed initialization of a task-cleanup attempt's log directory by setting correct permissions via task-controller. Changed the userlogs for a task-cleanup attempt to go into its own directory instead of the original attempt directory. This is an incompatible change as old userlogs of cleanup attempt-dirs will no longer be visible.
        Fix Version/s 0.21.0 [ 12314045 ]
        Fix Version/s 0.22.0 [ 12314184 ]
        Resolution Fixed [ 1 ]
        Vinod Kumar Vavilapalli made changes -
        Release Note Fixed initialization of a task-cleanup attempt's log directory by setting correct permissions via task-controller. Changed the userlogs for a task-cleanup attempt to go into its own directory instead of the original attempt directory. This is an incompatible change as old userlogs of cleanup attempt-dirs will no longer be visible. Fixed initialization of a task-cleanup attempt's log directory by setting correct permissions via task-controller. Added new log4j properties hadoop.tasklog.iscleanup and log4j.appender.TLA.isCleanup to conf/log4j.properties. Changed the userlogs for a task-cleanup attempt to go into its own directory instead of the original attempt directory. This is an incompatible change as old userlogs of cleanup attempt-dirs before this release will no longer be visible.
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Hemanth Yamijala
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development