Hadoop Map/Reduce
MAPREDUCE-158

mapred.userlog.retain.hours killing long running tasks

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tasktracker
    • Labels:
      None
    • Environment:

      0.19.2-dev, r753365

      Description

      To reproduce the scenario, set mapred.userlog.retain.hours to 1 hour and run tasks that take longer than an hour.

      See the closed ticket HADOOP-5591 for more information.

        Issue Links

          Activity

          Ruyue Ma added a comment -

          This is related to mapred.userlog.retain.hours.

          Currently, every task JVM tries to clean up user logs in the hadoop/logs/userlogs directory. The purge check is
          return file.lastModified() < purgeTimeStamp, where 'file' is the attempt directory. But the directory's lastModified time does not change while the task writes to the log files inside it, so the proposed change is:

          + File indexFile = new File(file, "log.index");
          + if (indexFile.exists()) {
          +   return indexFile.lastModified() < purgeTimeStamp;
          + } else {
          +   return file.lastModified() < purgeTimeStamp;
          + }
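The check proposed in the comment above can be sketched as a standalone method. This is only an illustration: the class name UserLogPurgeCheck and the method name shouldPurge are hypothetical and are not taken from the Hadoop source; only the log.index file name and the timestamp comparison come from the comment.

```java
import java.io.File;

/** Hypothetical helper for illustration; not the actual Hadoop class. */
final class UserLogPurgeCheck {
    private UserLogPurgeCheck() {}

    /**
     * Returns true when the attempt directory looks older than the cutoff.
     * Prefers the modification time of log.index, falling back to the
     * directory's own modification time when no index file exists.
     */
    static boolean shouldPurge(File attemptDir, long purgeTimeStamp) {
        File indexFile = new File(attemptDir, "log.index");
        if (indexFile.exists()) {
            return indexFile.lastModified() < purgeTimeStamp;
        }
        return attemptDir.lastModified() < purgeTimeStamp;
    }
}
```

With this shape, an attempt directory whose log.index was modified after the cutoff is kept even if the directory's own timestamp is old.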

          Vinod Kumar Vavilapalli added a comment -

          Correcting the summary and description.

          Vinod Kumar Vavilapalli added a comment -

          This issue is circumvented after HADOOP-4374 and so is not visible beyond 0.20.

          Logs of running tasks are no longer cleaned up (the cleanup was what caused the failures) because HADOOP-4374 introduced log.tmp for atomic changes to log.index, and a running task periodically creates and writes to that file. This periodically changes the modification time of the attempt-log directory and prevents its cleanup even after mapred.userlog.retain.hours has elapsed.

          So what should be done here? Close this issue? Or make the check for running tasks explicit during cleanup?

          Amareshwari Sriramadasu added a comment -

          As per http://issues.apache.org/jira/browse/MAPREDUCE-927?focusedCommentId=12766412&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12766412,

          TaskTracker should delete the userlogs only once mapred.userlog.retain.hours have passed since job completion. It then becomes a TaskTracker config parameter, and there will be no permission issues for deletion.
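A minimal sketch of the retention rule described in this comment, measured from job completion rather than from file timestamps. The class name UserLogRetention, the method isExpired, and the convention that a negative completion time means the job is still running are all assumptions made for illustration; they are not the MAPREDUCE-927 implementation.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not the actual TaskTracker code. */
final class UserLogRetention {
    private UserLogRetention() {}

    /**
     * Returns true when the logs of a job may be deleted.
     *
     * @param jobCompletionMillis completion time in epoch millis, or a
     *                            negative value if the job is still running
     *                            (an assumed convention for this sketch)
     * @param retainHours         value of mapred.userlog.retain.hours
     * @param nowMillis           current time in epoch millis
     */
    static boolean isExpired(long jobCompletionMillis, int retainHours, long nowMillis) {
        if (jobCompletionMillis < 0) {
            return false; // never delete logs of a job that has not completed
        }
        long retainMillis = TimeUnit.HOURS.toMillis(retainHours);
        return nowMillis - jobCompletionMillis >= retainMillis;
    }
}
```

Because the decision depends only on the job's completion time, a long-running task cannot lose its logs while it is still writing them.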

          Amareshwari Sriramadasu added a comment -

          This issue no longer exists, because MAPREDUCE-927 resolves it by changing "mapred.userlog.retain.hours" to specify the time (in hours) for which user logs are retained after job completion.


            People

            • Assignee:
              Unassigned
              Reporter:
              Billy Pearson
            • Votes:
              0
              Watchers:
              0
