Hadoop Map/Reduce: MAPREDUCE-2804

"Creation of symlink to attempt log dir failed." message is not useful

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0
    • Fix Version/s: 0.20.204.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Removed duplicate chmods of job log dir that were vulnerable to race conditions between tasks. Also improved the messages when the symlinks failed to be created.

      Description

      In attempting to qualify the 204 RC2 release, my tasktracker logs are filled with the above message. I'd love to do something about it, but since it doesn't tell me what exactly it is trying to symlink I cannot unless I dig into the source code.
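To illustrate the kind of message the description asks for, here is a minimal sketch of a symlink helper whose failure message names both ends of the link and the underlying cause. It is not the code from the patch: the class `SymlinkLogSketch`, its methods, and the directory names are hypothetical, and it uses `java.nio.file.Files.createSymbolicLink` purely for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the TaskTracker code: shows a failure message
// that says what was being linked to what, and why it failed.
final class SymlinkLogSketch {

    // Builds a message that names the link, the target, and the cause.
    static String describeFailure(Path link, Path target, Exception cause) {
        return "Creation of symlink from " + link + " to " + target
                + " failed: " + cause;
    }

    // Tries to create link -> target.
    // Returns null on success, or a descriptive message on failure.
    static String trySymlink(Path link, Path target) {
        try {
            Files.createSymbolicLink(link, target);
            return null;
        } catch (IOException | UnsupportedOperationException | SecurityException e) {
            return describeFailure(link, target, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout for the demo only: one attempt directory and one link.
        Path dir = Files.createTempDirectory("userlogs");
        Path target = Files.createDirectory(dir.resolve("attempt_on_disk"));
        Path link = dir.resolve("attempt_link");

        System.out.println(trySymlink(link, target)); // first attempt
        System.out.println(trySymlink(link, target)); // link already exists, so a message is returned
    }
}
```

With both paths in the message, an operator can check the permissions and existence of each directory without reading the source.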

      1. mr-2804-205.patch
        18 kB
        Owen O'Malley
      2. mr-2804.patch
        20 kB
        Owen O'Malley

          Activity

          Allen Wittenauer added a comment -

          This came as part of the changes in 204. I can't find the same message in 203.

          Ravi Gummadi added a comment -

          MAPREDUCE-2415 distributed the userlogs across multiple disks. Its implementation relies on creating a symlink from the attempt directory in the (older) userlogs directory to the attempt directory on one of those disks.
          The design doc attached to MAPREDUCE-2415 gives more details.

          Allen Wittenauer added a comment -

          Thanks. Now we know the source of the patch that introduced this message.

          Owen O'Malley added a comment -

          I'm seeing this one too. Interestingly, it seems to be reported for a task that finishes without problem. But another task that is starting at the same time fails with no logs and no explanation.

          Owen O'Malley added a comment -

          I found it and will generate a patch this afternoon.

          Owen O'Malley added a comment -

          Fundamentally, there were two problems:

          1. The logging message was less than useful.
          2. There is a race condition when Java's primitives are used to change permissions on a file.

          This patch fixes the log messages to be much clearer about what is going wrong and removes the extra chmod of the job directory. It was that chmod that was causing the creation of the symlinks of a different task to fail when they were launching at the same time.
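The race described above can be sketched as follows. This is an illustration under an assumption, not the code the patch touched: it assumes the chmod was performed with the `java.io.File` permission setters, which need several separate calls to reach a given mode, so the directory passes through intermediate states that a concurrently launching task can observe. The class `ChmodRaceSketch` and its methods are hypothetical; the single-call alternative uses `java.nio.file.Files.setPosixFilePermissions` only for contrast.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical sketch of a non-atomic permission change on a shared directory.
final class ChmodRaceSketch {

    // java.io.File has no single "set mode" call, so reaching rwx------
    // takes several steps. Between the clearing and the setting calls the
    // directory has no permission bits at all.
    static void chmodInSteps(File dir) {
        dir.setReadable(false, false);   // clear read for everyone
        dir.setWritable(false, false);   // clear write for everyone
        dir.setExecutable(false, false); // clear execute: directory cannot be traversed
        // Window: a non-privileged task creating a symlink inside this
        // directory right now would be denied access.
        dir.setReadable(true, true);     // restore owner read
        dir.setWritable(true, true);     // restore owner write
        dir.setExecutable(true, true);   // restore owner execute
    }

    // Applies the whole mode (for example "rwxr-x---") in one call,
    // so no intermediate state is exposed by this method.
    static void chmodAtOnce(Path dir, String mode) throws IOException {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        Files.setPosixFilePermissions(dir, perms);
    }

    public static void main(String[] args) throws IOException {
        // Requires a POSIX file system for chmodAtOnce.
        Path jobDir = Files.createTempDirectory("joblog");
        chmodAtOnce(jobDir, "rwxr-x---");
        System.out.println(PosixFilePermissions.toString(
                Files.getPosixFilePermissions(jobDir)));
        chmodInSteps(jobDir.toFile());
        System.out.println(PosixFilePermissions.toString(
                Files.getPosixFilePermissions(jobDir)));
    }
}
```

Under that assumption, repeating the chmod for every task that shares the job directory reopens the window each time, which is consistent with the fix of removing the duplicate chmod rather than trying to make it faster.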

          Allen Wittenauer added a comment -

          +1

          Devaraj Das added a comment -

          +1

          Owen O'Malley added a comment -

          The patch conflicted with HADOOP-7100, so here is the proposed updated patch.

          Owen O'Malley added a comment -

          Oops, that should be HADOOP-7110.

          Devaraj Das added a comment -

          +1 for the branch-0.20-security patch.

          Owen O'Malley added a comment -

          Hadoop 0.20.204.0 was just released.


            People

            • Assignee:
              Owen O'Malley
              Reporter:
              Allen Wittenauer
            • Votes:
              0
              Watchers:
              3
