Hadoop Map/Reduce / MAPREDUCE-2364

Shouldn't hold lock on rjob while localizing resources.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.203.0
    • Fix Version/s: 0.20.204.0
    • Component/s: tasktracker
    • Labels:
      None

      Description

      There is a deadlock while localizing resources on the TaskTracker.

      1. MAPREDUCE-2364.patch
        0.9 kB
        Binglin Chang
      2. no-lock-localize-branch-0.20-security.patch
        6 kB
        Devaraj Das
      3. no-lock-localize-trunk.patch
        5 kB
        Binglin Chang

        Issue Links

          Activity

          Owen O'Malley created issue -
          Binglin Chang added a comment -

          We encountered the same problem: when the TaskTracker downloads and unjars a very large job.jar in localizeJob(), it stops sending heartbeats and its web service hangs too.
          Our solution is to add a new lock, called localizing, to the RunningJob class. Instead of holding the whole rjob lock during localization, only rjob.localizing is locked.
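
          The locking idea described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the RunningJob class here is a hypothetical stand-in for the TaskTracker's RunningJob, and the localized flag, the helper methods, and the Runnable standing in for the download and unjar step are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: RunningJob is a hypothetical stand-in, not the real TaskTracker class.
class LocalizeLockSketch {

    static class RunningJob {
        // Dedicated lock held for the whole (slow) localization step.
        final Object localizing = new Object();
        // Guarded by the RunningJob monitor, which is only ever held briefly.
        private boolean localized = false;

        synchronized boolean isLocalized() { return localized; }
        synchronized void markLocalized() { localized = true; }
    }

    // Runs slowWork (standing in for download and unjar) at most once per job,
    // without holding the rjob monitor while it runs.
    static void localizeJob(RunningJob rjob, Runnable slowWork) {
        synchronized (rjob.localizing) {
            if (rjob.isLocalized()) {
                return; // another thread already localized this job
            }
            slowWork.run();
            rjob.markLocalized(); // brief hold of the rjob monitor
        }
    }

    public static void main(String[] args) {
        RunningJob rjob = new RunningJob();
        AtomicBoolean otherThreadGotLock = new AtomicBoolean(false);

        localizeJob(rjob, () -> {
            // While the slow step runs, a second thread can still lock rjob.
            Thread heartbeat = new Thread(() -> {
                synchronized (rjob) {
                    otherThreadGotLock.set(true);
                }
            });
            heartbeat.start();
            try {
                heartbeat.join(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        System.out.println("rjob lock free during localization: " + otherThreadGotLock.get());
        System.out.println("localized: " + rjob.isLocalized());
    }
}
```

          In this sketch the only nesting order is localizing first, then a brief hold of the rjob monitor, so threads that need rjob alone (a heartbeat or status page, for example) are never blocked for the duration of the download.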

          Binglin Chang added a comment -

          trunk patch

          Binglin Chang made changes -
          Field Original Value New Value
          Attachment MAPREDUCE-2364.patch [ 12480768 ]
          Binglin Chang made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480768/MAPREDUCE-2364.patch
          against trunk revision 1129771.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestMRCLI

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/328//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/328//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/328//console

          This message is automatically generated.

          Devaraj Das added a comment -

          Hi Binglin, I thought I'd attach the patch that I did for branch-0.20-security. The crux of the patch you submitted and the one I did is mostly the same.
          Please have a look at this one, and see if you can map it to a trunk patch. Thanks!

          Devaraj Das made changes -
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481414/no-lock-localize-branch-0.20-security.patch
          against trunk revision 1131265.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/345//console

          This message is automatically generated.

          Binglin Chang made changes -
          Attachment no-lock-localize-trunk.patch [ 12481449 ]
          Binglin Chang added a comment -

          trunk patch

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481449/no-lock-localize-trunk.patch
          against trunk revision 1131265.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestMRCLI

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/348//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/348//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/348//console

          This message is automatically generated.

          Steve Loughran made changes -
          Fix Version/s 0.20.204.0 [ 12316318 ]
          Fix Version/s 0.20.203.0 [ 12316151 ]
          Liyin Liang added a comment -

          I think this issue is the same as MAPREDUCE-2209.

          Subroto Sanyal added a comment -

          Hi Devaraj,
          MAPREDUCE-2209 also resolves the same issue, and it additionally aims to fix one more case of thread blocking.
          Could you please look at the MAPREDUCE-2209 patch? The patch provided in that issue is for version 0.23.

          Devaraj Das added a comment -

          Subroto, I see a significant difference in the patches attached to MAPREDUCE-2209 and the last one here. I'll need to look at the details but if you have time could you please take a look at the patch attached here and see if this makes sense (given this patch predates the patch on MAPREDUCE-2209; I am sorry that I didn't look at the patch here earlier).

          Owen O'Malley made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Owen O'Malley added a comment -

          Hadoop 0.20.204.0 was just released.

          Owen O'Malley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Ravi Prakash made changes -
          Link This issue is related to MAPREDUCE-4088 [ MAPREDUCE-4088 ]
          Transition                    Time In Source Status   Execution Times   Last Executer   Last Execution Date
          Open -> Patch Available       85d 15h 10m             1                 Binglin Chang   01/Jun/11 14:19
          Patch Available -> Resolved   85d 6h 52m              1                 Owen O'Malley   25/Aug/11 21:12
          Resolved -> Closed            8d 2h 1m                1                 Owen O'Malley   02/Sep/11 23:13

            People

            • Assignee:
              Devaraj Das
              Reporter:
              Owen O'Malley
            • Votes:
              0
              Watchers:
              8
