Hadoop Map/Reduce
MAPREDUCE-1213

TaskTrackers restart is very slow because it deletes distributed cache directory synchronously

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.1
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Directories specified in mapred.local.dir that can not be created now cause the TaskTracker to fail to start.

      Description

      We are seeing that when we restart a TaskTracker, it tries to recursively delete all the files in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This means that the TaskTracker cannot join the cluster for an extended period of time (up to 2 hours for us). The problem is acute when the distributed cache contains a few thousand files.
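      To make the failure mode concrete, here is a minimal sketch of the blocking pattern (the surrounding class, method name, and the "taskTracker/archive" path are illustrative assumptions; only FileUtil.fullyDelete is the actual Hadoop call named above):

      // Illustrative sketch only: not the actual TaskTracker code. It shows why a
      // synchronous recursive delete blocks startup; FileUtil.fullyDelete is the
      // real Hadoop utility described in this report, everything else is assumed.
      import java.io.File;
      import org.apache.hadoop.fs.FileUtil;

      public class SlowStartupSketch {
        void purgeDistributedCache(String[] localDirs) {
          for (String dir : localDirs) {
            // Walks and unlinks every entry under the cache directory before
            // returning; with thousands of files this can take hours, and the
            // TaskTracker cannot rejoin the cluster until it finishes.
            FileUtil.fullyDelete(new File(dir, "taskTracker/archive"));
          }
        }
      }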

      Attachments

      1. MAPREDUCE-1213.branch-0.20.patch
        16 kB
        Zheng Shao
      2. MAPREDUCE-1213.branch-0.20.2.patch
        14 kB
        Zheng Shao
      3. MAPREDUCE-1213.4.patch
        14 kB
        Zheng Shao
      4. MAPREDUCE-1213.3.patch
        14 kB
        Zheng Shao
      5. MAPREDUCE-1213.2.patch
        14 kB
        Zheng Shao
      6. MAPREDUCE-1213.1.patch
        14 kB
        Zheng Shao

        Issue Links

          Activity

          dhruba borthakur added a comment -

          One option is to rename the current work dir to a temporary location and then delete these files asynchronously via the technique used in HDFS-611.

          Zheng Shao added a comment -

          This patch fixes the problem by moving the file first and removing it later asynchronously using a thread pool per volume.
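          A rough sketch of that approach, assuming one single-threaded executor per mapred.local.dir volume; the class and method names below are illustrative, not the exact code in the attached patch:

          // Sketch of the rename-then-delete-asynchronously idea, assuming one
          // single-threaded executor per mapred.local.dir volume. Class and method
          // names here are illustrative, not the actual patch.
          import java.io.File;
          import java.io.IOException;
          import java.util.HashMap;
          import java.util.Map;
          import java.util.concurrent.ExecutorService;
          import java.util.concurrent.Executors;
          import org.apache.hadoop.fs.FileUtil;

          public class AsyncVolumeCleanerSketch {
            private static final String TRASH_SUBDIR = "toBeDeleted";
            private final Map<String, ExecutorService> executors = new HashMap<String, ExecutorService>();

            public AsyncVolumeCleanerSketch(String[] volumes) {
              for (String v : volumes) {
                executors.put(v, Executors.newSingleThreadExecutor());
              }
            }

            /** Rename the path out of the way immediately, then delete it in the background. */
            public boolean moveAndDelete(String volume, String pathName) throws IOException {
              File source = new File(volume, pathName);
              final File target = new File(new File(volume, TRASH_SUBDIR),
                  pathName.replace('/', '_') + "_" + System.nanoTime());
              target.getParentFile().mkdirs();
              if (!source.renameTo(target)) {   // cheap: a rename within the same volume
                return false;
              }
              executors.get(volume).execute(new Runnable() {
                public void run() {
                  FileUtil.fullyDelete(target); // the expensive recursive delete, off the startup path
                }
              });
              return true;
            }
          }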

          Todd Lipcon added a comment -

          Given that this almost duplicates the async disk service from the DN, could we move it into o.a.h.io in Common? The duplication seems a bit ugly since this is a nontrivial class.

          Zheng Shao added a comment -

          That's the plan but I suppose I cannot add that to common and at the same time use it in mapreduce?
          I will open one jira in common and get that committed, while this is also committed.

          Then we can clean things up by letting both hdfs and mapreduce use the class in common.

          Is that good?

          Zheng Shao added a comment -

          Fixed the comments and reorganized the class.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12427654/MAPREDUCE-1213.1.patch
          against trunk revision 889085.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 160 release audit warnings (more than the trunk's current 159 warnings).

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/185/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/185/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/185/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/185/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/185/console

          This message is automatically generated.

          Todd Lipcon added a comment -

          That's the plan but I suppose I cannot add that to common and at the same time use it in mapreduce?

          Right. I think you should open a new JIRA in common, and link this jira to be Blocked by it. Once that jira is committed, you can use this jira to use it in mapreduce, and file another JIRA to switch over HDFS to use it. Sound good?

          Zheng Shao added a comment -

          This one uses the newly-committed AsyncDiskService from common.

          Zheng Shao added a comment -

          This one uses the AsyncDiskService from common.

          Vinod Kumar Vavilapalli added a comment -

          The class org.apache.hadoop.mapred.CleanupQueue is very similar to MRAsyncDiskService. I think we should merge their functionality here or in another issue.

          Todd Lipcon added a comment -

          Patch looks mostly good to me.

          One comment: I think the name "moveAndDeleteLocalFiles" isn't quite descriptive enough. Perhaps "asyncDeletePathOnEachVolume" or something? The fact that the path is relative to all volumes and will be deleted on each of them is the key part I didn't understand at the first pass through the JIRA.

          I agree with Vinod that it would be nice to share code with CleanupQueue, but would be fine seeing it in another JIRA.

          Zheng Shao added a comment -

          Changed function name to moveAndDeleteFromEachVolume.

          AsyncDelete may have a different meaning - users might still see the files when the function returns. This code actually moves the file first.
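          For readers following the API discussion, a hypothetical usage sketch of the per-volume semantics is shown below; the package and constructor used here are assumptions, while the method name moveAndDeleteFromEachVolume is the one adopted in this patch:

          // Hypothetical usage sketch of the per-volume semantics discussed above.
          // The import path and constructor signature are assumptions; only the idea
          // that the same relative path is moved and then deleted on every configured
          // volume is what the comments describe.
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.mapreduce.util.MRAsyncDiskService;

          public class CacheCleanupSketch {
            public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              FileSystem localFs = FileSystem.getLocal(conf);
              String[] volumes = conf.getStrings("mapred.local.dir");

              // Assumed constructor: takes the local FileSystem and the configured volumes,
              // with one background deletion thread per volume.
              MRAsyncDiskService service = new MRAsyncDiskService(localFs, volumes);

              // The path is relative to *every* volume: "taskTracker/archive" is first
              // renamed into a trash subdirectory on each volume, then deleted by that
              // volume's background thread.
              service.moveAndDeleteFromEachVolume("taskTracker/archive");
            }
          }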

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12428074/MAPREDUCE-1213.4.patch
          against trunk revision 891111.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/199/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/199/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/199/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/199/console

          This message is automatically generated.

          Zheng Shao added a comment -

          The contrib test failures do not seem to be related to this patch. I saw the same errors on Hudson in the results of other patches.

          dhruba borthakur added a comment -

          Is this ready to be committed? If so, can somebody please create a separate JIRA to address Vinod's comments?

          dhruba borthakur added a comment -

          I just committed this. Thanks Zheng.

          Todd Lipcon added a comment -

          Hi Zheng. Do you have this patch against 0.20 as well? We're considering backporting. Thanks

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #196 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/196/)

          Zheng Shao added a comment -

          Patch for 0.20.

          Zheng Shao added a comment -

          Removed unnecessary changes for hadoop 0.20.

          Eli Collins added a comment -

          Marking as an incompatible change: with this change, directories in mapred.local.dir that can not be created are no longer tolerated. You'll get the following error and the TT will not start.

          ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Cannot create toBeDeleted in /doesNotExist

          Eli Collins added a comment -

          Filed MAPREDUCE-2049 to restore the old behavior (tolerate local dirs it can not create).

          Bhallamudi Venkata Siva Kamesh added a comment -

          While analyzing the patch, I found an issue. The moveAndDelete method below is called from both the JobTracker and the TaskTracker: the JobTracker calls it on its JobTracker folder and the TaskTracker on its TaskTracker folder (e.g. /home/hadoop/tasktracker/local). The method renames the current folder and then deletes it asynchronously. If the deletion step fails for some reason (such as an abrupt kill), the renamed folders are never deleted by anyone.

          MRAsyncDiskService.java
          
          public boolean moveAndDelete(String volume, String pathName) throws IOException {
            // Move the file right now, so that it can be deleted later
            String newPathName;
            synchronized (this) {
              newPathName = format.format(new Date()) + "_" + uniqueId;
              uniqueId++;
            }
            newPathName = SUBDIR + Path.SEPARATOR_CHAR + newPathName;

            Path source = new Path(volume, pathName);
            Path target = new Path(volume, newPathName);
            try {
              if (!localFileSystem.rename(source, target)) {
                return false;
              }
            } catch (FileNotFoundException e) {
              // Return false in case that the file is not found.
              return false;
            }
            DeleteTask task = new DeleteTask(volume, pathName, newPathName);
            execute(volume, task);
            return true;
          }
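
          One possible mitigation for the leftover-directory scenario raised above would be a startup sweep of the trash subdirectory; the sketch below is purely illustrative (the Deleter interface and helper names are not from the patch):

          // Illustrative sketch of one way to handle the leftover-directories concern:
          // on startup, rescan each volume's trash subdirectory and schedule deletion
          // of anything a previous (possibly killed) process left behind. This is a
          // hypothetical helper, not code from the patch; SUBDIR matches the
          // "toBeDeleted" name used in the snippet above.
          import java.io.File;
          import java.io.IOException;

          public class StartupSweepSketch {
            private static final String SUBDIR = "toBeDeleted";

            interface Deleter {
              /** Schedule an already-moved path for background deletion on a volume. */
              void scheduleDelete(String volume, File alreadyMoved) throws IOException;
            }

            /** Re-enqueue everything left under <volume>/toBeDeleted from earlier runs. */
            public static void cleanupLeftovers(String[] volumes, Deleter deleter) throws IOException {
              for (String volume : volumes) {
                File trash = new File(volume, SUBDIR);
                File[] leftovers = trash.listFiles();
                if (leftovers == null) {
                  continue; // trash dir does not exist yet on this volume
                }
                for (File leftover : leftovers) {
                  deleter.scheduleDelete(volume, leftover);
                }
              }
            }
          }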
          

            People

            • Assignee: Zheng Shao
            • Reporter: dhruba borthakur
            • Votes: 0
            • Watchers: 14
