Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1382

MRAsyncDiscService should tolerate missing local.dir

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently when some of the local.dir do not exist, MRAsyncDiscService will fail. It should only fail when all directories don't work.

      1. MAPREDUCE-1382.1.patch
        6 kB
        Zheng Shao
      2. MAPREDUCE-1382.branch-0.20.on.top.of.MAPREDUCE-1302.1.patch
        5 kB
        Zheng Shao
      3. MAPREDUCE-1382.2.patch
        6 kB
        Zheng Shao
      4. MAPREDUCE-1382.3.patch
        6 kB
        Zheng Shao
      5. MAPREDUCE-1382.4.patch
        7 kB
        Dan Adkins
      6. MAPREDUCE-1382.5.patch
        7 kB
        Tom White
      7. MAPREDUCE-1382.patch
        7 kB
        Tom White

        Issue Links

          Activity

          Hide
          Zheng Shao added a comment -

          Added a test that tests deletion non-existing files.

          Show
          Zheng Shao added a comment - Added a test that tests deletion non-existing files.
          Hide
          Zheng Shao added a comment -

          patch for 0.20

          Show
          Zheng Shao added a comment - patch for 0.20
          Hide
          Todd Lipcon added a comment -

          Hi Zheng,

          I wonder whether localFileSystem.listStatus will ever throw IOE in the case of a bad volume? Yours handles the case where the volume doesn't exist at all, but I can imagine if it's on a broken mount, the listStatus will get an I/O Error from the underlying filesystem and throw at that point. Do you need to catch this and handle in the same way as if it were null? Thanks.

          Show
          Todd Lipcon added a comment - Hi Zheng, I wonder whether localFileSystem.listStatus will ever throw IOE in the case of a bad volume? Yours handles the case where the volume doesn't exist at all, but I can imagine if it's on a broken mount, the listStatus will get an I/O Error from the underlying filesystem and throw at that point. Do you need to catch this and handle in the same way as if it were null? Thanks.
          Hide
          Zheng Shao added a comment -

          Todd, you are right. LocalFileSystem will throw IOE in that case.
          This patch addresses Todd's concern.

          Show
          Zheng Shao added a comment - Todd, you are right. LocalFileSystem will throw IOE in that case. This patch addresses Todd's concern.
          Hide
          Zheng Shao added a comment -

          Fixed unit test.

          Show
          Zheng Shao added a comment - Fixed unit test.
          Hide
          Scott Chen added a comment -

          Does it make sense to throw IOE if all volumes in the constructor are missing?

          Show
          Scott Chen added a comment - Does it make sense to throw IOE if all volumes in the constructor are missing?
          Hide
          Zheng Shao added a comment -

          I believe other logic in TaskTracker/JobTracker will fail and report in that case.

          Show
          Zheng Shao added a comment - I believe other logic in TaskTracker/JobTracker will fail and report in that case.
          Hide
          Scott Chen added a comment -

          +1, the patch looks good to me

          Show
          Scott Chen added a comment - +1, the patch looks good to me
          Hide
          Dan Adkins added a comment -

          I have applied the latest patch to trunk and still I cannot start a task tracker if the the directory does not exist:

          10/09/09 00:36:29 WARN mapred.TaskTracker: Task Tracker local can not create directory: /mnt/data8/mr20
          10/09/09 00:36:29 WARN util.MRAsyncDiskService: Cannot create toBeDeleted in /mnt/data8/mr20. Ignored.
          10/09/09 00:36:29 ERROR mapred.TaskTracker: Can not start task tracker because java.io.FileNotFoundException: File /mnt/data8/mr20/toBeDeleted does not exist.
          at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:301)
          at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1071)
          at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1096)
          at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:494)
          at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:103)
          at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:127)
          at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:616)
          at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1323)
          at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3508)

          Show
          Dan Adkins added a comment - I have applied the latest patch to trunk and still I cannot start a task tracker if the the directory does not exist: 10/09/09 00:36:29 WARN mapred.TaskTracker: Task Tracker local can not create directory: /mnt/data8/mr20 10/09/09 00:36:29 WARN util.MRAsyncDiskService: Cannot create toBeDeleted in /mnt/data8/mr20. Ignored. 10/09/09 00:36:29 ERROR mapred.TaskTracker: Can not start task tracker because java.io.FileNotFoundException: File /mnt/data8/mr20/toBeDeleted does not exist. at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:301) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1071) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1096) at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:494) at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:103) at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.<init>(MRAsyncDiskService.java:127) at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:616) at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1323) at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3508)
          Hide
          Dan Adkins added a comment -

          s/finally/catch (Exception e)/

          The previous patch was still throwing the exception instead of ignoring it as intended.

          Show
          Dan Adkins added a comment - s/finally/catch (Exception e)/ The previous patch was still throwing the exception instead of ignoring it as intended.
          Hide
          Scott Chen added a comment -

          The try { } block in cleanupAllVolumes() also needs the fix Dan pointed out.

          Show
          Scott Chen added a comment - The try { } block in cleanupAllVolumes() also needs the fix Dan pointed out.
          Hide
          Scott Chen added a comment -

          Zheng, Is it possible that you can add a simple test for this bug? Use some non-exist volume name to construct a MRAsyncDiskService.

          Show
          Scott Chen added a comment - Zheng, Is it possible that you can add a simple test for this bug? Use some non-exist volume name to construct a MRAsyncDiskService.
          Hide
          Tom White added a comment -

          Added a unit test. I also tested manually by starting a jobtracker which included unwritable directory specified in mapred.local.dir.

          Show
          Tom White added a comment - Added a unit test. I also tested manually by starting a jobtracker which included unwritable directory specified in mapred.local.dir.
          Hide
          Todd Lipcon added a comment -

          +1, looks good to me.

          Show
          Todd Lipcon added a comment - +1, looks good to me.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12467776/MAPREDUCE-1382.5.patch
          against trunk revision 1058374.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.mapred.TestControlledMapReduceJob
          org.apache.hadoop.mapred.TestSetupTaskScheduling

          -1 contrib tests. The patch failed contrib unit tests.

          -1 system test framework. The patch failed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12467776/MAPREDUCE-1382.5.patch against trunk revision 1058374. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.mapred.TestControlledMapReduceJob org.apache.hadoop.mapred.TestSetupTaskScheduling -1 contrib tests. The patch failed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//testReport/ Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/14//console This message is automatically generated.
          Hide
          Tom White added a comment -

          Minor change to the test (which I tested locally) to make the test directory writable again after the test has run so that the directory can be removed easily later (e.g. by ant clean). The failures are known or not caused by this patch, so I'd like to commit this.

          Show
          Tom White added a comment - Minor change to the test (which I tested locally) to make the test directory writable again after the test has run so that the directory can be removed easily later (e.g. by ant clean). The failures are known or not caused by this patch, so I'd like to commit this.
          Hide
          Todd Lipcon added a comment -

          +1, seems like a good fix.

          Show
          Todd Lipcon added a comment - +1, seems like a good fix.
          Hide
          Tom White added a comment -

          I've just committed this. Thanks, Zheng.

          Show
          Tom White added a comment - I've just committed this. Thanks, Zheng.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #581 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/581/)
          MAPREDUCE-1382. MRAsyncDiscService should tolerate missing local.dir. Contributed by Zheng Shao and tomwhite.

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #581 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/581/ ) MAPREDUCE-1382 . MRAsyncDiscService should tolerate missing local.dir. Contributed by Zheng Shao and tomwhite.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)

          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/ )

            People

            • Assignee:
              Zheng Shao
              Reporter:
              Scott Chen
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development