Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-476

extend DistributedCache to work locally (LocalJobRunner)

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Extended DistributedCache to work with LocalJobRunner.

      Description

      The DistributedCache does not work locally when using the outlined recipe at http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html

      Ideally, LocalJobRunner would take care of populating the JobConf and copying remote files to the local file sytem (http, assume hdfs = default fs = local fs when doing local development.

      1. 476.20S-2.patch
        97 kB
        Ravi Gummadi
      2. MAPREDUCE-476-v9.patch
        101 kB
        Philip Zeyliger
      3. MAPREDUCE-476-v8.patch
        100 kB
        Philip Zeyliger
      4. MAPREDUCE-476-v7.patch
        100 kB
        Philip Zeyliger
      5. v6-to-v7.patch
        13 kB
        Philip Zeyliger
      6. MAPREDUCE-476-20090818.txt
        94 kB
        Vinod Kumar Vavilapalli
      7. MAPREDUCE-476-20090814.1.txt
        94 kB
        Vinod Kumar Vavilapalli
      8. MAPREDUCE-476-v5-requires-MR711.patch
        92 kB
        Philip Zeyliger
      9. MAPREDUCE-476-v4-requires-MR711.patch
        92 kB
        Philip Zeyliger
      10. MAPREDUCE-476-v2-vs-v4.txt
        42 kB
        Philip Zeyliger
      11. MAPREDUCE-476-v3.try2.patch
        0.4 kB
        Philip Zeyliger
      12. MAPREDUCE-476-v2-vs-v3.try2.patch
        41 kB
        Philip Zeyliger
      13. MAPREDUCE-476-v3.patch
        91 kB
        Philip Zeyliger
      14. MAPREDUCE-476-v2-vs-v3.patch
        41 kB
        Philip Zeyliger
      15. MAPREDUCE-476-v2.patch
        85 kB
        Philip Zeyliger
      16. MAPREDUCE-476.patch
        24 kB
        Philip Zeyliger
      17. HADOOP-2914-v3.patch
        84 kB
        Philip Zeyliger
      18. HADOOP-2914-v2.patch
        112 kB
        Philip Zeyliger
      19. HADOOP-2914-v1-full.patch
        116 kB
        Philip Zeyliger
      20. HADOOP-2914-v1-since-4041.patch
        83 kB
        Philip Zeyliger

        Issue Links

          Activity

          sam rash created issue -
          Vinod Kumar Vavilapalli made changes -
          Field Original Value New Value
          Link This issue duplicates HADOOP-5174 [ HADOOP-5174 ]
          Hide
          Vinod Kumar Vavilapalli added a comment -

          This issue is the cause for HADOOP-5174. DistributedCache not yet extended to work locally results in Pipes failing with LocalJobRunner. Marking HADOOP-5174 as duplicate of this issue.

          Show
          Vinod Kumar Vavilapalli added a comment - This issue is the cause for HADOOP-5174 . DistributedCache not yet extended to work locally results in Pipes failing with LocalJobRunner. Marking HADOOP-5174 as duplicate of this issue.
          Hide
          Philip Zeyliger added a comment -

          I set out to get DistributedCache to work on local job runner — which wasn't too tricky — but I ended up refactoring the DistributedCache code quite a bit, which has made this patch large and perhaps unfriendly.

          DistributedCache code is used in three places:

          1. In user code, to (1) configure files to be cached and (2) retrieve the URIs of those files at runtime,
          2. In JobClient, to record some metadata information about the files desired in user code,
          3. And in TaskTracker/TaskRunner, to (1) maintain the cache, and (2) configure the cache per task.

          Most of the code for all of these uses was in public static methods in DistributedCache.java, though some pretty complicated logic about the DistributedCache was also in TaskTracker.java and TaskRunner.java. This made it tricky to tease out what the sacrosanct public APIs were. My interpretation is that the methods described in the documentation (http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache) are public APIs, and I have left those, and a few others, in tact. I separated out the other logic into two other classes, so that then I could avoid duplication between TaskRunner and LocalJobRunner.

          The current patch depends on HADOOP-4041, so I've attached two patches: one for Hudson, and another if you don't want to revisit the intersection with 4041 (which is largely uninteresting: either way code moves out of TaskRunner into DistributedCacheHandle).

          I've added some tests. TestDistributedCache has become TestDistributedCacheManager, and there's a new test in there. TestMRWithDistributedCache tests against both local and MiniMRClusters. I've also tested using streaming, with commands like:

          bin/hadoop jar build/contrib/streaming/hadoop-0.21.0-dev-streaming.jar \
            -files /etc/passwd -input /dev/null -output /tmp/output1 -mapper 'sh -c "test ! -z $mapred_cache_localFiles"'
          bin/hadoop jar build/contrib/streaming/hadoop-0.21.0-dev-streaming.jar \
            -jt local -files /etc/passwd -input /dev/null -output /tmp/output2 -mapper 'sh -c "test ! -z $mapred_cache_localFiles"'
          

          Is there a place where tests that use streaming to check other functionality could be checked in?

          I wanted to stop somewhere and send this out, but I can think of several potential future JIRAs:

          • The DistributedCache is in core/, but it only makes sense with mapred, so it probably should be relocated to mapred.
          • There's more work to be done to separate out the public interfaces from the private ones. The timestamp handling that's done by JobClient should really be done by something within the filecache package, for example. Much of the annoyance here stems from the haphazard ways in which Hadoop jobs serialize some configuration data to the configuration file. DistributedCache uses, I believe, 6 configuration keys, just to store ("file", "archive", "file+classpath", "archive+classpath", "filetimestamp", "archive+timestamp").
          • Speaking of configuration, DistributedCache will not likely work for files with a comma in their path, though perhaps URI encoding saves us there.
          • I haven't touched the DistributedCacheManager code except to move it there, but I suspect it could be significantly simplified now that it contains a Configuration object.
          • It's my belief that SVN r696957 (HADOOP-249) turned off the symlink feature and that it hasn't worked since then. That said, I haven't yet written the test that would confirm this.

          Looking forward to your feedback. – Philip

          Show
          Philip Zeyliger added a comment - I set out to get DistributedCache to work on local job runner — which wasn't too tricky — but I ended up refactoring the DistributedCache code quite a bit, which has made this patch large and perhaps unfriendly. DistributedCache code is used in three places: In user code, to (1) configure files to be cached and (2) retrieve the URIs of those files at runtime, In JobClient, to record some metadata information about the files desired in user code, And in TaskTracker/TaskRunner, to (1) maintain the cache, and (2) configure the cache per task. Most of the code for all of these uses was in public static methods in DistributedCache.java, though some pretty complicated logic about the DistributedCache was also in TaskTracker.java and TaskRunner.java. This made it tricky to tease out what the sacrosanct public APIs were. My interpretation is that the methods described in the documentation ( http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#DistributedCache ) are public APIs, and I have left those, and a few others, in tact. I separated out the other logic into two other classes, so that then I could avoid duplication between TaskRunner and LocalJobRunner. The current patch depends on HADOOP-4041 , so I've attached two patches: one for Hudson, and another if you don't want to revisit the intersection with 4041 (which is largely uninteresting: either way code moves out of TaskRunner into DistributedCacheHandle). I've added some tests. TestDistributedCache has become TestDistributedCacheManager, and there's a new test in there. TestMRWithDistributedCache tests against both local and MiniMRClusters. I've also tested using streaming, with commands like: bin/hadoop jar build/contrib/streaming/hadoop-0.21.0-dev-streaming.jar \ -files /etc/passwd -input /dev/null -output /tmp/output1 -mapper 'sh -c "test ! -z $mapred_cache_localFiles"' bin/hadoop jar build/contrib/streaming/hadoop-0.21.0-dev-streaming.jar \ -jt local -files /etc/passwd -input /dev/null -output /tmp/output2 -mapper 'sh -c "test ! -z $mapred_cache_localFiles"' Is there a place where tests that use streaming to check other functionality could be checked in? I wanted to stop somewhere and send this out, but I can think of several potential future JIRAs: The DistributedCache is in core/, but it only makes sense with mapred, so it probably should be relocated to mapred. There's more work to be done to separate out the public interfaces from the private ones. The timestamp handling that's done by JobClient should really be done by something within the filecache package, for example. Much of the annoyance here stems from the haphazard ways in which Hadoop jobs serialize some configuration data to the configuration file. DistributedCache uses, I believe, 6 configuration keys, just to store ("file", "archive", "file+classpath", "archive+classpath", "filetimestamp", "archive+timestamp"). Speaking of configuration, DistributedCache will not likely work for files with a comma in their path, though perhaps URI encoding saves us there. I haven't touched the DistributedCacheManager code except to move it there, but I suspect it could be significantly simplified now that it contains a Configuration object. It's my belief that SVN r696957 ( HADOOP-249 ) turned off the symlink feature and that it hasn't worked since then. That said, I haven't yet written the test that would confirm this. Looking forward to your feedback. – Philip
          Philip Zeyliger made changes -
          Attachment HADOOP-2914-v1-since-4041.patch [ 12409531 ]
          Attachment HADOOP-2914-v1-full.patch [ 12409532 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Assignee Philip Zeyliger [ philip ]
          Hide
          Amareshwari Sriramadasu added a comment -

          It's my belief that SVN r696957 (HADOOP-249) turned off the symlink feature and that it hasn't worked since then. That said, I haven't yet written the test that would confirm this.

          symlinks are not created TaskRunner.run(). They are created in setupWorkDir(), called from child. This is an optimization introduced in HADOOP-249. The test TestMiniMRDFSCaching tests DistributedCache and symlinks in it also .

          Show
          Amareshwari Sriramadasu added a comment - It's my belief that SVN r696957 ( HADOOP-249 ) turned off the symlink feature and that it hasn't worked since then. That said, I haven't yet written the test that would confirm this. symlinks are not created TaskRunner.run(). They are created in setupWorkDir(), called from child. This is an optimization introduced in HADOOP-249 . The test TestMiniMRDFSCaching tests DistributedCache and symlinks in it also .
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12409532/HADOOP-2914-v1-full.patch
          against trunk revision 780465.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          -1 release audit. The applied patch generated 498 release audit warnings (more than the trunk's current 495 warnings).

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12409532/HADOOP-2914-v1-full.patch against trunk revision 780465. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. -1 release audit. The applied patch generated 498 release audit warnings (more than the trunk's current 495 warnings). -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/444/console This message is automatically generated.
          Hide
          Philip Zeyliger added a comment -

          Thanks, Amareshwari. That's exactly what's going on, which means symlinks aren't working in LocalJobRunner, even after this patch. That seems like a fine first step. (Symlinks in local job runner might be tricky anyway — LocalJobRunner doesn't cd to a working dir, so making symlinks might dirty up your current directory.)

          I hadn't seen that specific test before, but it passes.

          The failing tests are:

          org.apache.hadoop.mapred.TestQueueCapacities.testMultipleQueues	0.0010	3
          org.apache.hadoop.mapred.lib.TestCombineFileInputFormat.<init>
          

          which are broken in trunk, I believe.

          Show
          Philip Zeyliger added a comment - Thanks, Amareshwari. That's exactly what's going on, which means symlinks aren't working in LocalJobRunner, even after this patch. That seems like a fine first step. (Symlinks in local job runner might be tricky anyway — LocalJobRunner doesn't cd to a working dir, so making symlinks might dirty up your current directory.) I hadn't seen that specific test before, but it passes. The failing tests are: org.apache.hadoop.mapred.TestQueueCapacities.testMultipleQueues 0.0010 3 org.apache.hadoop.mapred.lib.TestCombineFileInputFormat.<init> which are broken in trunk, I believe.
          Hide
          Tom White added a comment -

          Some initial comments:

          • In DistributedCacheHandle the class doc should go before the class declaration, not at the beginning of the file. Also need to add Apache license.
          • Use an enum rather than a boolean for isArchive in CacheFile.
          • We shouldn't remove public methods to DistributedCache, but rather deprecate them and remove them in a future release. Can DistributedCache delegate to DistributedCacheManager? I like the fact you have documented the intended audience for each public method of DistributedCache. (This paves the way to separating the public and private interfaces in future.)
          • Is there duplication between TestMRWithDistributedCache and tests that use MRCaching that could be avoided?
          • Could TestMRWithDistributedCache also test symlinking?
          Show
          Tom White added a comment - Some initial comments: In DistributedCacheHandle the class doc should go before the class declaration, not at the beginning of the file. Also need to add Apache license. Use an enum rather than a boolean for isArchive in CacheFile. We shouldn't remove public methods to DistributedCache, but rather deprecate them and remove them in a future release. Can DistributedCache delegate to DistributedCacheManager? I like the fact you have documented the intended audience for each public method of DistributedCache. (This paves the way to separating the public and private interfaces in future.) Is there duplication between TestMRWithDistributedCache and tests that use MRCaching that could be avoided? Could TestMRWithDistributedCache also test symlinking?
          Hide
          Philip Zeyliger added a comment -

          In DistributedCacheHandle the class doc should go before the class declaration, not at the beginning of the file. Also need to add Apache license.

          Done.

          Use an enum rather than a boolean for isArchive in CacheFile.

          Done.

          We shouldn't remove public methods to DistributedCache, but rather deprecate them and remove them in a future release. Can DistributedCache delegate to DistributedCacheManager? I like the fact you have documented the intended audience for each public method of DistributedCache. (This paves the way to separating the public and private interfaces in future.)

          Done.

          My current thinking on APIs (for a future JIRA) is that users should access DistributedCache through Job.addToCache(URI, flags) and Context.getCachedFiles(). But there's some more work to get there.

          Is there duplication between TestMRWithDistributedCache and tests that use MRCaching that could be avoided?

          Probably, but it's hard to tease out. MRCaching is more complicated than the test I'm adding, and does, I believe, test some things that I don't. On the other hand, TestMRWithDistributedCache tests the classpath stuff. I'm loath to delete tests too eagerly.

          Could TestMRWithDistributedCache also test symlinking?

          It does now test symlinking. However, I couldn't (easily) get LocalJobRunner to do symlinks appropriately. LocalJobRunner doesn't currently have a notion of task directory, and I think this patch is already quite large.

          Show
          Philip Zeyliger added a comment - In DistributedCacheHandle the class doc should go before the class declaration, not at the beginning of the file. Also need to add Apache license. Done. Use an enum rather than a boolean for isArchive in CacheFile. Done. We shouldn't remove public methods to DistributedCache, but rather deprecate them and remove them in a future release. Can DistributedCache delegate to DistributedCacheManager? I like the fact you have documented the intended audience for each public method of DistributedCache. (This paves the way to separating the public and private interfaces in future.) Done. My current thinking on APIs (for a future JIRA) is that users should access DistributedCache through Job.addToCache(URI, flags) and Context.getCachedFiles(). But there's some more work to get there. Is there duplication between TestMRWithDistributedCache and tests that use MRCaching that could be avoided? Probably, but it's hard to tease out. MRCaching is more complicated than the test I'm adding, and does, I believe, test some things that I don't. On the other hand, TestMRWithDistributedCache tests the classpath stuff. I'm loath to delete tests too eagerly. Could TestMRWithDistributedCache also test symlinking? It does now test symlinking. However, I couldn't (easily) get LocalJobRunner to do symlinks appropriately. LocalJobRunner doesn't currently have a notion of task directory, and I think this patch is already quite large.
          Philip Zeyliger made changes -
          Attachment HADOOP-2914-v2.patch [ 12410608 ]
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Philip Zeyliger added a comment -

          Submitting to test queue.

          I should note that the patch includes 4041, which needs to be committed before this. Simply ignore the IsolationRunner.java changes if reviewing.

          Show
          Philip Zeyliger added a comment - Submitting to test queue. I should note that the patch includes 4041, which needs to be committed before this. Simply ignore the IsolationRunner.java changes if reviewing.
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12410608/HADOOP-2914-v2.patch
          against trunk revision 785065.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 11 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/511/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12410608/HADOOP-2914-v2.patch against trunk revision 785065. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/511/console This message is automatically generated.
          Hide
          Philip Zeyliger added a comment -

          Regenerated patch after HADOOP-4041 was committed.

          Show
          Philip Zeyliger added a comment - Regenerated patch after HADOOP-4041 was committed.
          Philip Zeyliger made changes -
          Attachment HADOOP-2914-v3.patch [ 12410957 ]
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Tom White added a comment -

          +1

          > My current thinking on APIs (for a future JIRA) is that users should access DistributedCache through Job.addToCache(URI, flags) and Context.getCachedFiles(). But there's some more work to get there.

          I imagine this would be done as a part of moving DistributedCache from the filecache package to the mapreduce package (or a subpackage, see HADOOP-398).

          Show
          Tom White added a comment - +1 > My current thinking on APIs (for a future JIRA) is that users should access DistributedCache through Job.addToCache(URI, flags) and Context.getCachedFiles(). But there's some more work to get there. I imagine this would be done as a part of moving DistributedCache from the filecache package to the mapreduce package (or a subpackage, see HADOOP-398 ).
          Owen O'Malley made changes -
          Project Hadoop Common [ 12310240 ] Hadoop Map/Reduce [ 12310941 ]
          Key HADOOP-2914 MAPREDUCE-476
          Component/s mapred [ 12310690 ]
          Philip Zeyliger made changes -
          Link This issue incorporates HADOOP-6125 [ HADOOP-6125 ]
          Hide
          Philip Zeyliger added a comment -

          Regenerated patch after project split. I used:

          cat /Users/philip/Downloads/HADOOP-2914-v3.patch | sed -e 's,src/mapred/,src/java/,' | patch -p0

          Show
          Philip Zeyliger added a comment - Regenerated patch after project split. I used: cat /Users/philip/Downloads/ HADOOP-2914 -v3.patch | sed -e 's,src/mapred/,src/java/,' | patch -p0
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476.patch [ 12412325 ]
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Owen O'Malley made changes -
          Link This issue is related to MAPREDUCE-711 [ MAPREDUCE-711 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue blocks HADOOP-4493 [ HADOOP-4493 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue is blocked by MAPREDUCE-711 [ MAPREDUCE-711 ]
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Philip, a quick look at the latest patch revealed missing files, for e.g. DistributedCacheHandle.java.

          Also, can you update this with the latest patch at MAPREDUCE-711? I am assuming MAPREDUCE-711 is a blocker for this.

          Show
          Vinod Kumar Vavilapalli added a comment - Philip, a quick look at the latest patch revealed missing files, for e.g. DistributedCacheHandle.java. Also, can you update this with the latest patch at MAPREDUCE-711 ? I am assuming MAPREDUCE-711 is a blocker for this.
          Vinod Kumar Vavilapalli made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Philip Zeyliger added a comment -

          In light of MAPREDUCE-711, I generated a new patch. I applied MAPREDUCE-711-20090709-mapreduce.1.txt first, so this shouldn't be submitted to Hudson until after that gets checked in. I generated the patch by applying

          cat HADOOP-2914-v3.patch | sed -e 's%src/core/%src/java/%g' | sed -e 's%src/mapred/%src/java/%g' | sed -e 's%src/test/core%src/test/mapred%g' | patch -p0

          I had to clean up DistributedCache.java a tiny bit (there were 2 rejects) because some Javadoc links were removed in the project move; I've reinstated them. (I think they were removed because they pointed to MR from Common, but that's no longer an issue with MAPREDUCE-711.)

          Show
          Philip Zeyliger added a comment - In light of MAPREDUCE-711 , I generated a new patch. I applied MAPREDUCE-711 -20090709-mapreduce.1.txt first, so this shouldn't be submitted to Hudson until after that gets checked in. I generated the patch by applying cat HADOOP-2914 -v3.patch | sed -e 's%src/core/%src/java/%g' | sed -e 's%src/mapred/%src/java/%g' | sed -e 's%src/test/core%src/test/mapred%g' | patch -p0 I had to clean up DistributedCache.java a tiny bit (there were 2 rejects) because some Javadoc links were removed in the project move; I've reinstated them. (I think they were removed because they pointed to MR from Common, but that's no longer an issue with MAPREDUCE-711 .)
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v2.patch [ 12413042 ]
          Hide
          Vinod Kumar Vavilapalli added a comment -

          I've started looking at your latest patch in combination with an intermediate patch for MAPREDUCE-711. There is quite a bit of refactoring in this patch, though I find it really useful. Just to be sure, I will call this out explicitly to Devaraj/Owen offline.

          I have some comments in relation to the patch.

          General:

          • Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method.
          • Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager?

          DistributedCacheManager

          • Explicitly state in javadoc that it is not a public interface
          • This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class.

          DistributedCacheHandle

          • Explicitly state in javadoc that it is not a public interface
          • CacheFile.makeCacheFiles()
            • isClassPath can be renamed to shouldBePutInClasspath
            • paths can be renamed to pathsToPutInClasspath.
            • Use .equals method at +150
              if (cacheFile.type == CacheFile.FileType.ARCHIVE)
            • static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths,
                         boolean isArchive, List<CacheFile> target)

              should instead be

              static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths,
                         FileType fileType, List<CacheFile> target)
            • I think it would be cleaner to return target instead of passing it as an argument.
            • makeCacheFiles() method should be documented
          • setup()
            • This method is really useful, avoids a lot of code duplication!
            • Leave localTaskFile writing business back in TaskRunner itself. I think It is the task's responsiblity, not the DistributeCacheHandle's
            • cacheSubdir can better be an argument to setup() method instead of passing it to the constructor.
          • getClassPaths() : Document that it has to be called and useful only when is already invoked.
          Show
          Vinod Kumar Vavilapalli added a comment - I've started looking at your latest patch in combination with an intermediate patch for MAPREDUCE-711 . There is quite a bit of refactoring in this patch, though I find it really useful. Just to be sure, I will call this out explicitly to Devaraj/Owen offline. I have some comments in relation to the patch. General: Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method. Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager? DistributedCacheManager Explicitly state in javadoc that it is not a public interface This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class. DistributedCacheHandle Explicitly state in javadoc that it is not a public interface CacheFile.makeCacheFiles() isClassPath can be renamed to shouldBePutInClasspath paths can be renamed to pathsToPutInClasspath. Use .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE) static void makeCacheFiles(URI[] uris, String [] timestamps, Path[] paths, boolean isArchive, List<CacheFile> target) should instead be static void makeCacheFiles(URI[] uris, String [] timestamps, Path[] paths, FileType fileType, List<CacheFile> target) I think it would be cleaner to return target instead of passing it as an argument. makeCacheFiles() method should be documented setup() This method is really useful, avoids a lot of code duplication! Leave localTaskFile writing business back in TaskRunner itself. I think It is the task's responsiblity, not the DistributeCacheHandle's cacheSubdir can better be an argument to setup() method instead of passing it to the constructor. getClassPaths() : Document that it has to be called and useful only when is already invoked.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Some more comments:

          TaskTracker.initialize()

          • A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static.
          • The same problem exists with the deprecated purgeCache() method in DistributedCache.java

          JobClient.java

          • What happens to the code in here? Should this be refactored too/ are we going to do something about it?

          DistributedCache.java

          • getLocalCache() (+190) should be indirected to DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow error - the method calls itself as of now.
          • I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways.
          Show
          Vinod Kumar Vavilapalli added a comment - Some more comments: TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java JobClient.java What happens to the code in here? Should this be refactored too/ are we going to do something about it? DistributedCache.java getLocalCache() (+190) should be indirected to DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow error - the method calls itself as of now. I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          I am still to look at the changes in LocalJobRunner (the real part of the patch ) and the test-cases.

          Show
          Vinod Kumar Vavilapalli added a comment - I am still to look at the changes in LocalJobRunner (the real part of the patch ) and the test-cases.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Looked at the remaining parts of the patch. Some more comments.

          DistributedCacheHandle.makeClassLoader

          • use File.toURI.toURL() instead of File.toURL() directly. File.toURL() is deprecated.

          LocalJobRunner.java

          • TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the code in LocalJobRunner. Compilation(ant binary) breaks now.
          • TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. verify HADOOP-5174.

          TestDistributedCacheManager

          • Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager.
          • Minor: assertNonNull(+90) check can be moved after +91 and use localCacheFiles for the null check

          TestMRWithDistributedCache

          • Just documenting a TODO: Will need to fix this class's javadoc once MAPREDUCE-711 is in.
          • Fix the comment at +84. It is actually two archives.
          • Please put a comment for the class saying that it is NOT A FAST test, keeping in view the recent efforts to have a fast test target.
          Show
          Vinod Kumar Vavilapalli added a comment - Looked at the remaining parts of the patch. Some more comments. DistributedCacheHandle.makeClassLoader use File.toURI.toURL() instead of File.toURL() directly. File.toURL() is deprecated. LocalJobRunner.java TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the code in LocalJobRunner. Compilation(ant binary) breaks now. TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. verify HADOOP-5174 . TestDistributedCacheManager Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager. Minor: assertNonNull(+90) check can be moved after +91 and use localCacheFiles for the null check TestMRWithDistributedCache Just documenting a TODO: Will need to fix this class's javadoc once MAPREDUCE-711 is in. Fix the comment at +84. It is actually two archives. Please put a comment for the class saying that it is NOT A FAST test, keeping in view the recent efforts to have a fast test target.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so.

          Show
          Vinod Kumar Vavilapalli added a comment - Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so.
          Hide
          Philip Zeyliger added a comment -

          Vinod,

          Thanks for your comments and thorough review. I'll take a closer look over the next couple of days and post a new patch.

          Show
          Philip Zeyliger added a comment - Vinod, Thanks for your comments and thorough review. I'll take a closer look over the next couple of days and post a new patch.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Philip, any update on this issue?

          Show
          Vinod Kumar Vavilapalli added a comment - Philip, any update on this issue?
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v2-vs-v3.patch [ 12415081 ]
          Attachment MAPREDUCE-476-v3.patch [ 12415082 ]
          Hide
          Philip Zeyliger added a comment -

          Vinod,

          Yes. I've been hacking away at it today. Please ignore those last two updated diffs: while getting rid of some 80+ character lines, I fumbled some git stuff and produced bad patches. I'll be producing good ones after some more sanity checking either late today or tomorrow morning.

          – Philip

          Show
          Philip Zeyliger added a comment - Vinod, Yes. I've been hacking away at it today. Please ignore those last two updated diffs: while getting rid of some 80+ character lines, I fumbled some git stuff and produced bad patches. I'll be producing good ones after some more sanity checking either late today or tomorrow morning. – Philip
          Hide
          Philip Zeyliger added a comment -

          Ok, I think this patch is better generated. Yet again depends on MAPREDUCE-711.

          Show
          Philip Zeyliger added a comment - Ok, I think this patch is better generated. Yet again depends on MAPREDUCE-711 .
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v2-vs-v3.try2.patch [ 12415084 ]
          Attachment MAPREDUCE-476-v3.try2.patch [ 12415085 ]
          Hide
          Philip Zeyliger added a comment -

          Never mind, trying to rush before leaving the office, and the tests fail here. Back tomorrow.

          Show
          Philip Zeyliger added a comment - Never mind, trying to rush before leaving the office, and the tests fail here. Back tomorrow.
          Hide
          Philip Zeyliger added a comment -

          Hi Vinod,

          Thanks for the ping; got distracted by other things. And thanks again for the detailed
          review. My responses are below. I've generated a patch that shows the differences
          between v2 and v4, and also the patch, in a state where it still depends on MAPREDUCE-711.
          is there anything blocking MAPREDUCE-711 that prevents it from being committed?

          Also, sorry about the multiple uploads here. I had a very clever bug in there
          (caused by not thinking enough while resolving a merge conflict) that deleted
          the current working directory, recursively, in one of the tests. (TaskRunner
          is hard-coded to delete current working directory, which is ok, since it's
          typically a child process; not ok for LocalJobRunner.)

          I've run the "relevant" tests; the full tests take a while, so I'm running those
          in the background.

          $for i in TestMRWithDistributedCache TestMiniMRLocalFS TestMiniMRDFSCaching TestTrackerDistributedCacheManager; do; ant test Dtestcase=$i > test-out$i && echo "$i good" || echo "$i bad"; done
          TestMRWithDistributedCache good
          TestMiniMRLocalFS good
          TestMiniMRDFSCaching good
          TestTrackerDistributedCacheManager good

          There is quite a bit of refactoring in this patch, though I find it really useful.

          Yep. Having DistributedCache work locally is easy if you refactor the code
          a bit, so that's how I went at it.

          Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method.

          A handful of "git diff foo..bar | egrep "+++|+ .

          {80}

          " has done the trick,
          I think. The tricky bit is always fixing only the lines I've changed,
          and not all the lines in a given file, to preserve history and keep
          reviewing sane.

          Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager?

          I like those names better; thanks. Changed.

          DistributedCacheManager and DistributedCacheHandle: Explicitly state in javadoc that it is not a public interface

          Done.

          This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class.

          Don't think I agree here. We can deprecate the getLocalCache methods
          in DistributedCache right away. They delegate to each other, and one of them
          delegates to TrackerDistributedCacheManager. Ideally, I'd remove these
          altogether — Hadoop internally does not use these methods with
          this patch, and there's no sensible reason why someone else would,
          but since it's public, it's getting deprecated. But it's not
          being deprecated with a pointer to use something else; it's getting
          deprecated so that you don't use it at all.

          DistributedCacheHandle CacheFile.makeCacheFiles()

          isClassPath can be renamed to shouldBePutInClasspath

          Renamed to shouldBeAddedToClasspath.

          paths can be renamed to pathsToPutInClasspath.

          Renamed to pathsToBeAddedToClasspath

          Use .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE)

          I believe that technically it doesn't matter.
          The JDK implementation of equals() on java.lang.Enum is final,
          and hardcoded to "this==other". This is the only thing that makes
          sense, since there's only ever one instance of a given Enum.
          I took an inaccurate look at the code base, and == is the more
          common option.

          1. Inaccurate! Not a static analysis! Not even close!
            [1]doorstop:hadoop-mapreduce(140142)$ack "([a-zA-Z]\.[a-zA-Z]\.equals" src | wc -l
            11
            [0]doorstop:hadoop-mapreduce(140143)$ack "([a-zA-Z]\.[a-zA-Z] ==" src | wc -l
            127

          If you feel strongly about this, happy to change it, but I think == is
          more consistent.

          makeCacheFiles: boolean isArchive -> FileType fileType

          done.

          I think it would be cleaner to return target instead of passing it as an argument.

          Done.

          makeCacheFiles() method should be documented

          Done.

          setup() This method is really useful, avoids a lot of code duplication!

          Ok.

          Leave localTaskFile writing business back in TaskRunner itself. I think It is the task's responsiblity, not the DistributeCacheHandle's

          Good call; done.

          cacheSubdir can better be an argument to setup() method instead of passing it to the constructor.

          Good idea; done.

          getClassPaths() : Document that it has to be called and useful only when is already invoked.

          Done. I've made it throw an exception if it's called erroneously, since I could
          see that causing trouble for developers.

          TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java

          If my understanding is correct,
          in the course of normal operations, TaskTracker.initialize() is only called once, by
          TaskTracker.run(). The comment there says:

          /**

          • The server retry loop.
          • This while-loop attempts to connect to the JobTracker. It only
          • loops when the old TaskTracker has gone bad (its state is
          • stale somehow) and we need to reinitialize everything.
            */

          At startup time (and whenever the TaskTracker decides to lose all of its state
          and reset itself, which is essentially the same), the TaskTracker initializes
          the TrackerDistributedCacheManager object. This is also the only time it's
          allowed to "clear" the cache, since there are for certain no tasks running
          at initialization time that depend on those.

          Part of the whole refactoring here is to avoid static members and methods
          as much as possible, because they make it quite tricky to test, deprecate,
          and generally play nice.

          JobClient.java What happens to the code in here? Should this be refactored too/ are we going to do something about it?

          Good question I do think that JobClient should be using some methods in the
          filecache package, instead of hard-coding a lot of logic. That said, I chose to
          stop somewhere to avoid making this patch even harder to review. I think it's
          ripe for a future JIRA.

          DistributedCache.java getLocalCache() (+190) should be indirected to DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow error - the method calls itself as of now.

          Oy, good catch.

          I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways.

          I agree. A few of them are used to manage the Configuration object. (In my mind, we're
          serializing and de-serializing a set of requirements for the distributed cache
          into the text configuration, and doing so a bit haphazardly.) I was very tempted
          to remove all the ones that are only meant to be internal, but Tom advised me that
          I need to keep them deprecated for a version.

          Again, I think moving those methods into a more private place is a good task to do
          along with changing how JobClient calls into this stuff.

          DistributedCacheHandle.makeClassLoader use File.toURI.toURL() instead of File.toURL() directly. File.toURL() is deprecated.

          Done.

          LocalJobRunner.java TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the code in LocalJobRunner. Compilation(ant binary) breaks now.

          Oops, done.

          TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. verify HADOOP-5174.

          I don't use Pipes typically, and it doesn't seem to compile on my Mac.
          I'll try it on a Linux machine, but if it's easy/handy for you, it'd be great
          to verify that bug.

          TestDistributedCacheManager * Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager.

          I extracted the method and moved it to TrackerDistributedCacheManager. Now that
          that has "Tracker" in the name, it's a little misplaced, but it's not too bad.

          Minor: assertNonNull(+90) check can be moved after +91 and use localCacheFiles for the null check

          Done.

          TestMRWithDistributedCache Just documenting a TODO: Will need to fix this class's javadoc once MAPREDUCE-711 is in.

          Indeed.

          Fix the comment at +84. It is actually two archives.

          done.

          Please put a comment for the class saying that it is NOT A FAST test, keeping in view the recent efforts to have a fast test target.

          Done.

          Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so.

          I agree. I didn't find the relevant tests right away, and there are some bits
          covered by all of them. I'd prefer a separate JIRA.

          That pretty much covers the points you brought up. Take another look?

          Cheers,

          – Philip

          Show
          Philip Zeyliger added a comment - Hi Vinod, Thanks for the ping; got distracted by other things. And thanks again for the detailed review. My responses are below. I've generated a patch that shows the differences between v2 and v4, and also the patch, in a state where it still depends on MAPREDUCE-711 . is there anything blocking MAPREDUCE-711 that prevents it from being committed? Also, sorry about the multiple uploads here. I had a very clever bug in there (caused by not thinking enough while resolving a merge conflict) that deleted the current working directory, recursively, in one of the tests. (TaskRunner is hard-coded to delete current working directory, which is ok, since it's typically a child process; not ok for LocalJobRunner.) I've run the "relevant" tests; the full tests take a while, so I'm running those in the background. $for i in TestMRWithDistributedCache TestMiniMRLocalFS TestMiniMRDFSCaching TestTrackerDistributedCacheManager; do; ant test Dtestcase=$i > test-out $i && echo "$i good" || echo "$i bad"; done TestMRWithDistributedCache good TestMiniMRLocalFS good TestMiniMRDFSCaching good TestTrackerDistributedCacheManager good There is quite a bit of refactoring in this patch, though I find it really useful. Yep. Having DistributedCache work locally is easy if you refactor the code a bit, so that's how I went at it. Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method. A handful of "git diff foo..bar | egrep " +++| + . {80} " has done the trick, I think. The tricky bit is always fixing only the lines I've changed, and not all the lines in a given file, to preserve history and keep reviewing sane. Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager? I like those names better; thanks. Changed. DistributedCacheManager and DistributedCacheHandle: Explicitly state in javadoc that it is not a public interface Done. This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class. Don't think I agree here. We can deprecate the getLocalCache methods in DistributedCache right away. They delegate to each other, and one of them delegates to TrackerDistributedCacheManager. Ideally, I'd remove these altogether — Hadoop internally does not use these methods with this patch, and there's no sensible reason why someone else would, but since it's public, it's getting deprecated. But it's not being deprecated with a pointer to use something else; it's getting deprecated so that you don't use it at all. DistributedCacheHandle CacheFile.makeCacheFiles() isClassPath can be renamed to shouldBePutInClasspath Renamed to shouldBeAddedToClasspath. paths can be renamed to pathsToPutInClasspath. Renamed to pathsToBeAddedToClasspath Use .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE) I believe that technically it doesn't matter. The JDK implementation of equals() on java.lang.Enum is final, and hardcoded to "this==other". This is the only thing that makes sense, since there's only ever one instance of a given Enum. I took an inaccurate look at the code base, and == is the more common option. Inaccurate! Not a static analysis! Not even close! [1] doorstop:hadoop-mapreduce(140142)$ack "( [a-zA-Z] \. [a-zA-Z] \.equals" src | wc -l 11 [0] doorstop:hadoop-mapreduce(140143)$ack "( [a-zA-Z] \. [a-zA-Z] ==" src | wc -l 127 If you feel strongly about this, happy to change it, but I think == is more consistent. makeCacheFiles: boolean isArchive -> FileType fileType done. I think it would be cleaner to return target instead of passing it as an argument. Done. makeCacheFiles() method should be documented Done. setup() This method is really useful, avoids a lot of code duplication! Ok. Leave localTaskFile writing business back in TaskRunner itself. I think It is the task's responsiblity, not the DistributeCacheHandle's Good call; done. cacheSubdir can better be an argument to setup() method instead of passing it to the constructor. Good idea; done. getClassPaths() : Document that it has to be called and useful only when is already invoked. Done. I've made it throw an exception if it's called erroneously, since I could see that causing trouble for developers. TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java If my understanding is correct, in the course of normal operations, TaskTracker.initialize() is only called once, by TaskTracker.run(). The comment there says: /** The server retry loop. This while-loop attempts to connect to the JobTracker. It only loops when the old TaskTracker has gone bad (its state is stale somehow) and we need to reinitialize everything. */ At startup time (and whenever the TaskTracker decides to lose all of its state and reset itself, which is essentially the same), the TaskTracker initializes the TrackerDistributedCacheManager object. This is also the only time it's allowed to "clear" the cache, since there are for certain no tasks running at initialization time that depend on those. Part of the whole refactoring here is to avoid static members and methods as much as possible, because they make it quite tricky to test, deprecate, and generally play nice. JobClient.java What happens to the code in here? Should this be refactored too/ are we going to do something about it? Good question I do think that JobClient should be using some methods in the filecache package, instead of hard-coding a lot of logic. That said, I chose to stop somewhere to avoid making this patch even harder to review. I think it's ripe for a future JIRA. DistributedCache.java getLocalCache() (+190) should be indirected to DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow error - the method calls itself as of now. Oy, good catch. I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways. I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff. DistributedCacheHandle.makeClassLoader use File.toURI.toURL() instead of File.toURL() directly. File.toURL() is deprecated. Done. LocalJobRunner.java TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the code in LocalJobRunner. Compilation(ant binary) breaks now. Oops, done. TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. verify HADOOP-5174 . I don't use Pipes typically, and it doesn't seem to compile on my Mac. I'll try it on a Linux machine, but if it's easy/handy for you, it'd be great to verify that bug. TestDistributedCacheManager * Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager. I extracted the method and moved it to TrackerDistributedCacheManager. Now that that has "Tracker" in the name, it's a little misplaced, but it's not too bad. Minor: assertNonNull(+90) check can be moved after +91 and use localCacheFiles for the null check Done. TestMRWithDistributedCache Just documenting a TODO: Will need to fix this class's javadoc once MAPREDUCE-711 is in. Indeed. Fix the comment at +84. It is actually two archives. done. Please put a comment for the class saying that it is NOT A FAST test, keeping in view the recent efforts to have a fast test target. Done. Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so. I agree. I didn't find the relevant tests right away, and there are some bits covered by all of them. I'd prefer a separate JIRA. That pretty much covers the points you brought up. Take another look? Cheers, – Philip
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v2-vs-v4.txt [ 12415133 ]
          Attachment MAPREDUCE-476-v4-requires-MR711.patch [ 12415134 ]
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Replies in-line.

          Is there anything blocking MAPREDUCE-711 that prevents it from being committed?

          No, I've asked Owen to commit it.

          I had a very clever bug in there(caused by not thinking enough while resolving a merge conflict) that deleted the current working directory, recursively, in one of the tests. (TaskRunner is hard-coded to delete current working directory, which is ok, since it's typically a child process; not ok for LocalJobRunner.)

          Nice catch! I too had burnt myself once trying to refactor out the code in Child.java to a thread, and ended up deleting my whole work space because of the same issue

          The tricky bit is always fixing only the lines I've changed, and not all the lines in a given file, to preserve history and keep reviewing sane.

          That is correct.

          This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class.

          Don't think I agree here. We can deprecate the getLocalCache methods in DistributedCache right away. They delegate to each other, and one of them delegates to TrackerDistributedCacheManager. Ideally, I'd remove these altogether — Hadoop internally does not use these methods with this patch, and there's no sensible reason why someone else would, but since it's public, it's getting deprecated. But it's not being deprecated with a pointer to use something else; it's getting deprecated so that you don't use it at all.

          Okay, i thought those methods were used internally atleast. But if, as you've said, they are not used at all internally, +1 for deprecating them all together.

          Using .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE). If you feel strongly about this, happy to change it, but I think == is more consistent.

          I am fine, no problem with this.

          TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java

          If my understanding is correct, in the course of normal operations, TaskTracker.initialize() is only called once, by TaskTracker.run(). At startup time (and whenever the TaskTracker decides to lose all of its state and reset itself, which is essentially the same), the TaskTracker initializes the TrackerDistributedCacheManager object. This is also the only time it's allowed to "clear" the cache, since there are for certain no tasks running at initialization time that depend on those.

          Ah, I was under the impression that distribute cache files are purged EXPLICITLY when a TT reinitializes, but it looks like we do a full delete on TaskTracker's local files during re-init. So, the code in your patch will work for now, but it won't be in future when we might need explicit purge of distributed cache files when they are owned by users themselves (HADOOP-4493). We will change it then, +1 for the current changes.

          JobClient.java What happens to the code in here? Should this be refactored too/ are we going to do something about it?

          Good question I do think that JobClient should be using some methods in the filecache package, instead of hard-coding a lot of logic. That said, I chose to stop somewhere to avoid making this patch even harder to review. I think it's ripe for a future JIRA.

          Okay.

          I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways.

          I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff.

          +1. So are you planning to do in the next version or in this patch itself?

          I don't use Pipes typically, and it doesn't seem to compile on my Mac. I'll try it on a Linux machine, but if it's easy/handy for you, it'd be great to verify that bug.

          Will do so.

          TestDistributedCacheManager * Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager.

          I extracted the method and moved it to TrackerDistributedCacheManager. Now that that has "Tracker" in the name, it's a little misplaced, but it's not too bad.

          I think that's agreeable for now. We can move it to a more appropriate place in future if we find one.

          Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so.

          I'd prefer a separate JIRA.

          Fine. +1.

          Show
          Vinod Kumar Vavilapalli added a comment - Replies in-line. Is there anything blocking MAPREDUCE-711 that prevents it from being committed? No, I've asked Owen to commit it. I had a very clever bug in there(caused by not thinking enough while resolving a merge conflict) that deleted the current working directory, recursively, in one of the tests. (TaskRunner is hard-coded to delete current working directory, which is ok, since it's typically a child process; not ok for LocalJobRunner.) Nice catch! I too had burnt myself once trying to refactor out the code in Child.java to a thread, and ended up deleting my whole work space because of the same issue The tricky bit is always fixing only the lines I've changed, and not all the lines in a given file, to preserve history and keep reviewing sane. That is correct. This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the correponding method in DistributedCacheManager class. Don't think I agree here. We can deprecate the getLocalCache methods in DistributedCache right away. They delegate to each other, and one of them delegates to TrackerDistributedCacheManager. Ideally, I'd remove these altogether — Hadoop internally does not use these methods with this patch, and there's no sensible reason why someone else would, but since it's public, it's getting deprecated. But it's not being deprecated with a pointer to use something else; it's getting deprecated so that you don't use it at all. Okay, i thought those methods were used internally atleast. But if, as you've said, they are not used at all internally, +1 for deprecating them all together. Using .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE). If you feel strongly about this, happy to change it, but I think == is more consistent. I am fine, no problem with this. TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java If my understanding is correct, in the course of normal operations, TaskTracker.initialize() is only called once, by TaskTracker.run(). At startup time (and whenever the TaskTracker decides to lose all of its state and reset itself, which is essentially the same), the TaskTracker initializes the TrackerDistributedCacheManager object. This is also the only time it's allowed to "clear" the cache, since there are for certain no tasks running at initialization time that depend on those. Ah, I was under the impression that distribute cache files are purged EXPLICITLY when a TT reinitializes, but it looks like we do a full delete on TaskTracker's local files during re-init. So, the code in your patch will work for now, but it won't be in future when we might need explicit purge of distributed cache files when they are owned by users themselves ( HADOOP-4493 ). We will change it then, +1 for the current changes. JobClient.java What happens to the code in here? Should this be refactored too/ are we going to do something about it? Good question I do think that JobClient should be using some methods in the filecache package, instead of hard-coding a lot of logic. That said, I chose to stop somewhere to avoid making this patch even harder to review. I think it's ripe for a future JIRA. Okay. I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways. I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff. +1. So are you planning to do in the next version or in this patch itself? I don't use Pipes typically, and it doesn't seem to compile on my Mac. I'll try it on a Linux machine, but if it's easy/handy for you, it'd be great to verify that bug. Will do so. TestDistributedCacheManager * Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager. I extracted the method and moved it to TrackerDistributedCacheManager. Now that that has "Tracker" in the name, it's a little misplaced, but it's not too bad. I think that's agreeable for now. We can move it to a more appropriate place in future if we find one. Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. May be another JIRA issue if you feel so. I'd prefer a separate JIRA. Fine. +1.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Looking at your latest patch now.

          Show
          Vinod Kumar Vavilapalli added a comment - Looking at your latest patch now.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          The changes look good overall, barring

          • a few unused imports in JobClient.java, TaskRunner.java and TaskDistributedCacheManager (may or may not have introduced by your patch)
          • misspelt taskDstributedCacheManager in LocalJobRunner
          Show
          Vinod Kumar Vavilapalli added a comment - The changes look good overall, barring a few unused imports in JobClient.java, TaskRunner.java and TaskDistributedCacheManager (may or may not have introduced by your patch) misspelt taskDstributedCacheManager in LocalJobRunner
          Hide
          Philip Zeyliger added a comment -

          > I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff.

          +1. So are you planning to do in the next version or in this patch itself?

          Next version.

          Show
          Philip Zeyliger added a comment - > I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff. +1. So are you planning to do in the next version or in this patch itself? Next version.
          Hide
          Philip Zeyliger added a comment -

          Didn't see unused imports in JobClient.java. There are some deprecated imports, but not related to this patch. I've addressed your other two comments (other unused imports and misspelling of private variable).

          Will upload new patch after a bit more compiling and testing.

          Show
          Philip Zeyliger added a comment - Didn't see unused imports in JobClient.java. There are some deprecated imports, but not related to this patch. I've addressed your other two comments (other unused imports and misspelling of private variable). Will upload new patch after a bit more compiling and testing.
          Hide
          Philip Zeyliger added a comment -

          Latest patch.

          Thanks Vinod for the review!

          Show
          Philip Zeyliger added a comment - Latest patch. Thanks Vinod for the review!
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v5-requires-MR711.patch [ 12415496 ]
          Hide
          Vinod Kumar Vavilapalli added a comment -

          MAPREDUCE-842 broke this patch a bit.

          Philip, I am taking liberty to sync this patch to the latest trunk. You can verify my changes by taking diff of this patch with your last patch. The changes are mostly restricted to TaskRunner.java whose changes clashed with those in MAPREDUCE-842. Besides that, I've corrected some import statements and also added one of your fast tests to the list of tests that can be run via 10 minutes target "ant run-commit-test".

          Show
          Vinod Kumar Vavilapalli added a comment - MAPREDUCE-842 broke this patch a bit. Philip, I am taking liberty to sync this patch to the latest trunk. You can verify my changes by taking diff of this patch with your last patch. The changes are mostly restricted to TaskRunner.java whose changes clashed with those in MAPREDUCE-842 . Besides that, I've corrected some import statements and also added one of your fast tests to the list of tests that can be run via 10 minutes target "ant run-commit-test".
          Vinod Kumar Vavilapalli made changes -
          Attachment MAPREDUCE-476-20090814.1.txt [ 12416555 ]
          Hide
          Philip Zeyliger added a comment -

          Vinod,

          Thanks for updating the patch! Do you have an update to MAPREDUCE-711 that has the package move? I am trying to apply MAPREDUCE-711-20090709-mapreduce.1.txt and MAPREDUCE-476-20090814.1.txt to trunk, and I think there's a mismatch in the new filecache package name.

          – Philip

          Show
          Philip Zeyliger added a comment - Vinod, Thanks for updating the patch! Do you have an update to MAPREDUCE-711 that has the package move? I am trying to apply MAPREDUCE-711 -20090709-mapreduce.1.txt and MAPREDUCE-476 -20090814.1.txt to trunk, and I think there's a mismatch in the new filecache package name. – Philip
          Hide
          Vinod Kumar Vavilapalli added a comment -

          MAPREDUCE-711 is done on mapreduce side. So, no more blockers for this issue.

          Not sure what was wrong with the earlier patch. In any case, it was broken again by changes from MAPREDUCE-478. Attached is the updated patch.

          Can you please check the latest patch and see if anything is missing or if you need any more changes? If you are fine with it, can you ask someone to commit this, perhaps Tom? Already this patch is broken by a couple of patches, so I think the earlier this gets checked in, the better.

          Show
          Vinod Kumar Vavilapalli added a comment - MAPREDUCE-711 is done on mapreduce side. So, no more blockers for this issue. Not sure what was wrong with the earlier patch. In any case, it was broken again by changes from MAPREDUCE-478 . Attached is the updated patch. Can you please check the latest patch and see if anything is missing or if you need any more changes? If you are fine with it, can you ask someone to commit this, perhaps Tom? Already this patch is broken by a couple of patches, so I think the earlier this gets checked in, the better.
          Vinod Kumar Vavilapalli made changes -
          Attachment MAPREDUCE-476-20090818.txt [ 12416836 ]
          Vinod Kumar Vavilapalli made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12416836/MAPREDUCE-476-20090818.txt
          against trunk revision 805324.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12416836/MAPREDUCE-476-20090818.txt against trunk revision 805324. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/console This message is automatically generated.
          Philip Zeyliger made changes -
          Attachment v6-to-v7.patch [ 12417332 ]
          Hide
          Philip Zeyliger added a comment -

          Fixing the test failures, or so I hope.

          The "workDir" handling in LocalJobRunner was confused, so I've fixed that. In general, the symlinking stuff is a bit of a complex mess: you can symlink things when you do "foo#bar", and also, if you set the config variable, everything gets symlinked into the local working dir. This is used by streaming. It seems like the latter layer should be done by streaming on its own, but, alas, it may be too late to do that.

          Note that I've attached a v6-vs-v7.patch file to make it easier to see what the latest changes were.

          – Philip

          Show
          Philip Zeyliger added a comment - Fixing the test failures, or so I hope. The "workDir" handling in LocalJobRunner was confused, so I've fixed that. In general, the symlinking stuff is a bit of a complex mess: you can symlink things when you do "foo#bar", and also, if you set the config variable, everything gets symlinked into the local working dir. This is used by streaming. It seems like the latter layer should be done by streaming on its own, but, alas, it may be too late to do that. Note that I've attached a v6-vs-v7.patch file to make it easier to see what the latest changes were. – Philip
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v7.patch [ 12417333 ]
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417333/MAPREDUCE-476-v7.patch
          against trunk revision 806764.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 3 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417333/MAPREDUCE-476-v7.patch against trunk revision 806764. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 3 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/console This message is automatically generated.
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Philip Zeyliger added a comment -

          The test failure was spurious (org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount is failing elsewhere also, and has nothing to do with this patch).

          But the FindBugs errors were reasonable. New patch fixes the three reported, and pasted in below.

          Bad practice Warnings
          Code Warning
          DP org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager.makeClassLoader(ClassLoader) creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block

          Bug type DP_CREATE_CLASSLOADER_INSIDE_DO_PRIVILEGED (click for details)
          In class org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager
          In method org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager.makeClassLoader(ClassLoader)
          In class java.net.URLClassLoader
          At TaskDistributedCacheManager.java:[line 235]
          RV org.apache.hadoop.mapred.TaskRunner.setupWorkDir(JobConf, File) ignores exceptional return value of java.io.File.mkdir()

          Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details)
          In class org.apache.hadoop.mapred.TaskRunner
          In method org.apache.hadoop.mapred.TaskRunner.setupWorkDir(JobConf, File)
          Called method java.io.File.mkdir()
          At TaskRunner.java:[line 630]
          Performance Warnings
          Code Warning
          UrF Unread field: org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager$CacheFile.localClassPath

          Show
          Philip Zeyliger added a comment - The test failure was spurious (org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount is failing elsewhere also, and has nothing to do with this patch). But the FindBugs errors were reasonable. New patch fixes the three reported, and pasted in below. Bad practice Warnings Code Warning DP org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager.makeClassLoader(ClassLoader) creates a java.net.URLClassLoader classloader, which should be performed within a doPrivileged block Bug type DP_CREATE_CLASSLOADER_INSIDE_DO_PRIVILEGED (click for details) In class org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager In method org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager.makeClassLoader(ClassLoader) In class java.net.URLClassLoader At TaskDistributedCacheManager.java: [line 235] RV org.apache.hadoop.mapred.TaskRunner.setupWorkDir(JobConf, File) ignores exceptional return value of java.io.File.mkdir() Bug type RV_RETURN_VALUE_IGNORED_BAD_PRACTICE (click for details) In class org.apache.hadoop.mapred.TaskRunner In method org.apache.hadoop.mapred.TaskRunner.setupWorkDir(JobConf, File) Called method java.io.File.mkdir() At TaskRunner.java: [line 630] Performance Warnings Code Warning UrF Unread field: org.apache.hadoop.mapreduce.filecache.TaskDistributedCacheManager$CacheFile.localClassPath
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v8.patch [ 12417438 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417438/MAPREDUCE-476-v8.patch
          against trunk revision 807064.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417438/MAPREDUCE-476-v8.patch against trunk revision 807064. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/console This message is automatically generated.
          Hide
          Tom White added a comment -

          Sorry Philip, but I've just noticed that the testFileSystemOtherThanDefault() test from TestDistributedCache (introduced in HADOOP-5635) got missed during the move to TestTrackerDistributedCacheManager.

          Show
          Tom White added a comment - Sorry Philip, but I've just noticed that the testFileSystemOtherThanDefault() test from TestDistributedCache (introduced in HADOOP-5635 ) got missed during the move to TestTrackerDistributedCacheManager.
          Hide
          Philip Zeyliger added a comment -

          Well-spotted, Tom. I've restored the missing test.

          Show
          Philip Zeyliger added a comment - Well-spotted, Tom. I've restored the missing test.
          Philip Zeyliger made changes -
          Attachment MAPREDUCE-476-v9.patch [ 12417497 ]
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12417497/MAPREDUCE-476-v9.patch
          against trunk revision 807165.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 14 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12417497/MAPREDUCE-476-v9.patch against trunk revision 807165. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/console This message is automatically generated.
          Hide
          Philip Zeyliger added a comment -

          Failing test is "org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount". I think that's failing all-over, not just here.

          – Philip

          Show
          Philip Zeyliger added a comment - Failing test is "org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount". I think that's failing all-over, not just here. – Philip
          Hide
          Tom White added a comment -

          I've just committed this. Thanks for your persistence on this one, Philip!

          Show
          Tom White added a comment - I've just committed this. Thanks for your persistence on this one, Philip!
          Tom White made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.21.0 [ 12314045 ]
          Resolution Fixed [ 1 ]
          Hide
          Ravi Gummadi added a comment -

          Attaching patch for Y! 20 distribution. Not for commit here.

          Show
          Ravi Gummadi added a comment - Attaching patch for Y! 20 distribution. Not for commit here.
          Ravi Gummadi made changes -
          Attachment 476.20S-2.patch [ 12430866 ]
          Vinod Kumar Vavilapalli made changes -
          Release Note Extended DistributedCache to work with LocalJobRunner.
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Philip Zeyliger
              Reporter:
              sam rash
            • Votes:
              2 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development