Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1641

Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: distributed-cache
    • Labels:
      None

      Description

      The behavior of mapred.cache.files and mapred.cache.archives is different during localization in the following way:

      If a jar file is added to mapred.cache.files, it will be localized under TaskTracker under a unique path.
      If a jar file is added to mapred.cache.archives, it will be localized under a unique path in a directory named the jar file name, and will be unarchived under the same directory.

      If same jar file is passed for both the configurations, the behavior undefined. Thus the job submission should fail.
      Currently, since distributed cache processes files before archives, the jar file will be just localized and not unarchived.

      1. patch-1641-ydist-bugfix.txt
        4 kB
        Amareshwari Sriramadasu
      2. mapreduce-1641--2010-05-21.patch
        21 kB
        Dick King
      3. mapreduce-1641--2010-05-19.patch
        21 kB
        Al Thompson
      4. mapreduce-1641--2010-04-27.patch
        21 kB
        Dick King
      5. duped-files-archives--off-0-20-101--2010-04-23--1819.patch
        26 kB
        Dick King
      6. duped-files-archives--off-0-20-101--2010-04-21.patch
        21 kB
        Dick King
      7. BZ-3539321--off-0-20-101--2010-04-20.patch
        11 kB
        Dick King

        Issue Links

          Activity

          Hide
          Allen Wittenauer added a comment -

          Likely fixed.

          Show
          Allen Wittenauer added a comment - Likely fixed.
          Hide
          Amareshwari Sriramadasu added a comment -

          Sorry for picking up the patch late. Can you update the patch to trunk?

          Show
          Amareshwari Sriramadasu added a comment - Sorry for picking up the patch late. Can you update the patch to trunk?
          Hide
          Amareshwari Sriramadasu added a comment -

          latest patch looks fine.

          Show
          Amareshwari Sriramadasu added a comment - latest patch looks fine.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12445204/mapreduce-1641--2010-05-21.patch
          against trunk revision 947112.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12445204/mapreduce-1641--2010-05-21.patch against trunk revision 947112. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/201/console This message is automatically generated.
          Hide
          Dick King added a comment -

          This is a new patch to accommodate trunk changes in locations of defined literals.

          Show
          Dick King added a comment - This is a new patch to accommodate trunk changes in locations of defined literals.
          Hide
          Amareshwari Sriramadasu added a comment -

          The new patch does not apply to trunk. I tried to resolve conflicts and apply. Then, it does not compile also.
          Can you please update the patch to trunk?

          Show
          Amareshwari Sriramadasu added a comment - The new patch does not apply to trunk. I tried to resolve conflicts and apply. Then, it does not compile also. Can you please update the patch to trunk?
          Hide
          Al Thompson added a comment -

          Minor edits made to the patch in an effort to improve readability.

          Show
          Al Thompson added a comment - Minor edits made to the patch in an effort to improve readability.
          Hide
          Amareshwari Sriramadasu added a comment -

          Should mapred.cache.libjars be checked too?

          Dick, there is no such configuration. files passed in -libjars are added mapreduce.job.classpath.files to and mapreduce.job.cache.files.

          The patch needs to be updated to trunk. so canceling.

          Show
          Amareshwari Sriramadasu added a comment - Should mapred.cache.libjars be checked too? Dick, there is no such configuration. files passed in -libjars are added mapreduce.job.classpath.files to and mapreduce.job.cache.files. The patch needs to be updated to trunk. so canceling.
          Hide
          Dick King added a comment -

          Should mapred.cache.libjars be checked too?

          Show
          Dick King added a comment - Should mapred.cache.libjars be checked too?
          Hide
          Amareshwari Sriramadasu added a comment -

          +1 patch for trunk looks good.

          Show
          Amareshwari Sriramadasu added a comment - +1 patch for trunk looks good.
          Hide
          Dick King added a comment -

          I checked out a trunk from Apache's git repository shortly before I checked in this patch. It, too, fails TestJobACLs.testACLS .

          Show
          Dick King added a comment - I checked out a trunk from Apache's git repository shortly before I checked in this patch. It, too, fails TestJobACLs.testACLS .
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442974/mapreduce-1641--2010-04-27.patch
          against trunk revision 938576.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442974/mapreduce-1641--2010-04-27.patch against trunk revision 938576. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/148/console This message is automatically generated.
          Hide
          Dick King added a comment -

          The code to ask TrackerDistributedCacheManager to copy dfs files to the local cache is in JobSubmitter in trunk. This patch is otherwise similar to the previous patches [including the amendment].

          Show
          Dick King added a comment - The code to ask TrackerDistributedCacheManager to copy dfs files to the local cache is in JobSubmitter in trunk . This patch is otherwise similar to the previous patches [including the amendment] .
          Hide
          Amareshwari Sriramadasu added a comment -

          Patch for yahoo! distribution fixing the code in JobClient.

          Show
          Amareshwari Sriramadasu added a comment - Patch for yahoo! distribution fixing the code in JobClient.
          Hide
          Amareshwari Sriramadasu added a comment -

          The following code change in JobClient does not look correct

          @@ -767,6 +766,9 @@ public class JobClient extends Configured implements MRConstants, Tool  {
                          (new Path("file:///" +  binaryTokenFilename), jobCopy);
                     }
          
          +          // First we check whether the cached archives and files are legal.
          +          TrackerDistributedCacheManager.validate(jobCopy);
          +
                     copyAndConfigureFiles(jobCopy, submitJobDir);
          

          copyAndConfigureFiles adds files/archives given for command line options: -files, -archives, -libjars. So, the patch does not validate these files. Validate should happen after the call to copyAndConfigureFiles.
          A test with same file added for -files and -archives option would fail with the patch.

          Show
          Amareshwari Sriramadasu added a comment - The following code change in JobClient does not look correct @@ -767,6 +766,9 @@ public class JobClient extends Configured implements MRConstants, Tool { ( new Path( "file: ///" + binaryTokenFilename), jobCopy); } + // First we check whether the cached archives and files are legal. + TrackerDistributedCacheManager.validate(jobCopy); + copyAndConfigureFiles(jobCopy, submitJobDir); copyAndConfigureFiles adds files/archives given for command line options: -files, -archives, -libjars. So, the patch does not validate these files. Validate should happen after the call to copyAndConfigureFiles. A test with same file added for -files and -archives option would fail with the patch.
          Hide
          Dick King added a comment -

          This patch applies to 0.20.101, not to trunk, and needs to be forward ported. However, it is licensed.

          Show
          Dick King added a comment - This patch applies to 0.20.101, not to trunk, and needs to be forward ported. However, it is licensed.
          Hide
          Dick King added a comment -

          This is a replacement patch for an internal release; it may not apply to trunk . However, it is ASF licensed.

          Show
          Dick King added a comment - This is a replacement patch for an internal release; it may not apply to trunk . However, it is ASF licensed.
          Hide
          Amareshwari Sriramadasu added a comment -

          Dick, though DistributedCache.validateCachesDontOverlap is a public api, it is not a user facing api. So, I think we should move it to TrackerDistributedCacheManager, where other public apis (which framework uses) like determineTimeStamps, determineCacheVisibilities are present. We should also remove the comment saying "It is a public api" and add "it is used by framework only".

          Show
          Amareshwari Sriramadasu added a comment - Dick, though DistributedCache.validateCachesDontOverlap is a public api, it is not a user facing api. So, I think we should move it to TrackerDistributedCacheManager, where other public apis (which framework uses) like determineTimeStamps, determineCacheVisibilities are present. We should also remove the comment saying "It is a public api" and add "it is used by framework only".
          Hide
          Arun C Murthy added a comment -

          Dick, some comments:

          1. I'd rename DistributedCache.validateCachesDontOverlap to DistributedCache.validate to keep it simple, in future we may add more validations there, so having one public api might prove useful and succint.
          2. In the same function, and elsewhere, I'd define a public static final String for 'mapred.cache.archives' and 'mapred.cache.files' and use them instead of the raw strings.
          3. Request you to add a test-case which uses the Mini* clusters to ensure the validation works with fully-qualified hdfs uris (hdfs://nn:8000/user/blah/foo), and absolute Paths (/user/blah/foo).
          Show
          Arun C Murthy added a comment - Dick, some comments: I'd rename DistributedCache.validateCachesDontOverlap to DistributedCache.validate to keep it simple, in future we may add more validations there, so having one public api might prove useful and succint. In the same function, and elsewhere, I'd define a public static final String for 'mapred.cache.archives' and 'mapred.cache.files' and use them instead of the raw strings. Request you to add a test-case which uses the Mini* clusters to ensure the validation works with fully-qualified hdfs uris (hdfs://nn:8000/user/blah/foo), and absolute Paths (/user/blah/foo).
          Hide
          Dick King added a comment -

          This is a patch for an internal release; it may not apply to trunk . However, it is ASF licensed.

          Show
          Dick King added a comment - This is a patch for an internal release; it may not apply to trunk . However, it is ASF licensed.
          Hide
          Dick King added a comment -

          The subject of this jira is that in one case a user specified that a particular archive be both in mapred.cache.archives and mapred.cache.files . This duplication was not forseen in the code, which happened to do something unpleasant. While the particular instance that triggered the bug report was accidental, it occurs to me that you could want that – especially with a .jar file.

          I see the point that if we silently explode a .jar file as well as copying it locally [and using it as a {{-classpath}}, presumably] we could be exposed to pain – not in terms of the code working, because the exploded copy wouldn't ever be accessed, but in terms of cleanup performance.

          For the time being, I'll just error out as described on 16/Apr/10 08:12 PM .

          Show
          Dick King added a comment - The subject of this jira is that in one case a user specified that a particular archive be both in mapred.cache.archives and mapred.cache.files . This duplication was not forseen in the code, which happened to do something unpleasant. While the particular instance that triggered the bug report was accidental, it occurs to me that you could want that – especially with a .jar file. I see the point that if we silently explode a .jar file as well as copying it locally [and using it as a {{-classpath}}, presumably] we could be exposed to pain – not in terms of the code working, because the exploded copy wouldn't ever be accessed, but in terms of cleanup performance. For the time being, I'll just error out as described on 16/Apr/10 08:12 PM .
          Hide
          Amareshwari Sriramadasu added a comment -

          Perhaps we should allow this, and both localize the file and unarchive it? What do you think?

          We should not make the file option to unarchive the file. We have seen many use cases where users do not want their jars to be unjarred, for example HADOOP-5175

          We perform the check for conflicts between mapred.cache.files and mapred.cache.archives when the user finally submits the offending JobConf .

          +1

          In particular, I plan to make a new class DistributedCache.DuplicatedURI extends InvalidJobConfException and throw that .

          +1

          Show
          Amareshwari Sriramadasu added a comment - Perhaps we should allow this, and both localize the file and unarchive it? What do you think? We should not make the file option to unarchive the file. We have seen many use cases where users do not want their jars to be unjarred, for example HADOOP-5175 We perform the check for conflicts between mapred.cache.files and mapred.cache.archives when the user finally submits the offending JobConf . +1 In particular, I plan to make a new class DistributedCache.DuplicatedURI extends InvalidJobConfException and throw that . +1
          Hide
          Dick King added a comment -

          We will not allow this file duplication as proposed in 16/Apr/10 11:41AM.

          However, we will not throw an IllegalArgumentException . We will throw an InvalidJobconfException instead. A consequence of this is that the check cannot be performed as you add individual files or blocks of files to the cache; the interface is wrong. We perform the check for conflicts between mapred.cache.files and mapred.cache.archives when the user finally submits the offending JobConf .

          In particular, I plan to make a new class DistributedCache.DuplicatedURI extends InvalidJobConfException and throw that .

          Show
          Dick King added a comment - We will not allow this file duplication as proposed in 16/Apr/10 11:41AM. However, we will not throw an IllegalArgumentException . We will throw an InvalidJobconfException instead. A consequence of this is that the check cannot be performed as you add individual files or blocks of files to the cache; the interface is wrong. We perform the check for conflicts between mapred.cache.files and mapred.cache.archives when the user finally submits the offending JobConf . In particular, I plan to make a new class DistributedCache.DuplicatedURI extends InvalidJobConfException and throw that .
          Hide
          Dick King added a comment -

          Perhaps we should allow this, and both localize the file and unarchive it? What do you think?

          Show
          Dick King added a comment - Perhaps we should allow this, and both localize the file and unarchive it? What do you think?
          Hide
          Dick King added a comment -

          I think I should throw an InvalidArgumentException when a particular file is added to the second of mapred.cache.files and mapred.cache.archives . I will have to check that it doesn't get caught and ignored somewhere in the stack. Does anyone out there have a comment on this plan?

          Show
          Dick King added a comment - I think I should throw an InvalidArgumentException when a particular file is added to the second of mapred.cache.files and mapred.cache.archives . I will have to check that it doesn't get caught and ignored somewhere in the stack. Does anyone out there have a comment on this plan?

            People

            • Assignee:
              Dick King
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development