  1. Hadoop Map/Reduce
  2. MAPREDUCE-121

DistributedCache parses Paths with scheme or port components incorrectly


Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • distributed-cache
    • None
    • linux ("path.separator" is ":")
      hdfs filesystem (not "local")

    Description

      When paths that include a scheme or port (such as "hdfs://localhost:9000/deploy/hello") are passed to DistributedCache.addFileToClassPath, they are appended to the configuration option "mapred.job.classpath.files" using the "path.separator" system property as the delimiter, which is ":" on Linux.
      This confuses DistributedCache.getFileClassPaths: the same character separates the components within a single Path and delimits whole paths from one another.
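
      The ambiguity can be reproduced without Hadoop. The sketch below is not the DistributedCache implementation; it only assumes, as described above, that entries are joined with ":" and later split on the same character. The class name, the method names, and the second entry ("/home/user/conf") are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class ClasspathSeparatorDemo {
    // Assumption: on Linux, System.getProperty("path.separator") is ":".
    static final String SEPARATOR = ":";

    // Joins classpath entries into one separator-delimited string.
    static String join(List<String> entries) {
        return String.join(SEPARATOR, entries);
    }

    // Splits the stored string back into entries on the same separator.
    static List<String> split(String joined) {
        return Arrays.asList(joined.split(SEPARATOR));
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList(
                "hdfs://localhost:9000/deploy/hello",
                "/home/user/conf");
        String joined = join(original);
        // Two entries go in, four fragments come out:
        // "hdfs", "//localhost", "9000/deploy/hello", "/home/user/conf"
        for (String part : split(joined)) {
            System.out.println(part);
        }
    }
}
```

      The fully qualified entry is broken at the ":" after the scheme and again at the ":" before the port, so the original paths cannot be recovered from the stored string.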

      Example:
      I have some jars and conf files in the HDFS directory "/deploy". The following code adds them to the job's classpath:

      Test.java
            // "job" is the job's configuration object, defined elsewhere.
            Path deployPath = new Path("/deploy");
            FileSystem fs = deployPath.getFileSystem(new Configuration());

            // Add every file found under /deploy to the job's classpath.
            FileStatus[] jars = fs.listStatus(deployPath);
            for (int i = 0; i < jars.length; i++) {
              System.out.println(jars[i].getPath());
              DistributedCache.addFileToClassPath(jars[i].getPath(), job);
            }
      

      Launching the task prints the following to stdout:

      hdfs://localhost:9000/deploy/hello
      

      DistributedCache then sets "mapred.job.classpath.files" to "hdfs://localhost:9000/deploy/hello",
      and DistributedCache.getFileClassPaths returns incorrect paths such as "9000/deploy/hello/home/gudok/Work/test/bin/../conf".

      For now, I have worked around the problem by submitting Paths without scheme and port ("/deploy/hello").
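
      A minimal sketch of that workaround, using only java.net.URI so it runs without Hadoop on the classpath. The class and method names are hypothetical. With Hadoop's Path, the equivalent step would presumably be new Path(path.toUri().getPath()); this relies on the job's default filesystem being the one the files actually live on.

```java
import java.net.URI;

public class StripSchemeDemo {
    // Returns only the path component, dropping scheme, host and port.
    static String pathOnly(String qualified) {
        return URI.create(qualified).getPath();
    }

    public static void main(String[] args) {
        // Prints "/deploy/hello", which contains no ":" to confuse the splitter.
        System.out.println(pathOnly("hdfs://localhost:9000/deploy/hello"));
    }
}
```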

      Other DistributedCache methods need to be reviewed too.


          People

            Assignee: Unassigned
            Reporter: Andrei Gudkov (gudok)
            Votes: 1
            Watchers: 2
