  1. Hadoop Map/Reduce
  2. MAPREDUCE-121

DistributedCache parses Paths with scheme or port components incorrectly

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: distributed-cache
    • Labels:
      None
    • Environment:

      Linux ("path.separator" is ":")
      HDFS filesystem (not "local")

      Description

      When paths that include a scheme or port (such as
      "hdfs://localhost:9000/deploy/hello") are passed to DistributedCache.addFileToClassPath, they are appended to the configuration option "mapred.job.classpath.files" using the "path.separator" system property as the delimiter, which is ":" on Linux.
      This confuses DistributedCache.getFileClassPaths: the same character separates the components within a single Path and separates whole paths from one another.
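
The collision described above can be reproduced without Hadoop. The following is a minimal sketch, not the DistributedCache implementation: the class name, the helper methods, and the second entry "/deploy/world" are hypothetical, and the sketch only assumes that entries are joined and split on ":" as the report states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SeparatorCollisionDemo {
    // Joins classpath entries with the given separator
    // (assumption: this mirrors how the option value is built).
    static String join(List<String> paths, String sep) {
        return String.join(sep, paths);
    }

    // Splits the stored value back into entries on the same separator.
    static List<String> split(String joined, String sep) {
        return Arrays.asList(joined.split(Pattern.quote(sep)));
    }

    public static void main(String[] args) {
        String sep = ":"; // value of "path.separator" on Linux
        // The second entry is a made-up example, not taken from the report.
        List<String> original = Arrays.asList(
                "hdfs://localhost:9000/deploy/hello", "/deploy/world");

        String joined = join(original, sep);
        List<String> parsed = split(joined, sep);

        System.out.println(joined);
        // Two entries went in, but four fragments come out:
        // "hdfs", "//localhost", "9000/deploy/hello", "/deploy/world"
        System.out.println(parsed);
    }
}
```

The fragment "9000/deploy/hello" matches the start of the incorrect path reported in this issue, which is consistent with the value being split on every ":".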

      Example:
      I have some jars and configuration files in the HDFS directory "/deploy". The following code adds them to the job's classpath:

      Test.java
            // "job" is the job's configuration object, created elsewhere.
            Path deployPath = new Path("/deploy");
            FileSystem fs = deployPath.getFileSystem(new Configuration());

            // listStatus returns fully qualified paths (scheme, host and port included).
            FileStatus[] jars = fs.listStatus(deployPath);
            for (int i = 0; i < jars.length; i++) {
              System.out.println(jars[i].getPath());
              DistributedCache.addFileToClassPath(jars[i].getPath(), job);
            }
      

      Launching the task gives the following stdout output:

      hdfs://localhost:9000/deploy/hello
      

      DistributedCache sets "mapred.job.classpath.files" to "hdfs://localhost:9000/deploy/hello".
      DistributedCache.getFileClassPaths then returns incorrect paths such as "9000/deploy/hello/home/gudok/Work/test/bin/../conf".

      For now, I've worked around this problem by submitting Paths without the scheme and port ("/deploy/hello").
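
One way to apply this workaround is to keep only the path component of each qualified location before adding it. The sketch below uses java.net.URI from the standard library; the class and method names are hypothetical. In Hadoop code the equivalent would presumably be `new Path(p.toUri().getPath())`, and this assumes the stripped path still resolves against the job's default filesystem.

```java
import java.net.URI;

public class StripSchemeDemo {
    // Returns only the path component, dropping the scheme, host and port.
    static String pathOnly(String qualified) {
        return URI.create(qualified).getPath();
    }

    public static void main(String[] args) {
        // A fully qualified location loses its "hdfs://localhost:9000" prefix.
        System.out.println(pathOnly("hdfs://localhost:9000/deploy/hello"));
        // A path that has no scheme or port is returned unchanged.
        System.out.println(pathOnly("/deploy/hello"));
    }
}
```

The resulting string contains no ":" and so cannot be split apart by the separator.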

      Other DistributedCache methods need to be reviewed too.

            People

            • Assignee: Unassigned
            • Reporter: Andrei Gudkov (gudok)
            • Votes: 1
            • Watchers: 2