Hadoop Common / HADOOP-4717

Removal of default port# in NameNode.getUri() causes a map/reduce job to fail to promote temporary output

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.18.3
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The problem reported here is that when the default port number (8020) is specified in the output path, the job succeeds but no output is created. The cause is that a "listStatus" call drops the port number, because NameNode.getUri removes the default port#.

      Assume that a map/reduce output directory is set to "hdfs://localhost:8020/out". A "listStatus" call on any of its subdirectories, for example "hdfs://localhost:8020/out/tempXX", returns results like the one below:

      hdfs://localhost/out/tempXX/part-00005

      Because of this, the following code in Task.java (lines 574-575)

      private Path getFinalPath(Path jobOutputDir, Path taskOutput) {
        URI relativePath = taskOutputPath.toUri().relativize(taskOutput.toUri());

      does not compute the correct relativePath, because taskOutputPath contains the port but taskOutput does not.
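
The mismatch can be reproduced with java.net.URI alone. The sketch below is illustrative and is not the Task.java code: the class and method names are hypothetical, and it relies only on the documented behavior of URI.relativize, which returns the given URI unchanged when the scheme or authority of the two URIs differ.

```java
import java.net.URI;

// Hypothetical demo class; not part of Hadoop.
class RelativizeDemo {
    /** Relativizes child against base using only java.net.URI. */
    static URI relativize(String base, String child) {
        return URI.create(base).relativize(URI.create(child));
    }

    public static void main(String[] args) {
        // Same authority on both sides: the result is a relative URI.
        System.out.println(relativize(
            "hdfs://localhost:8020/out/tempXX",
            "hdfs://localhost:8020/out/tempXX/part-00005"));

        // The child lost its port, so the authorities differ
        // ("localhost:8020" vs "localhost") and the child comes back unchanged.
        System.out.println(relativize(
            "hdfs://localhost:8020/out/tempXX",
            "hdfs://localhost/out/tempXX/part-00005"));
    }
}
```

The first call prints part-00005; the second prints the full hdfs://localhost/out/tempXX/part-00005 URI, so any code that treats the result as a relative path ends up with the wrong value.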

      It seems to me that the problem could be fixed if Path.makeQualified() returned the same path regardless of whether the input path contains the default port.

      1. relativePath1.patch
        4 kB
        Hairong Kuang
      2. relativePath.patch
        4 kB
        Hairong Kuang
      3. HADOOP-4717.patch
        1 kB
        Doug Cutting

          Activity

          Hairong Kuang created issue -
          Doug Cutting added a comment -

          It seems to me that the output directory should somewhere be normalized by calling FileSystem#makeQualified() on it, so that it's of the form "hdfs://localhost/out/".

          Hairong Kuang added a comment -

          I wrote a test. It showed that FileSystem#makeQualified() does not remove the default port# when the input path contains it.

          Doug Cutting added a comment -

          Here's a patch that changes DistributedFileSystem#makeQualified() to remove the default port if it's specified. Does that fix things for you?
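
As a rough illustration of the normalization being proposed, the sketch below drops the port from a URI when it equals the default. It is not the attached patch and does not use the Hadoop API: the class name, the method stripDefaultPort, and the hard-coded DEFAULT_PORT constant (8020, taken from the description) are assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; not part of Hadoop.
class DefaultPortNormalizer {
    // Assumed default NameNode port, as stated in the issue description.
    static final int DEFAULT_PORT = 8020;

    /** Returns the URI without its port if the port equals DEFAULT_PORT. */
    static URI stripDefaultPort(URI uri) {
        if (uri.getPort() != DEFAULT_PORT) {
            return uri; // no port, or a non-default port: leave untouched
        }
        try {
            // A port of -1 means "undefined", so it is omitted from the result.
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), -1,
                           uri.getPath(), uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI: " + uri, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripDefaultPort(URI.create("hdfs://localhost:8020/out")));
        System.out.println(stripDefaultPort(URI.create("hdfs://localhost:9000/out")));
    }
}
```

With both forms of the output directory reduced to hdfs://localhost/out, paths with and without the default port share one authority and can be relativized against each other.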

          Doug Cutting made changes -
          Field Original Value New Value
          Attachment HADOOP-4717.patch [ 12394705 ]
          Nigel Daley made changes -
          Priority Major [ 3 ] -> Blocker [ 1 ]
          Nigel Daley made changes -
          Assignee Hairong Kuang [ hairong ]
          Hairong Kuang made changes -
          Link This issue is blocked by HADOOP-4746 [ HADOOP-4746 ]
          Hairong Kuang added a comment -

          Yes, it works as long as we also fix HADOOP-4746. Doug, could you please include a junit test?

          Hairong Kuang made changes -
          Assignee Hairong Kuang [ hairong ] -> Doug Cutting [ cutting ]
          Koji Noguchi added a comment -

          I understand that HADOOP-4717 & HADOOP-4746 would fix the problem, but can we throw an Exception when

          URI relativePath = taskOutputPath.toUri().relativize(taskOutput.toUri());

          doesn't return a relative path?
          If we hit a similar issue again, I would rather have the job fail
          than have it return 0 while silently deleting the output.

          Hairong Kuang added a comment -

          In addition to Doug's change, this patch
          1. throws an IOException if relativize fails, as Koji suggested;
          2. adds a unit test to make sure a map/reduce job with an output path containing no port works.
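
Item 1 can be pictured roughly as follows. This is a sketch under assumptions, not the code in relativePath.patch: the class name, method name, and exception message are hypothetical, and failure is detected by checking whether URI.relativize handed back the child URI unchanged.

```java
import java.io.IOException;
import java.net.URI;

// Hypothetical helper; not part of Hadoop.
class StrictRelativize {
    /** Relativizes child against base, failing loudly instead of returning child as-is. */
    static URI relativizeOrFail(URI base, URI child) throws IOException {
        URI relative = base.relativize(child);
        // URI.relativize returns the given URI when it cannot relativize it.
        if (relative.equals(child)) {
            throw new IOException("Cannot get the relative path: base = " + base
                                  + ", child = " + child);
        }
        return relative;
    }

    public static void main(String[] args) {
        URI base = URI.create("hdfs://localhost:8020/out/tempXX");
        try {
            System.out.println(relativizeOrFail(
                base, URI.create("hdfs://localhost:8020/out/tempXX/part-00005")));
            // The missing port makes this call throw.
            relativizeOrFail(base, URI.create("hdfs://localhost/out/tempXX/part-00005"));
        } catch (IOException e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

Failing here turns a silent loss of output into a visible job failure, which is the behavior Koji asked for.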

          Hairong Kuang made changes -
          Attachment relativePath.patch [ 12395242 ]
          Hairong Kuang added a comment -

          This new patch makes two changes to my newly added unit test:
          1. When the DFS cluster fails to start because of a server binding exception, log the error and skip the test;
          2. The map/reduce job has an output path that includes the default NameNode port#.

          Hairong Kuang made changes -
          Attachment relativePath1.patch [ 12395320 ]
          Doug Cutting added a comment -

          +1 This looks good to me.

          Hairong Kuang added a comment -

          Ant test-core succeeded:
          BUILD SUCCESSFUL
          Total time: 113 minutes 54 seconds

          Ant test-patch succeeded:
          [exec] +1 overall.

          [exec] +1 @author. The patch does not contain any @author tags.

          [exec] +1 tests included. The patch appears to include 3 new or modified tests.

          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.

          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.

          [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          Hairong Kuang added a comment -

          I've just committed this.

          Hairong Kuang made changes -
          Resolution Fixed [ 1 ]
          Hadoop Flags [Reviewed]
          Assignee Doug Cutting [ cutting ] -> Hairong Kuang [ hairong ]
          Status Open [ 1 ] -> Resolved [ 5 ]
          Hudson added a comment -

          Integrated in Hadoop-trunk #680 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/680/)

          Nigel Daley made changes -
          Status Resolved [ 5 ] -> Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Koji Noguchi made changes -
          Link This issue relates to MAPREDUCE-837 [ MAPREDUCE-837 ]
          Transition            Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open -> Resolved      11d 49m                 1                 Hairong Kuang    05/Dec/08 20:00
          Resolved -> Closed    56d 13m                 1                 Nigel Daley      30/Jan/09 20:14

            People

            • Assignee: Hairong Kuang
            • Reporter: Hairong Kuang
            • Votes: 0
            • Watchers: 1
