Hadoop Map/Reduce / MAPREDUCE-6639

Process hangs in LocatedFileStatusFetcher if FileSystem.get throws

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: mrv2
    • Labels:
      None

      Description

      LocatedFileStatusFetcher uses a thread pool, but one of the Callable thread functions, ProcessInitialInputPathCallable, doesn't catch exceptions (the callbacks do). When an exception is thrown, the thread exits without signaling the error to the calling thread, which continues waiting to be signaled. This can happen when a FileSystem implementation cannot be found.
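
The failure mode described above can be illustrated with a minimal sketch. It is not the Hadoop source: the class SignalOnFailureSketch and all of its members are hypothetical, and the only assumption is a caller that blocks on a Condition until a pool thread signals it. The wrapper catches any Throwable from the task and signals the caller either way, so a failing task cannot leave the caller waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical names throughout; this is an illustration, not the Hadoop code.
class SignalOnFailureSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition done = lock.newCondition();
  private boolean finished = false;
  private Throwable error = null;

  // Wraps a task so that both success and failure wake the waiting caller.
  private Callable<Void> signalling(Callable<Void> task) {
    return () -> {
      Throwable failure = null;
      try {
        task.call();
      } catch (Throwable t) {
        failure = t; // without this catch, the caller would never be signaled
      }
      lock.lock();
      try {
        error = failure;
        finished = true;
        done.signalAll();
      } finally {
        lock.unlock();
      }
      return null;
    };
  }

  // Runs the task on a pool thread and waits for its signal.
  // Returns the task's failure, or null if it succeeded.
  Throwable runAndWait(Callable<Void> task, long timeoutMillis) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    lock.lock();
    try {
      pool.submit(signalling(task));
      long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
      while (!finished) {
        if (nanos <= 0L) {
          throw new IllegalStateException("timed out waiting for the task");
        }
        nanos = done.awaitNanos(nanos); // releases the lock while waiting
      }
      return error;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } finally {
      lock.unlock();
      pool.shutdownNow();
    }
  }
}
```

Calling runAndWait with a task that throws returns that exception to the caller instead of blocking. The timeout is an extra safety net added for this sketch and is not something the issue describes.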

        Activity

        rdblue Ryan Blue added a comment -

        Attaching a fix. This is slightly different from what I suggested above. Because a result needs to be returned for the future, I've added an unknownError field to the results that the futures check. If there is an error, the futures call registerError.
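
A rough sketch of the approach described in this comment, assuming nothing about the actual patch beyond the names unknownError and registerError. The Result type, the resolve and process methods, and the use of CompletableFuture from the standard library are all illustrative choices. The task never throws. Instead it folds any failure into its result, and the completion callback inspects that field.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration; only unknownError and registerError are names from the comment.
class ErrorInResultSketch {
  // Hypothetical result type: either a value or the error that prevented it.
  static final class Result {
    final String value;
    final Throwable unknownError;

    Result(String value, Throwable unknownError) {
      this.value = value;
      this.unknownError = unknownError;
    }
  }

  private final AtomicReference<Throwable> registered = new AtomicReference<>();

  // Records the first reported error and ignores later ones.
  void registerError(Throwable t) {
    registered.compareAndSet(null, t);
  }

  Throwable registeredError() {
    return registered.get();
  }

  // The task body: failures are captured in the Result rather than thrown.
  static Result resolve(String path) {
    try {
      if (!path.startsWith("file:")) {
        // Stand-in for a file system lookup that fails.
        throw new IOException("No FileSystem for path: " + path);
      }
      return new Result(path, null);
    } catch (Throwable t) {
      return new Result(null, t);
    }
  }

  // Runs the task on a pool thread; the callback checks the Result for an error.
  void process(String path) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CompletableFuture.supplyAsync(() -> resolve(path), pool)
          .thenAccept(r -> {
            if (r.unknownError != null) {
              registerError(r.unknownError);
            }
          })
          .join();
    } finally {
      pool.shutdown();
    }
  }
}
```

Because a result is always returned, the future completes normally and the callback is the single place where errors are reported.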

        rdblue Ryan Blue added a comment -

        Adding a better patch. The problem was in the error handling. The calling thread waits for the operation to succeed or until signaled that unknownError is set. When an error is passed to signal that thread, there's a check for whether another error has already been set, but the check is wrong. It only sets the error if another error is already set, rather than if no other error has been set. The result is that if there is an error, the caller is never signaled and waits indefinitely for all of the tasks to complete successfully.
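
The inverted check can be sketched as follows. This is reconstructed from the description in this comment rather than copied from the source, so the class name and the lock and condition fields are assumptions. Only unknownError and registerError are names taken from the thread.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative reconstruction; not the actual LocatedFileStatusFetcher code.
class RegisterErrorSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition condition = lock.newCondition();
  private Throwable unknownError;

  // Buggy variant: the assignment is reachable only when an error is
  // already set, so the field stays null and the caller is never signaled.
  void registerErrorBuggy(Throwable t) {
    lock.lock();
    try {
      if (unknownError != null) {
        unknownError = t;
        condition.signal();
      }
    } finally {
      lock.unlock();
    }
  }

  // Fixed variant: record the first error and wake the waiting caller.
  void registerErrorFixed(Throwable t) {
    lock.lock();
    try {
      if (unknownError == null) {
        unknownError = t;
        condition.signal();
      }
    } finally {
      lock.unlock();
    }
  }

  Throwable getUnknownError() {
    lock.lock();
    try {
      return unknownError;
    } finally {
      lock.unlock();
    }
  }
}
```

With the buggy variant, getUnknownError() stays null no matter how many errors are reported. With the fixed variant, it holds the first error reported and later ones are ignored.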

        rdblue Ryan Blue added a comment -

        Robert Kanter, can you take a look at this?

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 16s   Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        -1    test4tests  0m 0s    The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1    mvninstall  8m 31s   trunk passed
        +1    compile     0m 30s   trunk passed with JDK v1.8.0_77
        +1    compile     0m 29s   trunk passed with JDK v1.7.0_95
        +1    checkstyle  0m 20s   trunk passed
        +1    mvnsite     0m 37s   trunk passed
        +1    mvneclipse  0m 16s   trunk passed
        +1    findbugs    1m 16s   trunk passed
        +1    javadoc     0m 30s   trunk passed with JDK v1.8.0_77
        +1    javadoc     0m 30s   trunk passed with JDK v1.7.0_95
        +1    mvninstall  0m 30s   the patch passed
        +1    compile     0m 28s   the patch passed with JDK v1.8.0_77
        +1    javac       0m 28s   the patch passed
        +1    compile     0m 27s   the patch passed with JDK v1.7.0_95
        +1    javac       0m 27s   the patch passed
        +1    checkstyle  0m 18s   the patch passed
        +1    mvnsite     0m 35s   the patch passed
        +1    mvneclipse  0m 14s   the patch passed
        +1    whitespace  0m 0s    Patch has no whitespace issues.
        +1    findbugs    1m 29s   the patch passed
        +1    javadoc     0m 27s   the patch passed with JDK v1.8.0_77
        +1    javadoc     0m 28s   the patch passed with JDK v1.7.0_95
        +1    unit        2m 41s   hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_77.
        +1    unit        2m 46s   hadoop-mapreduce-client-core in the patch passed with JDK v1.7.0_95.
        +1    asflicense  0m 21s   Patch does not generate ASF License warnings.
                          25m 14s



        Subsystem                   Report/Notes
        Docker Image                yetus/hadoop:fbe3e86
        JIRA Patch URL              https://issues.apache.org/jira/secure/attachment/12799855/MAPREDUCE-6639.2.patch
        JIRA Issue                  MAPREDUCE-6639
        Optional Tests              asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname                       Linux b3f4bb5c3ea6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool                  maven
        Personality                 /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision                trunk / 63ac2db
        Default Java                1.7.0_95
        Multi-JDK versions          /usr/lib/jvm/java-8-oracle:1.8.0_77 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs                    v3.0.0
        JDK v1.7.0_95 Test Results  https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6451/testReport/
        modules                     C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
        Console output              https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6451/console
        Powered by                  Apache Yetus 0.2.0 http://yetus.apache.org

        This message was automatically generated.

        stevel@apache.org Steve Loughran added a comment -

        I concur that it is a bug and that the patch fixes it. Although there's no test for it (one would be possible, but tricky), the IDE highlights that the field unknownError is only assigned inside a conditional clause that requires the field to be set already. That is, the code which sets the field can only be reached if the field is set. Accordingly, the field can never be set. This patch fixes that.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #9753 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9753/)
        MAPREDUCE-6639 Process hangs in LocatedFileStatusFetcher if (stevel: rev 7eddecd357014d4793df4bf2e5d987add02289f5)

        • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LocatedFileStatusFetcher.java

          People

          • Assignee: Ryan Blue (rdblue)
          • Reporter: Ryan Blue (rdblue)
          • Votes: 0
          • Watchers: 5
