Hadoop Map/Reduce / MAPREDUCE-6684

High contention on scanning of user directory under immediate_done in Job History Server

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: jobhistoryserver
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      HistoryFileManager.scanIntermediateDirectory() in the JHS acquires a lock on each user directory it tries to scan (moving or deleting files under the user directory as necessary). This method is called by a thread in JobHistory that performs periodic scanning of the intermediate directory, and it can also be called by web server threads for each Web API call made by a JHS client. When there are many concurrent Web API calls/connections to the JHS, all but one thread are blocked on the lock for the user directory. Eventually the client connections time out, but the threads in the JHS are not killed, leaving a lot of TCP connections in the CLOSE_WAIT state.

      [systest@vb1120 ~]$ sudo netstat -nap | grep 63729 | sort -k 4
      tcp        0      0 10.17.202.19:10020          0.0.0.0:*                   LISTEN      63729/java          
      tcp        0      0 10.17.202.19:10020          10.17.198.30:33010          ESTABLISHED 63729/java          
      tcp        0      0 10.17.202.19:10020          10.17.200.30:33980          ESTABLISHED 63729/java          
      tcp        0      0 10.17.202.19:10020          10.17.202.10:59625          ESTABLISHED 63729/java          
      tcp        0      0 10.17.202.19:10020          10.17.202.13:35765          ESTABLISHED 63729/java          
      tcp        0      0 10.17.202.19:10033          0.0.0.0:*                   LISTEN      63729/java          
      tcp        0      0 10.17.202.19:19888          0.0.0.0:*                   LISTEN      63729/java          
      tcp        0      0 10.17.202.19:19888          10.17.198.30:35103          ESTABLISHED 63729/java          
      tcp      277      0 10.17.202.19:19888          10.17.198.30:43670          ESTABLISHED 63729/java          
      tcp        0      0 10.17.202.19:19888          10.17.198.30:45453          ESTABLISHED 63729/java          
      tcp      277      0 10.17.202.19:19888          10.17.198.30:49184          ESTABLISHED 63729/java          
      tcp        1      0 10.17.202.19:19888          10.17.202.13:49992          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52703          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52707          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52708          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52710          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52714          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52723          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52726          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52727          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52739          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52749          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52753          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52757          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52760          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52820          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52827          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52829          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52831          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52833          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52836          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52839          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52841          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52843          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52850          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52860          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52876          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52879          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52881          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52884          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52886          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52888          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52891          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52893          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52896          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52898          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52899          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52902          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52909          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52910          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52912          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52923          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52925          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52927          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52930          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52937          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52939          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52945          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52947          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52969          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:52972          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:52975          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53004          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53007          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53009          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53011          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53052          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53058          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53059          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53063          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53071          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53084          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53093          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53095          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53097          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53101          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53104          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53106          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53108          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53110          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53112          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53114          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53115          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53117          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53121          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53123          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53125          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53127          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53129          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53131          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53134          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53138          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53140          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53153          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53155          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53157          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53159          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53173          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53176          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53177          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53178          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53179          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53181          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53183          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53201          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53204          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53218          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53267          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53270          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53275          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53278          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53280          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53283          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53293          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53296          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53299          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53309          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53312          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53314          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53317          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53320          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53322          CLOSE_WAIT  63729/java          
      tcp      256      0 10.17.202.19:19888          10.17.202.13:53338          CLOSE_WAIT  63729/java          
      tcp      261      0 10.17.202.19:19888          10.17.202.13:53340          CLOSE_WAIT  63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53364          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53366          ESTABLISHED 63729/java          
      tcp      260      0 10.17.202.19:19888          10.17.202.13:53367          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53380          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53382          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53386          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53390          ESTABLISHED 63729/java          
      tcp      255      0 10.17.202.19:19888          10.17.202.13:53392          ESTABLISHED 63729/java          
      tcp     1278      0 10.17.202.19:19888          10.17.202.18:45301          CLOSE_WAIT  63729/java          
      tcp     1278      0 10.17.202.19:19888          10.17.202.18:45303          CLOSE_WAIT  63729/java          
      tcp     1277      0 10.17.202.19:19888          10.17.202.18:45306          ESTABLISHED 63729/java 
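
      A simplified, hypothetical sketch of the locking pattern described above (illustrative names only, not the actual HistoryFileManager code): the periodic scan thread and every web request thread funnel through one monitor per user directory, so a single slow holder backs up all the rest.

      // Simplified, hypothetical sketch of the contention pattern; names are
      // illustrative and do not match the actual HistoryFileManager code.
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      class UserDirScanSketch {
        // one lock object per user directory under the intermediate-done dir
        private final Map<String, Object> userDirLocks = new ConcurrentHashMap<>();

        void scanIntermediateDirectory(String user) {
          Object lock = userDirLocks.computeIfAbsent(user, u -> new Object());
          // The periodic JobHistory scan thread and every web request thread
          // all take this monitor; while one thread holds it (for example while
          // a slow jhist parse is in progress downstream), the rest block here.
          synchronized (lock) {
            // list the user's directory, then move or delete files as necessary
          }
        }
      }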
      
      Attachments

      1. jhs-jstacks-service-monitor-running.tar.gz
        408 kB
        Haibo Chen
      2. jhs-jstacks-service-monitor-stopped.tar.gz
        371 kB
        Haibo Chen
      3. mapreduce6684.001.patch
        18 kB
        Haibo Chen
      4. mapreduce6684.002.patch
        18 kB
        Haibo Chen
      5. mapreduce6684.003.patch
        18 kB
        Haibo Chen

        Issue Links

          Activity

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9739 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9739/)
          MAPREDUCE-6684. High contention on scanning of user directory under (jlowe: rev 5ffb54694b52657f3b7de4560474ab740734e1b2)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestUnnecessaryBlockingOnHistoryFileInfo.java
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
          Jason Lowe added a comment -

          Thanks, Haibo Chen! I committed this to trunk, branch-2, and branch-2.8.

          Jason Lowe added a comment -

          +1 lgtm. Committing this.

          Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 13s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 57s trunk passed
          +1 compile 0m 15s trunk passed with JDK v1.8.0_91
          +1 compile 0m 17s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 13s trunk passed
          +1 mvnsite 0m 22s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 0m 33s trunk passed
          +1 javadoc 0m 14s trunk passed with JDK v1.8.0_91
          +1 javadoc 0m 15s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 18s the patch passed
          +1 compile 0m 13s the patch passed with JDK v1.8.0_91
          +1 javac 0m 13s the patch passed
          +1 compile 0m 15s the patch passed with JDK v1.7.0_95
          +1 javac 0m 15s the patch passed
          +1 checkstyle 0m 12s the patch passed
          +1 mvnsite 0m 22s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 43s the patch passed
          +1 javadoc 0m 12s the patch passed with JDK v1.8.0_91
          +1 javadoc 0m 13s the patch passed with JDK v1.7.0_95
          +1 unit 5m 50s hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91.
          +1 unit 6m 9s hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95.
          +1 asflicense 0m 19s Patch does not generate ASF License warnings.
          25m 26s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803034/mapreduce6684.003.patch
          JIRA Issue MAPREDUCE-6684
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 0dc6564cc107 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 411fb4b
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6488/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6488/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Haibo Chen added a comment -

          Updated the patch to fix checkstyle issues. Not sure why Jenkins did not catch these in the first run.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 8m 39s trunk passed
          +1 compile 0m 21s trunk passed with JDK v1.8.0_91
          +1 compile 0m 20s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 40s trunk passed
          +1 javadoc 0m 17s trunk passed with JDK v1.8.0_91
          +1 javadoc 0m 19s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 19s the patch passed with JDK v1.8.0_91
          +1 javac 0m 19s the patch passed
          +1 compile 0m 19s the patch passed with JDK v1.7.0_95
          +1 javac 0m 19s the patch passed
          -1 checkstyle 0m 15s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 10 new + 16 unchanged - 0 fixed = 26 total (was 16)
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 13s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 52s the patch passed
          +1 javadoc 0m 15s the patch passed with JDK v1.8.0_91
          +1 javadoc 0m 16s the patch passed with JDK v1.7.0_95
          +1 unit 6m 48s hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91.
          +1 unit 6m 51s hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95.
          +1 asflicense 0m 22s Patch does not generate ASF License warnings.
          30m 22s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12803018/mapreduce6684.002.patch
          JIRA Issue MAPREDUCE-6684
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 9a2156e55760 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 411fb4b
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6486/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6486/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6486/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Haibo Chen added a comment -

          Thanks a lot for your review, Jason! Patch updated according to your comments.

          Jason Lowe added a comment -

          Thanks for the patch, Haibo! Looks good to me other than one nit: createMove2DoneThreadPool is inconsistent with the rest of the code. We should be consistent with 'moveToDone' vs. 'move2Done'.

          Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 10s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 1s trunk passed
          +1 compile 0m 17s trunk passed with JDK v1.8.0_91
          +1 compile 0m 17s trunk passed with JDK v1.7.0_95
          +1 checkstyle 0m 14s trunk passed
          +1 mvnsite 0m 23s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 0m 33s trunk passed
          +1 javadoc 0m 14s trunk passed with JDK v1.8.0_91
          +1 javadoc 0m 15s trunk passed with JDK v1.7.0_95
          +1 mvninstall 0m 18s the patch passed
          +1 compile 0m 15s the patch passed with JDK v1.8.0_91
          +1 javac 0m 15s the patch passed
          +1 compile 0m 16s the patch passed with JDK v1.7.0_95
          +1 javac 0m 16s the patch passed
          +1 checkstyle 0m 11s the patch passed
          +1 mvnsite 0m 21s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 0m 41s the patch passed
          +1 javadoc 0m 10s the patch passed with JDK v1.8.0_91
          +1 javadoc 0m 12s the patch passed with JDK v1.7.0_95
          +1 unit 5m 52s hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91.
          +1 unit 6m 9s hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95.
          +1 asflicense 0m 19s Patch does not generate ASF License warnings.
          25m 28s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:cf2ee45
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12801818/mapreduce6684.001.patch
          JIRA Issue MAPREDUCE-6684
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 4ccdab9d0c8d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9e8411d
          Default Java 1.7.0_95
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
          findbugs v3.0.0
          JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6479/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6479/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Haibo Chen added a comment -

          Thanks a lot for all your comments. Uploading a patch based on the solution suggested by Jason. We can come back to this discussion if further improvement of HistoryFileManager turns out to be needed.

          Karthik Kambatla added a comment -

          "If ATSv2 is going to completely replace the JHS"

          While the need for writing out and copying/parsing jhist files might be replaced by accesses to ATS, I would think the JHS itself would stick around to provide the MR-specific UI it does today. Not sure if ATS would be able to expose the right abstractions for each framework to plug in the appropriate UI.

          "Because it is complex with lots of race conditions to work through I worry we can introduce new subtle bugs when it's rewritten."

          "But I think a lot of the races are inherent in the problem, some users are trying to read data and get a consistent view of it while at the same time the data is moving from one place to another."

          Agree with you both. Given it is mostly stable barring the occasional inefficiency, we should probably just leave it alone. If we run into more issues around this, maybe we could revisit cleaning it up.

          In terms of potential cleanups:

          1. When fetching information regarding one job, we scan the entire intermediate directory, potentially triggering other moves. Instead, it would be nice if it looked only for that specific job. And, if that very file is being moved, wait for the move to complete. That way, we don't have to think about a read thread accessing files corresponding to other jobs.
          2. HistoryFileManager is rather long - moving out some of the inner classes might make for cleaner abstractions and more explicit locking where needed, for readability.
          3. I keep thinking about doing the move in a separate class/thread with some way for the read threads to communicate with this move thread, but that might just result in more complicated code. I will have to poke around to figure out which way is simpler; I don't think I can get to it anytime soon though.
          Robert Joseph Evans added a comment -

          I agree with Jason. The code is complex and there are a lot of races in there. Making the code simpler and still solving the problem would be great. But I think a lot of the races are inherent in the problem, some users are trying to read data and get a consistent view of it while at the same time the data is moving from one place to another. In many cases making something have less lock contention inherently makes the code more complex. Although I am happy to be proven wrong so if you have some specific suggestions on what and how to clean it up that would be helpful in understanding if it is worth the risk.

          Jason Lowe added a comment -

          Yes, the move logic is not ideal, and it would be nice if it were cleaner. Because it is complex with lots of race conditions to work through I worry we can introduce new subtle bugs when it's rewritten. If ATSv2 is going to completely replace the JHS in a reasonable timeframe then I'm thinking it's probably not worth the rewrite. However if the JHS is going to live for quite a while longer then it could be a worthwhile effort if done carefully.

          Karthik Kambatla added a comment -

          Jason Lowe - thanks for the suggestion, looks like that should do the trick and is definitely a lot simpler and less risky than the options proposed earlier.

          Robert Joseph Evans - thanks for pointing out the load on NN, we hadn't considered that either.

          Unrelated, the code surrounding moving the files is pretty complex. Do you think there is any value in cleaning it up at all? The code seems to work fine. And, if we are going to make any changes, I guess we would probably just want to use ATS v2 when that is ready.

          Jason Lowe added a comment -

          We need to get to the bottom of why the lock is being held so long. The move itself is done via an executor pool so that shouldn't be slow.

          According to the jstacks, every single one has this stacktrace for the thread that has the lock on the user's intermediate directory:

             java.lang.Thread.State: BLOCKED (on object monitor)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.didMoveFail(HistoryFileManager.java:356)
                  - waiting to lock <0x000000056106dd80> (a org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanIntermediateDirectory(HistoryFileManager.java:873)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.access$200(HistoryFileManager.java:82)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$UserLogDir.scanIfNeeded(HistoryFileManager.java:315)
                  - locked <0x00000004fae96ee0> (a org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$UserLogDir)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanIntermediateDirectory(HistoryFileManager.java:839)
                  at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getFileInfo(HistoryFileManager.java:978)
                  at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.loadJob(CachedHistoryStorage.java:174)
          [...]
          

          So the thread has the lock, but it's waiting a long time on some other thread which isn't ideal. Every jstack shows the other thread doing this:

          "473394034@qtp-1617102129-245" daemon prio=10 tid=0x00007f66c8185000 nid=0x6437 runnable [0x00007f6693b6c000]
             java.lang.Thread.State: RUNNABLE
                at java.util.HashMap.put(HashMap.java:494)
                at org.apache.avro.util.WeakIdentityHashMap.put(WeakIdentityHashMap.java:108)
                at org.apache.avro.LogicalTypes.fromSchemaIgnoreInvalid(LogicalTypes.java:62)
                at org.apache.avro.Schema.parse(Schema.java:1318)
                at org.apache.avro.Schema.parse(Schema.java:1260)
                at org.apache.avro.Schema.parse(Schema.java:1331)
                at org.apache.avro.Schema.parse(Schema.java:1260)
                at org.apache.avro.Schema$Parser.parse(Schema.java:1024)
                at org.apache.avro.Schema$Parser.parse(Schema.java:1012)
                at org.apache.avro.Schema.parse(Schema.java:1064)
                at org.apache.hadoop.mapreduce.jobhistory.EventReader.<init>(EventReader.java:71)
                at org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.parse(JobHistoryParser.java:139)
                - locked <0x00000005a05fd4e8> (a org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser)
                at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.loadFullHistoryData(CompletedJob.java:339)
                - locked <0x0000000577bd9b98> (a org.apache.hadoop.mapreduce.v2.hs.CompletedJob)
                at org.apache.hadoop.mapreduce.v2.hs.CompletedJob.<init>(CompletedJob.java:101)
                at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo.loadJob(HistoryFileManager.java:451)
                - locked <0x000000056106dd80> (a org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo)
          [...]
          

          So I don't think any of the above three proposals will fix the issue completely. The core issue is the scan being held up for an arbitrarily long jhist parse. I suspect things will get a lot better if we can decouple these two processes. I haven't gone through all the race conditions, but I'm wondering if we can simply make the state field volatile and allow the didMoveFail and isMovePending methods to avoid grabbing the lock.
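
          A rough, illustrative sketch of that idea (not the committed patch; the method names follow the ones mentioned above, everything else is assumed): publish the move state through a volatile field so the cheap state checks no longer take the HistoryFileInfo monitor.

          class HistoryFileInfoSketch {
            enum State { IN_INTERMEDIATE, IN_DONE, MOVE_FAILED, DELETED }

            // volatile so state checks can read the latest value without
            // taking the HistoryFileInfo monitor
            private volatile State state = State.IN_INTERMEDIATE;

            boolean isMovePending() {
              // lock-free: the intermediate-directory scan is no longer held
              // up behind a long-running loadJob()/jhist parse
              return state == State.IN_INTERMEDIATE || state == State.MOVE_FAILED;
            }

            boolean didMoveFail() {
              return state == State.MOVE_FAILED;
            }

            synchronized void moveToDone() {
              // the move itself stays under the monitor; only the state reads
              // above avoid it
              // ... move the jhist/conf files to the done directory, then:
              state = State.IN_DONE;
            }
          }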

          Haibo Chen added a comment -

          Thanks a lot for your insight on why the intermediate directory is scanned before the done directory and on the potential NameNode issue, Robert Joseph Evans. That makes a lot of sense. Per offline discussion with Karthik, we'd like to propose three approaches.

          1. For web API requests for individual jobs, the intermediate directory is still scanned first, but inside scanIntermediateDir() we could add a check for the existence of the jhist files of the associated job, and only when the files do exist do we move files from the intermediate directory to the done directory. The assumption is that checking file existence is not expensive, and if the files do not exist in the intermediate directory, we only acquire the lock on the user directory for a short period of time.

          2. For web API requests for individual jobs, when the intermediate directory is scanned, check the existence of the job's files, and only move the files of the job associated with the request from the intermediate directory to the done directory. This reduces the time for which each job web request thread blocks, but may have much lower overall throughput than the previous approach, where file moving is done in batches.

          3. Have a dedicated thread scan the intermediate directory and have other threads wait on a monitor associated with a particular job. When the dedicated thread finishes, the threads waiting on the monitors will be notified. By having a single writer, the contention on the user directory lock can be reduced. But it does have the problem of conflicting with clients' expectations, as Robert Joseph Evans pointed out in his previous comment.

          Can you please share some of your thoughts on them, Robert Joseph Evans, Jason Lowe?
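
          A rough sketch of the existence check in proposal 1 above; the class, method, and file-name pattern are assumptions for illustration, not the final patch.

          import java.io.IOException;
          import org.apache.hadoop.fs.FileStatus;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          class IntermediateExistenceCheckSketch {
            private final FileSystem fs;

            IntermediateExistenceCheckSketch(FileSystem fs) {
              this.fs = fs;
            }

            // cheap probe: only if the job's jhist file is actually present in
            // the user's intermediate directory do we take the user-dir lock
            // and move files
            boolean hasIntermediateFiles(Path userDir, String jobId) throws IOException {
              FileStatus[] matches = fs.globStatus(new Path(userDir, jobId + "*.jhist"));
              return matches != null && matches.length > 0;
            }
          }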

          Robert Joseph Evans added a comment -

          It has been a while since I did anything with this (2013 from the edit logs). As such I don't remember too much about the code, so feel free to take my advice or leave it, as I could be wrong. Now that I'm finished with all the disclaimers, I believe the reason we scan the intermediate directory first is to not drop any files that might be in the middle of being moved from intermediate_done to done. So if you look in the older directory first you might miss some files that are in the middle of being moved, and then when you scan intermediate_done you wouldn't see those files because they have already been moved. So if you think there is a high probability that you will find the files you want in the done dir, then you would need to

          1. scan the done dir
          2. if not found, scan intermediate_done (because something might have been in flight)
          3. if still not found, scan the done dir again

          From our experience the majority of the hits to the history server for a job/application happen right after the application finishes. So at least for our usage pattern this would not help much, and would probably add a lot more load to the namenode, something that we are much more concerned about.

          Having the scanner on a background thread would help, but you would need synchronization that is a bit more complex than just waking the thread up and then waiting for a notification that it was done. From the client's perspective, if a history file was placed in the intermediate_done dir and they then made a call to the history server, it had better pick up that file and return a result. So we need to be sure that a waiting thread is only woken up with a "nothing was found" answer if the scan that produced it started after the request was made.
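
          One way to encode that ordering guarantee is to track scan generations: a request remembers how many scans had started when it arrived and only accepts "not found" once a scan that began later has completed. A minimal sketch under those assumptions (names are illustrative, not taken from any patch):

    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    class ScanEpoch {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition scanFinished = lock.newCondition();
        private long scansStarted = 0;    // bumped when a scan begins
        private long scansCompleted = 0;  // start-number of the latest finished scan

        /** Background scanner calls this around each pass over the intermediate dir. */
        void runOneScan(Runnable actualScan) {
            long started;
            lock.lock();
            try {
                started = ++scansStarted;
            } finally {
                lock.unlock();
            }
            actualScan.run();             // do the real directory scan outside the lock
            lock.lock();
            try {
                scansCompleted = Math.max(scansCompleted, started);
                scanFinished.signalAll();
            } finally {
                lock.unlock();
            }
        }

        /** Request thread calls this before it is allowed to answer "not found". */
        void awaitScanNewerThanNow() throws InterruptedException {
            lock.lock();
            try {
                long requestEpoch = scansStarted;  // any scan begun after this point is fresh
                while (scansCompleted <= requestEpoch) {
                    scanFinished.await();
                }
            } finally {
                lock.unlock();
            }
        }
    }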

          kasha Karthik Kambatla added a comment -

          Is it okay to scan the done directory first as Haibo suggested?

          In addition to that, I wonder if there should be a single thread (with sleeps) moving files from intermediate to done. Other threads needing a particular job's jhist file could wake this moving thread and register for updates for that file. Once a file is moved, the moving thread could notify these other waiting threads. That way, we ensure serving threads wait only as long as required.
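
          A rough sketch of that registration idea, assuming a hypothetical registry shared by the mover thread and the serving threads (none of these names come from the actual code):

    import java.util.Map;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;

    class JobMoveRegistry {
        private final Map<String, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();

        /** Serving thread: register interest in one job and get a handle to wait on. */
        CompletableFuture<Void> register(String jobId) {
            return pending.computeIfAbsent(jobId, id -> new CompletableFuture<>());
        }

        /** Mover thread: call once a job's jhist file has been moved to the done dir. */
        void jobMoved(String jobId) {
            CompletableFuture<Void> f = pending.remove(jobId);
            if (f != null) {
                f.complete(null);          // wakes only the threads waiting on this job
            }
        }
    }

          A serving thread could then do something like registry.register(jobId).get(30, TimeUnit.SECONDS) and fall back to a full lookup on timeout, so it waits only for its own job's move rather than for an entire directory scan.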

          Jason Lowe, Robert Joseph Evans - what do you think?

          kasha Karthik Kambatla added a comment -

          I was looking into this along with Haibo Chen. The two tarballs of jstacks attached here correspond to runs with a monitoring process turned on and off; the monitoring process has threads connecting to the JHS and asking for details about a job. We have seen a few problems:

          1. JHS continues to process a request (scan directories and fetch file) even after the requesting connection has terminated.
          2. As Haibo mentioned, each thread scans the intermediate directory (and moves any available files to the done dir). This scan is protected by a lock on the intermediate directory. When there are multiple threads, all threads but one are blocked. A couple of problems with this:
            1. Threads that are trying to load a file that is already in the done_dir also seem to be blocked on scanning the intermediate dir.
            2. Even though the thread holding the lock might process files required by other threads, those threads redo the scan (potentially moving other files they don't need), leading to a longer turnaround time per thread and maxing out the number of threads (default of 256).
          haibochen Haibo Chen added a comment -

          Most web API calls invoke HistoryFileManager.getFileInfo(JobId), which looks up the job history files in the intermediate directory before the done directory. If, in most cases, the jobs are already in the done directory, we could scan the done directory first and then the intermediate directory, and thereby reduce the contention on the intermediate user directories.

          haibochen Haibo Chen added a comment -

          Attaching the jstack logs


            People

            • Assignee:
              haibochen Haibo Chen
            • Reporter:
              haibochen Haibo Chen
            • Votes:
              0
            • Watchers:
              9
