Hadoop Map/Reduce
MAPREDUCE-3583

ProcfsBasedProcessTree#constructProcessInfo() may throw NumberFormatException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 1.0.2, 0.23.2
    • Component/s: None
    • Labels: None
    • Environment:

      64-bit Linux:
      asf011.sp2.ygridcore.net
      Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux

    • Hadoop Flags: Reviewed

      Description

      HBase PreCommit builds have frequently been failing with a NumberFormatException.

      From https://builds.apache.org/job/PreCommit-HBASE-Build/553//testReport/org.apache.hadoop.hbase.mapreduce/TestHFileOutputFormat/testMRIncrementalLoad/:

      2011-12-20 01:44:01,180 WARN  [main] mapred.JobClient(784): No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
      java.lang.NumberFormatException: For input string: "18446743988060683582"
      	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
      	at java.lang.Long.parseLong(Long.java:422)
      	at java.lang.Long.parseLong(Long.java:468)
      	at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
      	at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
      	at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
      	at org.apache.hadoop.mapred.Task.initialize(Task.java:536)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
      	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:396)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
      	at org.apache.hadoop.mapred.Child.main(Child.java:249)
      

      From the Hadoop 0.20.205 source code, it looks like the ppid was 18446743988060683582, causing the NumberFormatException:

        // Set (name) (ppid) (pgrpId) (session) (utime) (stime) (vsize) (rss)
        pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
      

      You can find information on the OS at the beginning of https://builds.apache.org/job/PreCommit-HBASE-Build/553/console:

      asf011.sp2.ygridcore.net
      Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
      core file size          (blocks, -c) 0
      data seg size           (kbytes, -d) unlimited
      scheduling priority             (-e) 20
      file size               (blocks, -f) unlimited
      pending signals                 (-i) 16382
      max locked memory       (kbytes, -l) 64
      max memory size         (kbytes, -m) unlimited
      open files                      (-n) 60000
      pipe size            (512 bytes, -p) 8
      POSIX message queues     (bytes, -q) 819200
      real-time priority              (-r) 0
      stack size              (kbytes, -s) 8192
      cpu time               (seconds, -t) unlimited
      max user processes              (-u) 2048
      virtual memory          (kbytes, -v) unlimited
      file locks                      (-x) unlimited
      60000
      Running in Jenkins mode
      

      From Nicolas Sze:

      It looks like the ppid is a 64-bit unsigned integer, but a Java long is signed and so only works with positive integers below 2^63. In your case,

        2^64 > 18446743988060683582 > 2^63.

      Therefore, there is an NFE.
      
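      To make the overflow concrete, here is a minimal standalone sketch (mine, not from the Hadoop source). BigInteger is one JVM-version-independent way to handle such values; Java 8 later added Long.parseUnsignedLong for exactly this shape of input:

        import java.math.BigInteger;

        public class PpidOverflowDemo {
          public static void main(String[] args) {
            String ppid = "18446743988060683582"; // between 2^63 and 2^64 - 1

            try {
              Long.parseLong(ppid); // throws: value exceeds Long.MAX_VALUE (2^63 - 1)
            } catch (NumberFormatException e) {
              System.out.println("NFE: " + e.getMessage());
            }

            // BigInteger copes with arbitrarily large values on any JVM:
            System.out.println(new BigInteger(ppid)); // prints 18446743988060683582
          }
        }
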

      I propose changing allProcessInfo to a Map<String, ProcessInfo> so that we avoid parsing these large integers and no longer hit this problem.
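
      A minimal sketch of that direction (this ProcessInfo is an illustrative stand-in, not the actual Hadoop class):

        import java.util.HashMap;
        import java.util.Map;

        // Illustrative stand-in for Hadoop's ProcessInfo; the real class differs.
        class ProcessInfo {
          final String pid;  // kept as a String, so it is never parsed
          String ppid;       // likewise stored verbatim from /proc/<pid>/stat

          ProcessInfo(String pid) { this.pid = pid; }
        }

        public class ProcessTreeSketch {
          // Keyed by the pid String instead of an Integer/Long.
          private final Map<String, ProcessInfo> allProcessInfo =
              new HashMap<String, ProcessInfo>();

          void add(String pid, String ppid) {
            ProcessInfo info = new ProcessInfo(pid);
            info.ppid = ppid;
            allProcessInfo.put(pid, info);
          }

          // Parent lookup is a plain string-keyed get; no Long.parseLong anywhere.
          ProcessInfo parentOf(ProcessInfo info) {
            return allProcessInfo.get(info.ppid);
          }
        }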

      1. mapreduce-3583.txt
        8 kB
        ramkrishna.s.vasudevan
      2. mapreduce-3583-v2.txt
        9 kB
        Ted Yu
      3. mapreduce-3583-v3.txt
        9 kB
        Ted Yu
      4. mapreduce-3583-v4.txt
        9 kB
        Ted Yu
      5. mapreduce-3583-v5.txt
        9 kB
        Ted Yu
      6. mapreduce-3583-trunk.txt
        22 kB
        Ted Yu
      7. mapreduce-3583-trunk-v2.txt
        22 kB
        Ted Yu
      8. mapreduce-3583-trunk-v2.txt
        22 kB
        Ted Yu
      9. mapreduce-3583-trunk-v3.txt
        22 kB
        Ted Yu
      10. mapreduce-3583-trunk-v4.txt
        22 kB
        Ted Yu
      11. mapreduce-3583-trunk-v5.txt
        23 kB
        Ted Yu
      12. mapreduce-3583-trunk-v6.txt
        23 kB
        Ted Yu
      13. mapreduce-3583-trunk-v7.txt
        24 kB
        Ted Yu
      14. mapreduce-3583-v6.txt
        9 kB
        Ted Yu
      15. mapreduce-3583-v7.txt
        10 kB
        Ted Yu


          Activity

          Ted Yu added a comment -

          For trunk, the following files should be included in the patch:

          ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          ./hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          
          ramkrishna.s.vasudevan added a comment -

          Please provide your comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12508247/mapreduce-3583.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1489//console

          This message is automatically generated.

          ramkrishna.s.vasudevan added a comment -

          The patch is for 0.20.205. String is used to store pid and ppid.

          TestProcfsBasedProcessTree passes.
          test-core passes except for TestSaslRPC, whose failure shouldn't be related to the patch.

          Ted Yu added a comment -

          See https://issues.apache.org/jira/browse/HBASE-5064?focusedCommentId=13176830&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13176830 for one scenario where fixing this NumberFormatException would have made other exceptions easier to uncover.

          Evan Pollan added a comment -

          This became a critical blocking issue for me today. It is preventing distcp commands from completing successfully on two different CDH3 update 2 environments I'm using, meaning I cannot do any offline log processing/analytics.

          I think the above analysis of the failure is a bit off: it's not actually the pid that's blowing up the number parsing, it's one of the (presumed) longs. The code is extracting capture groups 7, 8, 10, and 11, parsing them as signed 64-bit longs, and interpreting them as utime, stime, vsize, and rss, respectively.

          Here's an example of the contents of a /proc/X/stat file on one of my affected systems, listed alongside how the man page describes each field (a sketch of producing such a listing follows the list):

          pid 1686
          comm (ssh)
          state S
          ppid 1685
          pgrp 1672
          session 1415
          tty_nr 34816
          tpgid 4884
          flags 4202496
          minflt 1922
          cminflt 0
          majflt 3
          cmajflt 0
          utime 67
          stime 82
          cutime 0
          cstime 0
          priority 20
          nice 0
          num_threads 1
          itrealvalue 0
          starttime 144184
          vsize 62341120
          rss 1120
          rsslim 18,446,744,073,709,500,000
          startcode 139,935,780,638,720
          endcode 139,935,781,007,452
          startstack 140,735,070,560,080
          kstkesp 140,735,070,553,640
          kstkeip 139,935,743,316,835
          signal 0
          blocked 0
          sigignore 4102
          sigcatch 134234113
          wchan 18,446,744,071,579,900,000
          nswap 0
          cnswap 0
          exit_signal 17
          processor 0
          rt_priority 0
          policy 0
          delayacct_blkio_ticks 2
          guest_time 0
          cguest_time 0

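          For reference, a small standalone sketch (mine, not Hadoop's) of producing such a listing by pairing /proc/self/stat tokens with the proc(5) field names (name table truncated here for brevity):

            import java.nio.file.Files;
            import java.nio.file.Paths;

            public class StatFields {
              // First fields from proc(5), in order; extend as needed.
              private static final String[] NAMES = {
                  "pid", "comm", "state", "ppid", "pgrp", "session", "tty_nr",
                  "tpgid", "flags", "minflt", "cminflt", "majflt", "cmajflt",
                  "utime", "stime"
              };

              public static void main(String[] args) throws Exception {
                String stat = new String(
                    Files.readAllBytes(Paths.get("/proc/self/stat"))).trim();
                // comm may contain spaces but is parenthesized, so split around it.
                int open = stat.indexOf('('), close = stat.lastIndexOf(')');
                System.out.println("pid " + stat.substring(0, open - 1));
                System.out.println("comm " + stat.substring(open, close + 1));
                String[] rest = stat.substring(close + 2).split(" ");
                for (int i = 0; i < rest.length && i + 2 < NAMES.length; i++) {
                  System.out.println(NAMES[i + 2] + " " + rest[i]);
                }
              }
            }
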
          As I said, I'm using cloudera CDH3U2, and the relevant regexp pattern used to capture /proc/X/stat fields is:

            private static final Pattern PROCFS_STAT_FILE_FORMAT = Pattern
                .compile("^([0-9-]+)\\s([^\\s]+)\\s[^\\s]\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+\\s){16}([0-9]+)(\\s[0-9-]+){16}");
          

          The parsing code is:

                // Set (name) (ppid) (pgrpId) (session) (vsize)
                pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
                    Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5)),
                    Long.parseLong(m.group(7)));
          

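          To see where the capture groups land, here is a standalone sketch (mine, not from the Hadoop source) that runs this pattern over the (cat) line from the /proc dump further down; group 7 is the vsize field that the code above hands to Long.parseLong:

            import java.util.regex.Matcher;
            import java.util.regex.Pattern;

            public class StatLineDemo {
              // Same pattern as the CDH3U2 code quoted above.
              private static final Pattern PROCFS_STAT_FILE_FORMAT = Pattern
                  .compile("^([0-9-]+)\\s([^\\s]+)\\s[^\\s]\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+\\s){16}([0-9]+)(\\s[0-9-]+){16}");

              public static void main(String[] args) {
                String stat = "3916 (cat) R 3302 3916 3302 34816 3916 4202496 248 "
                    + "0 0 0 0 0 0 0 20 0 1 0 26900 5636096 182 "
                    + "18446744073709551615 4194304 4247204 140736357334480 "
                    + "18446744073709551615 139795312997536 0 0 0 0 0 0 0 17 1 "
                    + "0 0 0 0 0 0";
                Matcher m = PROCFS_STAT_FILE_FORMAT.matcher(stat);
                if (m.find()) {
                  System.out.println("pid     = " + m.group(1)); // 3916
                  System.out.println("comm    = " + m.group(2)); // (cat)
                  System.out.println("ppid    = " + m.group(3)); // 3302
                  System.out.println("pgrpId  = " + m.group(4)); // 3916
                  System.out.println("session = " + m.group(5)); // 3302
                  System.out.println("vsize   = " + m.group(7)); // 5636096
                }
              }
            }
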
          The thing that's baffling me is that the value Long.parseLong is choking on is nowhere to be found in the contents of any /proc/X/stat file that exists while the job is running. For example:

          12/01/31 23:31:03 INFO tools.DistCp: sourcePathsCount=1
          12/01/31 23:31:03 INFO tools.DistCp: filesToCopyCount=1
          12/01/31 23:31:03 INFO tools.DistCp: bytesToCopyCount=122.0k
          12/01/31 23:31:03 INFO mapred.JobClient: Running job: job_201201312321_0002
          12/01/31 23:31:04 INFO mapred.JobClient:  map 0% reduce 0%
          12/01/31 23:31:08 INFO mapred.JobClient: Task Id : attempt_201201312321_0002_m_000002_0, Status : FAILED
          java.lang.NumberFormatException: For input string: "18446744073709551532"
          	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
          	at java.lang.Long.parseLong(Long.java:422)
          	at java.lang.Long.parseLong(Long.java:468)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413)
          	at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148)
          	at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401)
          	at org.apache.hadoop.mapred.Task.initialize(Task.java:532)
          	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:306)
          	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
          	at java.security.AccessController.doPrivileged(Native Method)
          	at javax.security.auth.Subject.doAs(Subject.java:396)
          	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
          	at org.apache.hadoop.mapred.Child.main(Child.java:264)
          

          Here's what the entire set of /proc/X/stat files looks like while this job is running (I'm looking at the /proc file system on the only task tracker/data node in the cluster). If Long.parseLong was going to fail, I assume it would choke on '18446744073709551615':

          10 (async/mgr) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          11 (xenwatch) S 2 0 0 0 -1 2149613888 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          120 (upstart-udev-br) S 1 119 119 0 -1 4202560 215 0 0 0 6 0 0 0 20 0 1 0 234 17444864 239 18446744073709551615 140724289748992 140724289787412 0 0 0 0 0 4097 81920 18446744073709551615 0 0 17 1 0 0 0 0 0
          122 (udevd) S 1 122 122 0 -1 4202816 636 23063 0 13 1 3 105 15 16 -4 1 0 235 17289216 164 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147221247 0 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          12 (xenbus) S 2 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          14 (migration/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 99 1 0 0 0
          15 (ksoftirqd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          16 (watchdog/1) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 99 1 0 0 0
          1719 (avahi-daemon) S 1 1718 1718 0 -1 4202816 446 0 0 0 2 0 0 0 20 0 1 0 5195 34873344 418 18446744073709551615 4194304 4307028 0 0 0 0 0 3674112 16903 18446744073709551615 0 0 17 0 0 0 0 0 0
          1720 (avahi-daemon) S 1719 1720 1720 0 -1 4202560 90 0 0 0 0 0 0 0 20 0 1 0 5195 34742272 143 18446744073709551615 4194304 4307028 0 0 0 0 0 3670016 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          17 (events/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          188 (udevd) S 122 122 122 0 -1 4202816 91 0 0 0 0 0 0 0 18 -2 1 0 241 17285120 160 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147196671 0 24576 18446744073709551615 0 0 17 2 0 0 0 0 0
          189 (udevd) S 122 122 122 0 -1 4202816 89 0 0 0 0 0 0 0 18 -2 1 0 241 17285120 159 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147196671 0 24576 18446744073709551615 0 0 17 3 0 0 0 0 0
          18 (migration/2) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 99 1 0 0 0
          19 (ksoftirqd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          1 (init) S 0 1 1 0 -1 4202752 5633 1150472 28 692 8 15 -5995191823955592639 -3228180212899177193 20 0 1 0 93 24281088 475 18446744073709551615 140133429972992 140133430091084 0 0 0 0 0 4096 536946211 18446744073709551615 0 0 0 0 0 0 0 0 0
          20 (watchdog/2) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 99 1 0 0 0
          21 (events/2) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          22 (migration/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 99 1 0 0 0
          23 (ksoftirqd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          24 (watchdog/3) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 99 1 0 0 0
          25 (events/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          26 (sync_supers) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          27 (bdi-default) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          28 (kintegrityd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          2928 (su) S 1 2772 2772 0 -1 4202752 853 0 1 0 1 0 0 0 20 0 1 0 11368 48869376 442 18446744073709551615 4194304 4224396 0 0 0 0 2147196671 1 16384 18446744073709551615 0 0 17 0 0 0 0 0 0
          2937 (java) S 2928 2772 2772 0 -1 4202496 32799 579 0 1 233 19 2 0 20 0 40 0 11383 1438593024 22308 18446744073709551615 1073741824 1073778416 0 0 0 0 0 1 16800974 18446744073709551615 0 0 17 3 0 0 0 0 0
          29 (kintegrityd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          2 (kthreadd) S 0 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 0 2 0 0 0 0 0
          30 (kintegrityd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          3106 (su) S 1 2772 2772 0 -1 4202752 853 0 0 0 1 0 0 0 20 0 1 0 12056 48869376 443 18446744073709551615 4194304 4224396 0 0 0 0 2147196671 1 16384 18446744073709551615 0 0 17 0 0 0 0 0 0
          3115 (java) S 3106 2772 2772 0 -1 4202496 45170 166286 0 1 451 18446744073709551522 1666 10 20 0 42 0 12058 1450819584 31461 18446744073709551615 1073741824 1073778416 0 0 0 0 0 1 16800974 18446744073709551615 0 0 17 3 0 0 0 0 0
          319 (dhclient3) S 1 319 319 0 -1 4202560 59 0 0 0 0 0 0 0 20 0 1 0 599 6713344 85 18446744073709551615 140272284925952 140272285354020 0 0 0 0 0 0 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          31 (kintegrityd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          3222 (sshd) S 474 3222 3222 0 -1 4202752 1152 26602 0 0 2 0 76 16 20 0 1 0 22284 83111936 879 18446744073709551615 139945989926912 139945990366548 0 0 0 0 0 4096 16387 18446744073709551615 0 0 17 0 0 0 0 0 0
          32 (kblockd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          3301 (sshd) S 3222 3222 3222 0 -1 4202816 295 0 0 0 1 0 0 0 20 0 1 0 22441 83111936 415 18446744073709551615 139945989926912 139945990366548 0 0 0 0 0 4096 65536 18446744073709551615 0 0 17 0 0 0 0 0 0
          3302 (bash) S 3301 3302 3302 34816 3826 4202496 8064 78426 1 2 3 12 291 31 20 0 1 0 22442 19914752 553 18446744073709551615 4194304 5087404 140734376532864 18446744073709551615 139668755529598 0 65536 3686404 1266761467 18446744071579111781 0 0 17 3 0 0 0 0 0
          33 (kblockd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          34 (kblockd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          35 (kblockd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          36 (kseriod) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          377 (flush-1:0) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          378 (flush-1:1) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          379 (flush-1:2) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          380 (flush-1:3) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          381 (flush-1:4) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          382 (flush-1:5) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          383 (flush-1:6) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          384 (flush-1:7) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          385 (flush-1:8) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          386 (flush-1:9) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          387 (flush-1:10) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          388 (flush-1:11) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          389 (flush-1:12) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          390 (flush-1:13) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          391 (flush-1:14) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          392 (flush-1:15) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          393 (flush-8:1) S 2 0 0 0 -1 2157973568 0 0 0 0 1 14 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          394 (flush-8:16) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          395 (flush-8:32) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          396 (flush-8:48) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          397 (flush-8:64) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          3 (migration/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 99 1 0 0 0
          41 (khungtaskd) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          43 (kswapd0) S 2 0 0 0 -1 2158233664 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          44 (aio/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          458 (rsyslogd) S 1 426 426 0 -1 4202816 386 0 1 0 1 2 0 0 20 0 4 0 851 133304320 395 18446744073709551615 4194304 4462780 0 0 0 0 0 16781830 85025 18446744073709551615 0 0 17 0 0 0 0 0 0
          45 (aio/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          46 (aio/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          474 (sshd) S 1 474 474 0 -1 4202816 229 63238 0 40 1 0 283 28 20 0 1 0 854 50442240 271 18446744073709551615 140147055022080 140147055461716 0 0 0 0 0 4096 81925 18446744073709551615 0 0 17 1 0 0 0 0 0
          475 (dbus-daemon) S 1 475 475 0 -1 4202816 368 54 0 0 2 0 0 0 20 0 1 0 855 24141824 342 18446744073709551615 140490573344768 140490573663756 0 0 0 0 0 4096 16385 18446744073709551615 0 0 17 0 0 0 0 0 0
          478 (kjournald) S 2 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 856 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          47 (aio/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          48 (jfsIO) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          49 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          4 (ksoftirqd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          503 (atd) S 1 503 503 0 -1 4202560 85 0 0 0 0 0 0 0 20 0 1 0 886 19337216 116 18446744073709551615 4194304 4210820 0 0 0 0 0 0 81923 18446744073709551615 0 0 17 0 0 0 0 0 0
          504 (cron) S 1 504 504 0 -1 4202560 257 0 0 0 1 0 0 0 20 0 1 0 886 21581824 254 18446744073709551615 4194304 4228572 0 0 0 0 0 0 65537 18446744073709551615 0 0 17 0 0 0 0 0 0
          50 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          51 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          52 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          53 (jfsSync) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          54 (xfs_mru_cache) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          558 (getty) S 1 558 558 1025 558 4202496 199 0 1 0 0 1 0 0 20 0 1 0 931 6225920 162 18446744073709551615 4194304 4210980 0 0 0 0 0 0 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          55 (xfslogd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          563 (console-kit-dae) S 1 475 475 0 -1 4202752 2810 8142 26 7 4611686018427387902 0 22 2 20 0 3 0 1288 327077888 1000 18446744073709551615 4194304 4326916 0 0 0 0 0 4096 66048 18446744073709551615 0 0 17 0 0 0 0 0 0
          56 (xfslogd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          57 (xfslogd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          58 (xfslogd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          59 (xfsdatad/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          5 (watchdog/0) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 99 1 0 0 0
          60 (xfsdatad/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          61 (xfsdatad/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          62 (xfsdatad/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          63 (xfsconvertd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          64 (xfsconvertd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          65 (xfsconvertd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          66 (xfsconvertd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          67 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          68 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          69 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          6 (events/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          70 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          71 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          72 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          73 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          74 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          75 (kslowd000) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 15 -5 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          76 (kslowd001) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 15 -5 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          77 (crypto/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          78 (crypto/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          79 (crypto/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          7 (cpuset) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          80 (crypto/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          83 (net_accel/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          84 (net_accel/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          85 (net_accel/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          86 (net_accel/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          87 (sfc_netfront/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          88 (sfc_netfront/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          89 (sfc_netfront/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          8 (khelper) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          90 (sfc_netfront/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0
          91 (kstriped) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0
          92 (kjournald) S 2 0 0 0 -1 2149613632 0 0 0 0 6 0 0 0 20 0 1 0 216 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0
          9 (netns) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0
          3916 (cat) R 3302 3916 3302 34816 3916 4202496 248 0 0 0 0 0 0 0 20 0 1 0 26900 5636096 182 18446744073709551615 4194304 4247204 140736357334480 18446744073709551615 139795312997536 0 0 0 0 0 0 0 17 1 0 0 0 0 0
          

          This is using the Ubuntu 10.04 64-bit AMI (us-east-1/ami-da0cf8b3), with the cluster created by whirr-0.6.0-incubating.

          Any ideas here? I'm dead in the water. I'm going to take a stab at using CDH3U3, which just came out, but since this defect isn't yet resolved, I'm not holding out much hope.

          Show
          Evan Pollan added a comment - This became a critical blocking issue for me today. This is preventing distcp commands from completing successfully on two different CDH3 update 2 environment's I'm using, meaning I cannot do any offline log processing/analytics. I think the above analysis of the failure is a bit off – it's not actually the pid that's blowing up the number parsing: it's one of the (presumed) longs. The code is extracting capture groups 7, 8, 10, and 11, parsing them as signed 64-bit longs, and interpreting them as utime, stime, vsize, and rss, respectively. Here's an example of the contents of a /proc/X/stat file on one of my affected systems, listed in conjunction with how the man page describes each field pid 1686 comm (ssh) state S ppid 1685 pgrp 1672 session 1415 tty_nr 34816 tpgid 4884 flags 4202496 minflt 1922 cminflt 0 majflt 3 cmajflt 0 utime 67 stime 82 cutime 0 cstime 0 priority 20 nice 0 num_threads 1 itrealvalue 0 starttime 144184 vsize 62341120 rss 1120 rsslim 18,446,744,073,709,500,000 startcode 139,935,780,638,720 endcode 139,935,781,007,452 startstack 140,735,070,560,080 kstkesp 140,735,070,553,640 kstkeip 139,935,743,316,835 signal 0 blocked 0 sigignore 4102 sigcatch 134234113 wchan 18,446,744,071,579,900,000 nswap 0 cnswap 0 exit_signal 17 processor 0 rt_priority 0 policy 0 delayacct_blkio_ticks 2 guest_time 0 cguest_time 0 As I said, I'm using cloudera CDH3U2, and the relevant regexp pattern used to capture /proc/X/stat fields is: private static final Pattern PROCFS_STAT_FILE_FORMAT = Pattern .compile( "^([0-9-]+)\\s([^\\s]+)\\s[^\\s]\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+\\s){16}([0-9]+)(\\s[0-9-]+){16}" ); The parsing code is: // Set ( name ) ( ppid ) ( pgrpId ) (session ) (vsize ) pinfo.updateProcessInfo(m.group(2), Integer .parseInt(m.group(3)), Integer .parseInt(m.group(4)), Integer .parseInt(m.group(5)), Long .parseLong(m.group(7))); The thing that's baffling me is that the field the Long.parseLong is choking on is nowhere to be found in the contents of any /proc/X/stat file that exists while the job is running. E.g., : 2/01/31 23:31:03 INFO tools.DistCp: sourcePathsCount=1 12/01/31 23:31:03 INFO tools.DistCp: filesToCopyCount=1 12/01/31 23:31:03 INFO tools.DistCp: bytesToCopyCount=122.0k 12/01/31 23:31:03 INFO mapred.JobClient: Running job: job_201201312321_0002 12/01/31 23:31:04 INFO mapred.JobClient: map 0% reduce 0% 12/01/31 23:31:08 INFO mapred.JobClient: Task Id : attempt_201201312321_0002_m_000002_0, Status : FAILED java.lang.NumberFormatException: For input string: "18446744073709551532" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang. Long .parseLong( Long .java:422) at java.lang. 
Long .parseLong( Long .java:468) at org.apache.hadoop.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:413) at org.apache.hadoop.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:148) at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.getProcResourceValues(LinuxResourceCalculatorPlugin.java:401) at org.apache.hadoop.mapred.Task.initialize(Task.java:532) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:306) at org.apache.hadoop.mapred.Child$4.run(Child.java:270) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157) at org.apache.hadoop.mapred.Child.main(Child.java:264) Here's what the entire set of /proc/X/stat files look like while this job is running (I'm looking at the /proc file system on the only task tracker/data node in the cluster) – if Long.parseLong was going to fail, I assume it would choke on '18446744073709551615'.: 10 (async/mgr) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 11 (xenwatch) S 2 0 0 0 -1 2149613888 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 120 (upstart-udev-br) S 1 119 119 0 -1 4202560 215 0 0 0 6 0 0 0 20 0 1 0 234 17444864 239 18446744073709551615 140724289748992 140724289787412 0 0 0 0 0 4097 81920 18446744073709551615 0 0 17 1 0 0 0 0 0 122 (udevd) S 1 122 122 0 -1 4202816 636 23063 0 13 1 3 105 15 16 -4 1 0 235 17289216 164 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147221247 0 0 18446744073709551615 0 0 17 0 0 0 0 0 0 12 (xenbus) S 2 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 14 (migration/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 99 1 0 0 0 15 (ksoftirqd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 16 (watchdog/1) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 99 1 0 0 0 1719 (avahi-daemon) S 1 1718 1718 0 -1 4202816 446 0 0 0 2 0 0 0 20 0 1 0 5195 34873344 418 18446744073709551615 4194304 4307028 0 0 0 0 0 3674112 16903 18446744073709551615 0 0 17 0 0 0 0 0 0 1720 (avahi-daemon) S 1719 1720 1720 0 -1 4202560 90 0 0 0 0 0 0 0 20 0 1 0 5195 34742272 143 18446744073709551615 4194304 4307028 0 0 0 0 0 3670016 0 18446744073709551615 0 0 17 1 0 0 0 0 0 17 (events/1) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 188 (udevd) S 122 122 122 0 -1 4202816 91 0 0 0 0 0 0 0 18 -2 1 0 241 17285120 160 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147196671 0 24576 18446744073709551615 0 0 17 2 0 0 0 0 0 189 (udevd) S 122 122 122 0 -1 4202816 89 0 0 0 0 0 0 0 18 -2 1 0 241 17285120 159 18446744073709551615 140382903398400 140382903499908 0 0 0 0 2147196671 0 24576 18446744073709551615 0 0 17 3 0 0 0 0 0 18 (migration/2) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 99 1 0 0 0 19 (ksoftirqd/2) S 2 
0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 1 (init) S 0 1 1 0 -1 4202752 5633 1150472 28 692 8 15 -5995191823955592639 -3228180212899177193 20 0 1 0 93 24281088 475 18446744073709551615 140133429972992 140133430091084 0 0 0 0 0 4096 536946211 18446744073709551615 0 0 0 0 0 0 0 0 0 20 (watchdog/2) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 99 1 0 0 0 21 (events/2) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 126 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 22 (migration/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 99 1 0 0 0 23 (ksoftirqd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 24 (watchdog/3) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 99 1 0 0 0 25 (events/3) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 158 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 26 (sync_supers) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 27 (bdi- default ) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 28 (kintegrityd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 2928 (su) S 1 2772 2772 0 -1 4202752 853 0 1 0 1 0 0 0 20 0 1 0 11368 48869376 442 18446744073709551615 4194304 4224396 0 0 0 0 2147196671 1 16384 18446744073709551615 0 0 17 0 0 0 0 0 0 2937 (java) S 2928 2772 2772 0 -1 4202496 32799 579 0 1 233 19 2 0 20 0 40 0 11383 1438593024 22308 18446744073709551615 1073741824 1073778416 0 0 0 0 0 1 16800974 18446744073709551615 0 0 17 3 0 0 0 0 0 29 (kintegrityd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 2 (kthreadd) S 0 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 0 2 0 0 0 0 0 30 (kintegrityd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 3106 (su) S 1 2772 2772 0 -1 4202752 853 0 0 0 1 0 0 0 20 0 1 0 12056 48869376 443 18446744073709551615 4194304 4224396 0 0 0 0 2147196671 1 16384 18446744073709551615 0 0 17 0 0 0 0 0 0 3115 (java) S 3106 2772 2772 0 -1 4202496 45170 166286 0 1 451 18446744073709551522 1666 10 20 0 42 0 12058 1450819584 31461 18446744073709551615 1073741824 1073778416 0 0 0 0 0 1 16800974 18446744073709551615 0 0 17 3 0 0 0 0 0 319 (dhclient3) S 1 319 319 0 -1 4202560 59 0 0 0 0 0 0 0 20 0 1 0 599 6713344 85 18446744073709551615 140272284925952 140272285354020 0 0 0 0 0 0 0 18446744073709551615 0 0 17 0 0 0 0 0 0 31 (kintegrityd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 3222 (sshd) S 474 3222 3222 0 -1 4202752 1152 26602 0 0 2 0 76 16 
20 0 1 0 22284 83111936 879 18446744073709551615 139945989926912 139945990366548 0 0 0 0 0 4096 16387 18446744073709551615 0 0 17 0 0 0 0 0 0 32 (kblockd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 3301 (sshd) S 3222 3222 3222 0 -1 4202816 295 0 0 0 1 0 0 0 20 0 1 0 22441 83111936 415 18446744073709551615 139945989926912 139945990366548 0 0 0 0 0 4096 65536 18446744073709551615 0 0 17 0 0 0 0 0 0 3302 (bash) S 3301 3302 3302 34816 3826 4202496 8064 78426 1 2 3 12 291 31 20 0 1 0 22442 19914752 553 18446744073709551615 4194304 5087404 140734376532864 18446744073709551615 139668755529598 0 65536 3686404 1266761467 18446744071579111781 0 0 17 3 0 0 0 0 0 33 (kblockd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 34 (kblockd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 35 (kblockd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 36 (kseriod) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 20 0 1 0 194 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 377 (flush-1:0) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 378 (flush-1:1) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 379 (flush-1:2) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 380 (flush-1:3) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 381 (flush-1:4) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 382 (flush-1:5) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 383 (flush-1:6) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 384 (flush-1:7) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 385 (flush-1:8) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 386 (flush-1:9) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 387 (flush-1:10) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 388 (flush-1:11) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 389 (flush-1:12) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 390 (flush-1:13) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 
20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 391 (flush-1:14) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 392 (flush-1:15) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 393 (flush-8:1) S 2 0 0 0 -1 2157973568 0 0 0 0 1 14 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 394 (flush-8:16) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 395 (flush-8:32) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 396 (flush-8:48) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 397 (flush-8:64) S 2 0 0 0 -1 2157973568 0 0 0 0 0 0 0 0 20 0 1 0 701 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 3 (migration/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 99 1 0 0 0 41 (khungtaskd) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 43 (kswapd0) S 2 0 0 0 -1 2158233664 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 44 (aio/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 458 (rsyslogd) S 1 426 426 0 -1 4202816 386 0 1 0 1 2 0 0 20 0 4 0 851 133304320 395 18446744073709551615 4194304 4462780 0 0 0 0 0 16781830 85025 18446744073709551615 0 0 17 0 0 0 0 0 0 45 (aio/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 46 (aio/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 474 (sshd) S 1 474 474 0 -1 4202816 229 63238 0 40 1 0 283 28 20 0 1 0 854 50442240 271 18446744073709551615 140147055022080 140147055461716 0 0 0 0 0 4096 81925 18446744073709551615 0 0 17 1 0 0 0 0 0 475 (dbus-daemon) S 1 475 475 0 -1 4202816 368 54 0 0 2 0 0 0 20 0 1 0 855 24141824 342 18446744073709551615 140490573344768 140490573663756 0 0 0 0 0 4096 16385 18446744073709551615 0 0 17 0 0 0 0 0 0 478 (kjournald) S 2 0 0 0 -1 2149613632 0 0 0 0 1 0 0 0 20 0 1 0 856 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 47 (aio/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 48 (jfsIO) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 49 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 4 (ksoftirqd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 
0 0 17 0 0 0 0 0 0 503 (atd) S 1 503 503 0 -1 4202560 85 0 0 0 0 0 0 0 20 0 1 0 886 19337216 116 18446744073709551615 4194304 4210820 0 0 0 0 0 0 81923 18446744073709551615 0 0 17 0 0 0 0 0 0 504 (cron) S 1 504 504 0 -1 4202560 257 0 0 0 1 0 0 0 20 0 1 0 886 21581824 254 18446744073709551615 4194304 4228572 0 0 0 0 0 0 65537 18446744073709551615 0 0 17 0 0 0 0 0 0 50 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 51 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 52 (jfsCommit) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 53 (jfsSync) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 54 (xfs_mru_cache) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 558 (getty) S 1 558 558 1025 558 4202496 199 0 1 0 0 1 0 0 20 0 1 0 931 6225920 162 18446744073709551615 4194304 4210980 0 0 0 0 0 0 0 18446744073709551615 0 0 17 1 0 0 0 0 0 55 (xfslogd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 563 (console-kit-dae) S 1 475 475 0 -1 4202752 2810 8142 26 7 4611686018427387902 0 22 2 20 0 3 0 1288 327077888 1000 18446744073709551615 4194304 4326916 0 0 0 0 0 4096 66048 18446744073709551615 0 0 17 0 0 0 0 0 0 56 (xfslogd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 57 (xfslogd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 58 (xfslogd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 59 (xfsdatad/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 5 (watchdog/0) S 2 0 0 0 -1 2216722752 0 0 0 0 0 0 0 0 -100 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 99 1 0 0 0 60 (xfsdatad/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 61 (xfsdatad/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 62 (xfsdatad/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 63 (xfsconvertd/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 64 (xfsconvertd/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 65 (xfsconvertd/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 66 (xfsconvertd/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 
0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 67 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 68 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 69 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 6 (events/0) S 2 0 0 0 -1 2216722496 0 0 0 0 1 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 70 (glock_workqueue) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 71 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 72 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 73 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 74 (delete_workqueu) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 75 (kslowd000) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 15 -5 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 76 (kslowd001) S 2 0 0 0 -1 2149580864 0 0 0 0 0 0 0 0 15 -5 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 77 (crypto/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 78 (crypto/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 79 (crypto/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 7 (cpuset) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 80 (crypto/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 195 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 83 (net_accel/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 84 (net_accel/1) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 85 (net_accel/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 86 (net_accel/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 200 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 87 (sfc_netfront/0) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 88 (sfc_netfront/1) S 2 0 0 0 -1 
2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 89 (sfc_netfront/2) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 8 (khelper) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 90 (sfc_netfront/3) S 2 0 0 0 -1 2216722496 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 3 0 0 0 0 0 91 (kstriped) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 203 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 2 0 0 0 0 0 92 (kjournald) S 2 0 0 0 -1 2149613632 0 0 0 0 6 0 0 0 20 0 1 0 216 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 1 0 0 0 0 0 9 (netns) S 2 0 0 0 -1 2149613632 0 0 0 0 0 0 0 0 20 0 1 0 93 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 3916 (cat) R 3302 3916 3302 34816 3916 4202496 248 0 0 0 0 0 0 0 20 0 1 0 26900 5636096 182 18446744073709551615 4194304 4247204 140736357334480 18446744073709551615 139795312997536 0 0 0 0 0 0 0 17 1 0 0 0 0 0

          This is using the Ubuntu 10.04 64 bit AMI (us-east-1/ami-da0cf8b3), cluster created by whirr-0.6.0-incubating. Any ideas here? I'm dead in the water. I'm going to take a stab at using CDH3U3, which just came out, but since this defect isn't yet resolved, I'm not holding out much hope.
          Evan Pollan added a comment -

          Turns out, the bug is in CDH3U3's version of ProcfsBasedProcessTree. Once I updated my cluster creation automation to specifically use CDH3U2 (rather than the latest update to CDH3), the problem went away.

          CDH3U3's version of ProcfsBasedProcessTree is much closer to trunk's version than CDH3U2's (as you would expect). So, it could be that more recent versions of this class have introduced incompatibilities with 64-bit Ubuntu (and possibly other distros).

          Evan Pollan added a comment -

          I yanked the regexp and capture-group logic out of the CDH3U3 version of ProcfsBasedProcessTree, and ran it against all pid stat files on a 64-bit Ubuntu 10.04 system in EC2. Sure enough, the namenode's JVM stat file parses out to:

          name:	(java)
          ppid:	2616
          gid:	2276
          sess:	2276
          utime:	32025597350190191
          stime:	18446744073709551581
          vsize:	1438765056
          rss:	28505
          

          The 18446744073709551581 is definitely in the 15th field, documented per the Ubuntu proc man page as stime. It just happens to not fit into a signed 64-bit integer...

          Here's the actual contents of the stat file:

          2625 (java) S 2616 2276 2276 0 -1 4202496 36120 35150 0 1 32025597350190191 18446744073709551581 55 2 20 0 41 0 4286 1438765056 28567 18446744073709551615 1073741824 1073778416 0 0 0 0 0 1 16800974 18446744073709551615 0 0 17 3 0 0 0 0 0
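
          For illustration (this snippet is not part of any attached patch), the failure mode is easy to reproduce standalone: Long.parseLong() rejects anything above 2^63 - 1, while BigInteger accepts the full unsigned 64-bit range:

              import java.math.BigInteger;

              public class StimeParseCheck {
                  public static void main(String[] args) {
                      // The stime value observed above in /proc/<pid>/stat on 64-bit Ubuntu.
                      String stime = "18446744073709551581";

                      // BigInteger handles the full unsigned 64-bit range.
                      System.out.println("BigInteger: " + new BigInteger(stime));

                      // Long.parseLong() tops out at 2^63 - 1 (9223372036854775807),
                      // so this throws the NumberFormatException reported in this issue.
                      try {
                          Long.parseLong(stime);
                      } catch (NumberFormatException e) {
                          System.out.println("Long.parseLong failed: " + e.getMessage());
                      }
                  }
              }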
          
          Ted Yu added a comment -

          Thanks for sharing, Evan.

                        : (this.utime + this.stime) - (oldInfo.utime + oldInfo.stime));
          

          How do you think we should deal with the above calculation involving stime?

          Evan Pollan added a comment -

          Zhihong – First off, I'm assuming that this particular stime value is not legitimate. However, if you search around, there are discussions of how to handle wrapped jiffies values in system-level code (e.g. here).

          Regardless, I believe these time values are unsigned 64-bit ints in the Linux kernel code, no? Why don't you just parse and handle them internally as BigIntegers?

          Just be aware that the values apparently can wrap even within an unsigned 64-bit number – so pay attention to the math.
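
          For illustration only (this uses JDK 8+ helpers, which postdate this issue and are not what the patch does): the built-in unsigned-long methods offer a middle ground between raw longs and BigInteger, since two's-complement subtraction is modular and wrapped counters still difference correctly:

              public class UnsignedJiffies {
                  public static void main(String[] args) {
                      // Both values exceed Long.MAX_VALUE; parseUnsignedLong (JDK 8+)
                      // stores them in the long's full 64 bits.
                      long newStime = Long.parseUnsignedLong("18446744073709551581");
                      long oldStime = Long.parseUnsignedLong("18446744073709551500");

                      // Two's-complement subtraction is modular, so the delta is
                      // exact even though both operands print as negative signed longs.
                      long delta = newStime - oldStime;  // 81
                      System.out.println("delta = " + delta);
                  }
              }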

          Ted Yu added a comment -

          I am fine with changing the type of stime to BigInteger.

          This issue has been dormant for over 40 days. I want to get some comment from committers about the proposed change before attaching the next patch.

          Ted Yu added a comment -

          If we change stime to BigInteger, dtime should be changed too.
          Since dtime is used in getCumulativeCpuTime(), we need to change getCumulativeCpuTime() to return BigInteger as well.
          There seems to be some ripple effect in terms of API changes.

          Tom White added a comment -

          Changing to use BigInteger sounds like the right thing to do.

          > Since dtime is used in getCumulativeCpuTime(), we need to change getCumulativeCpuTime() to return BigInteger as well.

          Is it not possible to do the calculations using BigIntegers (to avoid overflow) then convert to a long (since the final result can be represented in a long)?
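
          A rough sketch of that approach (a hypothetical helper, not code from any attached patch): do the sum and difference in BigInteger, then narrow the result, which should be a small sampling delta, back to a long:

              import java.math.BigInteger;

              // Hypothetical helper sketching the suggestion above: arithmetic in
              // BigInteger, result narrowed to a long once it is known to fit.
              final class JiffyDelta {
                  private static final BigInteger LONG_MAX =
                      BigInteger.valueOf(Long.MAX_VALUE);

                  static long delta(BigInteger newUtime, BigInteger newStime,
                                    BigInteger oldUtime, BigInteger oldStime) {
                      BigInteger d = newUtime.add(newStime)
                              .subtract(oldUtime.add(oldStime));
                      // A 3-second sampling delta fits comfortably in a long; clamp
                      // defensively in case the underlying counters wrapped or are bogus.
                      if (d.signum() < 0 || d.compareTo(LONG_MAX) > 0) {
                          return 0L;
                      }
                      return d.longValue();
                  }
              }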

          Ted Yu added a comment -

          Here is the method body:

              public void updateJiffy(ProcessInfo oldInfo) {
                this.dtime = (oldInfo == null ? this.utime + this.stime
                        : (this.utime + this.stime) - (oldInfo.utime + oldInfo.stime));
              }
          

          If oldInfo == null, I am not sure if (this.utime + this.stime) can be stored in a long.

          Evan Pollan added a comment -

          Right – you'd have to protect against that. But, does initializing dtime as the sum of utime+stime even make sense? Seems like you could just as well initialize it to zero until you can calculate the first difference (disclaimer: I have no idea how the dtime property is used).

          Also, you could use BigInteger arithmetic to replace the existing utime/stime sum and differencing logic, and just ensure that what you assign to dtime (if you keep it typed as a primitive long) will be capped at Long.MAX_VALUE.

          Ted Yu added a comment -

          I don't think we can cap 18446744073709551581, reported above, at Long.MAX_VALUE.

          Tom White added a comment -

          The CPU measurements are exposed as a MR counter so you can see cumulative time for each task. The measurements are taken every 3 seconds by the task (see mapred.Task).

          The idea that you get the initial dtime as utime+stime does make sense - since it's the initial reading for the process, then all updates are deltas. However, looking at the code in Task, it looks like the initial value is removed (cpuTime -= initCpuCumulativeTime;) to account for JVM reuse. (It actually looks like initCpuCumulativeTime is removed for every update, which is incorrect if updates are deltas - so that's another bug.) So initializing dtime to 0 would have the same effect.

          I think the options are the following:

          1. Start from first delta (throws away 3 seconds of CPU time)
          2. Use initial utime/stime values only if they look reasonable (less than Long.MAX_VALUE)
          3. Use BigInteger for dtime (will always report underlying proc values, needs API changes)

          Ted Yu added a comment -

          Thanks for the digging, Tom. Appreciate it.

          Do you have a preference among the three choices?

          Tom White added a comment -

          #2, and print a warning if utime+stime > Long.MAX_VALUE.

          Tom White added a comment -

          BTW there is discussion of the stime/utime stuff in MAPREDUCE-1201 where this feature was introduced.

          Ted Yu added a comment -

          For clarification: should my patch be based on hadoop 1.0 or TRUNK?

          Ted Yu added a comment -

          Patch v2 is based on hadoop 1.0, using choice #2.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513194/mapreduce-3583-v2.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1769//console

          This message is automatically generated.

          Tom White added a comment -

          Looks good. A few comments:

          • Should utime be represented internally as a BigInteger too?
          • (Nit) new BigInteger("0") -> BigInteger.ZERO
          • Can you write a unit test that uses some of the values from /proc (above) to test that the patch works in those cases?
          • The JVM reuse bug I mentioned above needs to be fixed too.

          The patch should also be prepared against trunk so that jenkins can test it.

          Have you tested this on a real machine?

          Evan Pollan added a comment -

          I just tested this out by applying the patch to the CDH3U3 codebase and deploying the patched hadoop-core jar file. The task trackers failed to start due to a NullPointerException:

          
          2012-02-06 23:32:21,392 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
                  at java.util.regex.Matcher.getTextLength(Matcher.java:1140)
                  at java.util.regex.Matcher.reset(Matcher.java:291)
                  at java.util.regex.Matcher.<init>(Matcher.java:211)
                  at java.util.regex.Pattern.matcher(Pattern.java:888)
                  at org.apache.hadoop.util.ProcfsBasedProcessTree.getValidPID(ProcfsBasedProcessTree.java:347)
                  at org.apache.hadoop.util.ProcfsBasedProcessTree.<init>(ProcfsBasedProcessTree.java:105)
                  at org.apache.hadoop.util.ProcfsBasedProcessTree.<init>(ProcfsBasedProcessTree.java:101)
                  at org.apache.hadoop.util.ProcfsBasedProcessTree.<init>(ProcfsBasedProcessTree.java:97)
                  at org.apache.hadoop.util.LinuxResourceCalculatorPlugin.<init>(LinuxResourceCalculatorPlugin.java:108)
                  at org.apache.hadoop.util.ResourceCalculatorPlugin.getResourceCalculatorPlugin(ResourceCalculatorPlugin.java:149)
                  at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:966)
                  at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1611)
                  at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3857)
          

          This was running on the same Ubuntu 10.04 64-bit EC2 AMI I referenced above.

          Ted Yu added a comment -

          The cause for the NPE was that JVM_PID wasn't set in the environment:

              String pid = System.getenv().get("JVM_PID");
              pTree = new ProcfsBasedProcessTree(pid);
          

          ProcfsBasedProcessTree.getValidPID() should handle the above case gracefully.
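
          A minimal sketch of that guard (the shape of getValidPID() here is assumed for illustration, not copied from the patch):

              import java.util.regex.Pattern;

              public final class PidCheck {
                  // Illustrative pid pattern; see the pattern discussion below.
                  private static final Pattern PID_PATTERN = Pattern.compile("[0-9]+");
                  private static final String DEAD_PID = "-1";  // assumed sentinel

                  // Treat a missing JVM_PID like a malformed one instead of letting
                  // Pattern.matcher(null) throw a NullPointerException.
                  public static String getValidPID(String pid) {
                      if (pid == null) {
                          return DEAD_PID;
                      }
                      return PID_PATTERN.matcher(pid).matches() ? pid : DEAD_PID;
                  }
              }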

          Ted Yu added a comment -

          Patch v3 handles the case where pid is null in getValidPID().

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12513534/mapreduce-3583-v3.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1805//console

          This message is automatically generated.

          Mahadev konar added a comment -

          Took a look at the patch. Looks good to me. I think we should be able to add a unit test for this. Take a look at TestProcfsBasedProcessTree.java. Also, there should be a trunk patch for this jira since trunk has the same issue.

          Tsz Wo Nicholas Sze added a comment -

          I like that the patch cleans up a lot of integer-string back-and-forth conversions. Some comments:

          • The numberPattern matches 0, which is supposed to be an invalid pid. The pattern should be "[1-9][0-9]*", which also disallows leading zeros (a quick standalone check follows this list).

          • Some integer checks are skipped. In particular, the following code allows any dir name to be added even if it is not a number. BTW, it should not catch NumberFormatException anymore.
                   for (String dir : processDirs) {
                     try {
            -          int pd = Integer.parseInt(dir);
                       if ((new File(procfsDir, dir)).isDirectory()) {
            -            processList.add(Integer.valueOf(pd));
            +            processList.add(dir);
                       }
                     } catch (NumberFormatException n) {
            
          • It is better to not use BigInteger since it is expensive. Let me think about it in more detail.
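
          As a quick standalone check of the proposed pattern (illustration only, not part of any attached patch):

              import java.util.regex.Pattern;

              public class PidPatternCheck {
                  public static void main(String[] args) {
                      Pattern p = Pattern.compile("[1-9][0-9]*");
                      // "1" and "2625" match; "0", "007" (leading zero), and "abc" do not.
                      for (String s : new String[] {"1", "2625", "0", "007", "abc"}) {
                          System.out.println(s + " -> " + p.matcher(s).matches());
                      }
                  }
              }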
          Tsz Wo Nicholas Sze added a comment -

          BigInteger can be avoided by the following code. I assumed that stime >= 0 and utime >= 0. We should add some checks if necessary.

              public void updateJiffy(ProcessInfo oldInfo) {
                if (oldInfo == null) {
                  dtime = stime + utime;
                  if (dtime < 0) { //Overflow, assumed that stime >= 0 and utime >= 0.
                    LOG.warn("stime + utime > Long.MAX_VALUE, where stime=" + stime
                        + ", utime=" + utime + ", Long.MAX_VALUE=" + Long.MAX_VALUE
                        + ".  However, dtime=" + dtime);
                    dtime = 0L;
                  }
                } else {
                  this.dtime = (this.utime - oldInfo.utime) + (this.stime - oldInfo.stime);
                }
              }
          
          Ted Yu added a comment -

          Thanks for the feedback, Mahadev and Nicolas.

          Here is patch v4.

          I will provide a patch for TRUNK after we finalize the patch for hadoop 1.0.

          w.r.t. writing a new test, I am not sure how to do fault injection.
          Will investigate.

          Ted Yu added a comment -

          w.r.t. updateJiffy(), I put the check for negative dtime outside the if/else block.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514427/mapreduce-3583-v4.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1850//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          > w.r.t. updateJiffy(), I put the check for negative dtime outside if/else block.

          You are right that the check should be outside if/else. Then, we also have to update the log message and the comment. How about we add a setDtime(..) method that also checks s<0 and u<0?

              /** set dtime = s + u; also check for overflow and negative values. */
              private void setDtime(final long s, final long u) {
                if (s < 0L) {
                  LOG.warn("System time, s = " + s + " < 0");
                }
                if (u < 0L) {
                  LOG.warn("User time, u = " + u + " < 0");
                }
                dtime = s + u;
                if (dtime < 0L) {
                  LOG.warn("s + u = " + dtime + " < 0, where s=" + s + " and u=" + u);
                  dtime = 0L;
                }
              }
          
              public void updateJiffy(ProcessInfo oldInfo) {
                if (oldInfo == null) {
                  setDtime(stime, utime);
                } else {
                  setDtime(this.stime - oldInfo.stime,  this.utime - oldInfo.utime);
                }
              }
          

          BTW, the BigInteger import should be removed.

          +import java.math.BigInteger;
          
          Ted Yu added a comment -

          build/test/testsfailed is empty after running 'ant test-core' with patch v4 applied.

          Ted Yu added a comment -

          Patch v5 addresses Nicolas' comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514440/mapreduce-3583-v5.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1852//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good. Thanks a lot!

          Ted Yu added a comment -

          For TRUNK, should both of the following be included in the patch?

          hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          
          Ted Yu added a comment -

          Patch for TRUNK.

          All tests under hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core passed.

          TestProcfsBasedProcessTree passed as well.

          Mahadev konar added a comment -

          Looks like Jenkins is down. Will run the trunk patch through Hudson as soon as the build machines are up!

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514572/mapreduce-3583-trunk.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1858//console

          This message is automatically generated.

          Ted Yu added a comment -

          Patch v2 for TRUNK.
          Patch v1 missed PROCESSTREE_DUMP_FORMAT in hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java

          Ted Yu added a comment -

          Reattaching patch v2 for TRUNK with --no-prefix

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514624/mapreduce-3583-trunk-v2.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1859//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514626/mapreduce-3583-trunk-v2.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1860//console

          This message is automatically generated.

          Ted Yu added a comment -

          I couldn't reproduce the remaining test failure:

            cd hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
            mvn test -Dtest=TestProcfsBasedProcessTree
          
          Ted Yu added a comment -

          Noticed the exception thrown from checkPidPgrpidForMatch() at line 285.

          Patch v3 adds pgrpId to the exception message.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514689/mapreduce-3583-trunk-v3.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.mapreduce.util.TestProcfsBasedProcessTree

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1866//console

          This message is automatically generated.

          Ted Yu added a comment -

          Patch v4 for TRUNK should pass.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514704/mapreduce-3583-trunk-v4.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//artifact/trunk/hadoop-mapreduce-project/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1868//console

          This message is automatically generated.

          Ted Yu added a comment -

          @Matt, @Mahadev, @Nicolas:
          Can you take another look?
          Patches for hadoop 1.0 and TRUNK are good to go.

          Tsz Wo Nicholas Sze added a comment -

          Hi Zhihong, thanks for all the hard work! There is a findbugs warning in the last build. Could you take a look?

          Ted Yu added a comment -

          Patch v5 for TRUNK fixes the warning.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514728/mapreduce-3583-trunk-v5.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests:
          org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1872//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1872//console

          This message is automatically generated.

          Mahadev konar added a comment -

          Ted,
          Looks like one of the test cases (TestContainersMonitor) failed with:

          2012-02-15 23:20:54,516 WARN  [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(456)) - Uncaught exception in ContainerMemoryManager while managing memory of container_0_0000_01_000000
          java.lang.NumberFormatException: For input string: "18446743988089421650"
          	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
          	at java.lang.Long.parseLong(Long.java:422)
          	at java.lang.Long.parseLong(Long.java:468)
          	at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.constructProcessInfo(ProcfsBasedProcessTree.java:424)
          	at org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.getProcessTree(ProcfsBasedProcessTree.java:170)
          

          This should have been fixed with the patch, right?

          Tom White added a comment -

          Can you add a test for the overflow case too, please?

          How do you want to handle the JVM reuse bug I mentioned above in https://issues.apache.org/jira/browse/MAPREDUCE-3583?focusedCommentId=13200004&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13200004

          Ted Yu added a comment -

          Patch v6 adds a LOG statement so that we can tell which of (utime), (stime), (vsize), and (rss) was extremely large.

          TestContainersMonitor passes on MacBook.
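          As a rough illustration of the logging patch v6 describes, the raw field strings can be printed before any parsing is attempted, so an out-of-range value can be identified from the logs. This is a minimal, self-contained sketch; the simplified field layout and regex are assumptions for illustration, not the actual ProcfsBasedProcessTree pattern:

          import java.util.regex.Matcher;
          import java.util.regex.Pattern;

          public class ProcStatLogDemo {
            public static void main(String[] args) {
              // A simplified /proc/<pid>/stat-style line; the stime field holds an
              // unsigned 64-bit value that does not fit in a signed Java long.
              String stat = "1 (init) 1 0 0 100 18446743988060683582 4096 1024";
              Pattern p = Pattern.compile(
                  "^(\\d+) \\((.*)\\) (\\d+) (\\d+) (\\d+) (\\d+) (\\d+) (\\d+) (\\d+)$");
              Matcher m = p.matcher(stat);
              if (m.matches()) {
                // Log the raw strings first: Long.parseLong(m.group(7)) would throw
                // NumberFormatException here, and the log shows which field caused it.
                System.out.println("utime=" + m.group(6) + " stime=" + m.group(7)
                    + " vsize=" + m.group(8) + " rss=" + m.group(9));
              }
            }
          }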

          Ted Yu added a comment -

          Actually, the latest test failure was somewhat expected, because the latest patches don't use BigInteger:

          -         pinfo.updateProcessInfo(m.group(2), Integer.parseInt(m.group(3)),
          +         pinfo.updateProcessInfo(m.group(2), m.group(3),
                            Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5)),
                            Long.parseLong(m.group(7)), Long.parseLong(m.group(8)),
                            Long.parseLong(m.group(10)), Long.parseLong(m.group(11)));
          

          The NumberFormatException was due to a field such as stime being too large.

          Mahadev konar added a comment -

          @Ted,
          TestContainersMonitor passes intermittently unless it runs into this issue, as shown in the trace I posted above. Looks like we are running into this issue even with the patch?

          Tom White added a comment -

          > The NumberFormatException was due to a field such as stime being too large.

          Yes, this should be fixed by this patch. I don't see why we don't use BigInteger - the updates are called every 3 seconds by a task, so its use shouldn't be prohibitive.
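          To make the failure mode concrete: values such as 18446743988089421650 lie just below 2^64, i.e., they are small negative numbers printed as unsigned 64-bit integers, which Long.parseLong rejects and BigInteger accepts. A minimal sketch of this (my own illustration, not code from any of the patches):

          import java.math.BigInteger;

          public class UnsignedFieldDemo {
            public static void main(String[] args) {
              // Larger than Long.MAX_VALUE (9223372036854775807).
              String stime = "18446743988089421650";

              try {
                Long.parseLong(stime); // the NumberFormatException from the logs
              } catch (NumberFormatException e) {
                System.out.println("long parse failed: " + e.getMessage());
              }

              // BigInteger handles arbitrarily large decimal strings.
              BigInteger value = new BigInteger(stime);
              // Interpreted as a signed 64-bit quantity, it is a small negative number.
              System.out.println("as signed 64-bit: "
                  + value.subtract(BigInteger.ONE.shiftLeft(64)));
            }
          }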

          Ted Yu added a comment -

          @Tom, @Nicolas:
          Shall we revisit mapreduce-3583-v3.txt?

          Ted Yu added a comment -

          I was thinking of the following trick in constructProcessInfo():

                  long utime;
                  if (m.group(7).length() > 19) {
                    utime = 0;
                  } else {
                    utime = Long.parseLong(m.group(7));
                  }
          

          Any numeric string longer than 19 digits cannot fit in a long, since Long.MAX_VALUE (9223372036854775807) itself has 19 digits. Comments are welcome.

          Tsz Wo Nicholas Sze added a comment -

          Sorry, I thought BigInteger was used for checking overflow. If the range of stime is expected to exceed Long.MAX_VALUE, it is okay to use BigInteger for the moment. We may improve it later on.
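          For reference, a minimal sketch of the data-holder shape this implies, based on the earlier diff (ppid passed through as a String) and on keeping stime as a BigInteger; the class and field names here are simplified assumptions, not the actual Hadoop code:

          import java.math.BigInteger;

          // Hypothetical, simplified ProcessInfo: ppid stays a String because the
          // reported value can exceed Long.MAX_VALUE, and stime is a BigInteger.
          class ProcessInfo {
            String name;
            String ppid;
            int pgrpId;
            int session;
            long utime;
            BigInteger stime;
            long vsize;
            long rss;

            void updateProcessInfo(String name, String ppid, int pgrpId, int session,
                long utime, BigInteger stime, long vsize, long rss) {
              this.name = name;
              this.ppid = ppid;
              this.pgrpId = pgrpId;
              this.session = session;
              this.utime = utime;
              this.stime = stime;
              this.vsize = vsize;
              this.rss = rss;
            }
          }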

          Ted Yu added a comment -

          Patch v7 for TRUNK uses the same technique as mapreduce-3583-v3.txt.

          @Tom:
          Shall we address JVM reuse in another JIRA?

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514733/mapreduce-3583-trunk-v6.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1874//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1874//console

          This message is automatically generated.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514740/mapreduce-3583-trunk-v7.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in .

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1876//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1876//console

          This message is automatically generated.

          Ted Yu added a comment -

          mapreduce-3583-v3.txt and mapreduce-3583-trunk-v7.txt should be ready to go.

          Tsz Wo Nicholas Sze added a comment -

          Hi Ted, you still have to address the first two items in my previous comment.

          Ted Yu added a comment -

          @Nicolas:
          I agree. This is only for mapreduce-3583-v3.txt, right?

          I am in China now, where it is hard to connect my MacBook to the internet.
          If you need a modified patch from me, I will do it by the 21st at the latest.

          Tsz Wo Nicholas Sze added a comment -

          Hi Ted,

          No problem, please update mapreduce-3583-v3.txt when you have time. Hope you have a good trip!

          In the meantime, I will check mapreduce-3583-trunk-v7.txt one more time and commit it if everything looks good.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this to trunk and 0.23. Thanks, Ted!

          Leaving this open for committing to branch-1.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #1821 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1821/)
          MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to handle integers larger than Long.MAX_VALUE. Contributed by Zhihong Yu (Revision 1245828)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245828
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Commit #556 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/556/)
          svn merge -c 1245828 from trunk for MAPREDUCE-3583. (Revision 1245831)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245831
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Common-0.23-Commit #569 (See https://builds.apache.org/job/Hadoop-Common-0.23-Commit/569/)
          svn merge -c 1245828 from trunk for MAPREDUCE-3583. (Revision 1245831)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245831
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #1747 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1747/)
          MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to handle integers larger than Long.MAX_VALUE. Contributed by Zhihong Yu (Revision 1245828)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245828
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #1759 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1759/)
          MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to handle integers larger than Long.MAX_VALUE. Contributed by Zhihong Yu (Revision 1245828)

          Result = ABORTED
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245828
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Commit #572 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/572/)
          svn merge -c 1245828 from trunk for MAPREDUCE-3583. (Revision 1245831)

          Result = ABORTED
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245831
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #959 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/959/)
          MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to handle integers larger than Long.MAX_VALUE. Contributed by Zhihong Yu (Revision 1245828)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245828
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #172 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/172/)
          svn merge -c 1245828 from trunk for MAPREDUCE-3583. (Revision 1245831)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245831
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-0.23-Build #200 (See https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/200/)
          svn merge -c 1245828 from trunk for MAPREDUCE-3583. (Revision 1245831)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245831
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #994 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/994/)
          MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to handle integers larger than Long.MAX_VALUE. Contributed by Zhihong Yu (Revision 1245828)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1245828
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ProcfsBasedProcessTree.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestProcfsBasedProcessTree.java
          Ted Yu added a comment -

          Patch v6 for hadoop 1.0 incorporates Nicolas' first two comments.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12515499/mapreduce-3583-v6.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1907//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          Hi Ted, we need to check the directory name in getProcessList().

          Ted Yu added a comment -

          Patch v7 adds a check for the directory name in getProcessList().
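          A rough sketch of such a check (my own illustration; the actual patch may differ): only /proc entries whose names are purely numeric are treated as process directories, so entries like "cpuinfo" or "sys" never reach the number parsing:

          import java.io.File;

          public class NumericProcEntries {
            public static void main(String[] args) {
              String[] entries = new File("/proc").list();
              if (entries == null) {
                return; // no procfs on this system
              }
              for (String name : entries) {
                // Accept only purely numeric directory names as pids.
                if (name.matches("[0-9]+")) {
                  System.out.println("pid: " + name);
                }
              }
            }
          }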

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12515597/mapreduce-3583-v7.txt
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1909//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          Please run "ant test-patch" and "ant test". These commands are run by Hudson for the patch on trunk but not for other branches. "ant test-patch" should take ~5 minutes but "ant test" is going to take a few hours. Please post the results on the JIRA. Thanks a lot!

          Ted Yu added a comment -

          Running tests now.
          Will report back tonight.

          Ted Yu added a comment -

          I got the following when running "ant test-patch":

          BUILD FAILED
          /home/hduser/1-hadoop/build.xml:2228: 'findbugs.home' is not defined. Please pass -Dfindbugs.home=<base of Findbugs installation> to Ant on the command-line.
          
          Ted Yu added a comment -

          After 4 hours, the test suite is still running.
          /home/hduser/1-hadoop/build/test/testsfailed is empty.

          Ted Yu added a comment -

          I got two test failures:

              [junit] Test org.apache.hadoop.hdfs.security.TestDelegationToken FAILED (crashed)
              [junit] Test org.apache.hadoop.metrics2.impl.TestSinkQueue FAILED
          

          Here is more information:

          Testcase: testCancelDelegationToken took 0.001 sec
            Caused an ERROR
          Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
          junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
          
          Testcase: testConcurrentConsumers took 0.005 sec
            FAILED
          should've thrown
          junit.framework.AssertionFailedError: should've thrown
            at org.apache.hadoop.metrics2.impl.TestSinkQueue.shouldThrowCME(TestSinkQueue.java:229)
            at org.apache.hadoop.metrics2.impl.TestSinkQueue.testConcurrentConsumers(TestSinkQueue.java:195)
          

          I reverted my patch and TestSinkQueue still failed.

          If someone can verify whether they pass (with my patch), that would be great.

          Tsz Wo Nicholas Sze added a comment -

          > I got the following when running "ant test-patch":

          Sorry that I was not clear. The full command looks like

          ant -Dforrest.home=${FORREST_HOME} -Dfindbugs.home=${FINDBUGS_HOME} -Dpatch.file=a.patch test-patch
          

          and it requires findbugs and forrest.

          Ted Yu added a comment -

          I installed forrest and findbugs onto my MacBook.

          /Users/zhihyu/205-hadoop/build.xml:1310: 'java5.home' is not defined.  Forrest requires Java 5.  Please pass -Djava5.home=<base of Java 5 distribution> to Ant on the command-line.
          

          I still need to install Java 5.

          Tsz Wo Nicholas Sze added a comment -

          I pointed java5.home at Java 6 and it works, so you don't really have to install Java 5.

          Tsz Wo Nicholas Sze added a comment -

          But I had to manually remove the cn-doc dependency to use Java 6.

          Tsz Wo Nicholas Sze added a comment -

          Okay, I have just run "ant test-patch" on mapreduce-3583-v7.txt.

               [exec] -1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
               [exec]                         Please justify why no tests are needed for this patch.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     -1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
               [exec] 
          

          The findbugs warnings are not related; the result is the same when running test-patch with an empty patch.

          Ted Yu added a comment -

          It turns out Java 5 was already installed.

          Here is the command I used:

          ant -Dforrest.home=${FORREST_HOME} -Dfindbugs.home=${FINDBUGS_HOME} -Djava5.home=/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Home -Dpatch.file=../mapreduce-3583-v7.txt test-patch
          

          I got:

                [get] Error opening connection java.io.IOException: Server returned HTTP response code: 503 for URL: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
                [get] Can't get http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar to /Users/zhihyu/205-hadoop/ivy/ivy-2.1.0.jar
          
          BUILD FAILED
          /Users/zhihyu/205-hadoop/build.xml:2393: Can't get http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar to /Users/zhihyu/205-hadoop/ivy/ivy-2.1.0.jar
          

          Not sure if the above was caused by a firewall.

          Tsz Wo Nicholas Sze added a comment -

          > I got two test failures ...

          Both tests passed on my machine, and I don't think the failures you got are related to the patch.

              [junit] Running org.apache.hadoop.hdfs.security.TestDelegationToken
              [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 15.793 sec
          
              [junit] Running org.apache.hadoop.metrics2.impl.TestSinkQueue
              [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.321 sec
          
          Tsz Wo Nicholas Sze added a comment -

          I have also committed this to branch-1 and branch-1.0. Thanks again, Ted!


            People

            • Assignee:
              Ted Yu
            • Reporter:
              Ted Yu
            • Votes:
              0
            • Watchers:
              9
