Hadoop Map/Reduce
MAPREDUCE-2283

TestBlockFixer hangs initializing MiniMRCluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: 0.22.0
    • Component/s: contrib/raid
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      TestBlockFixer (a RAID contrib test) is hanging the precommit testing on Hudson.

      1. MAPREDUCE-2283.patch
        0.7 kB
        Ramkumar Vadali

        Activity

        Nigel Daley added a comment -

        2 thread dumps of the process show this trace:

        
             [exec]     [junit] "main" prio=10 tid=0x0933ac00 nid=0x6aaf waiting on condition [0xf7d50000..0xf7d521e8]
             [exec]     [junit]    java.lang.Thread.State: TIMED_WAITING (sleeping)
             [exec]     [junit] 	at java.lang.Thread.sleep(Native Method)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:611)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:429)
             [exec]     [junit] 	- locked <0xf13fdb20> (a org.apache.hadoop.ipc.Client$Connection)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:504)
             [exec]     [junit] 	- locked <0xf13fdb20> (a org.apache.hadoop.ipc.Client$Connection)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:206)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1164)
             [exec]     [junit] 	at org.apache.hadoop.ipc.Client.call(Client.java:1008)
             [exec]     [junit] 	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
             [exec]     [junit] 	at org.apache.hadoop.mapred.$Proxy9.getProtocolVersion(Unknown Source)
             [exec]     [junit] 	at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235)
             [exec]     [junit] 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275)
             [exec]     [junit] 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:206)
             [exec]     [junit] 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:185)
             [exec]     [junit] 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:169)
             [exec]     [junit] 	at org.apache.hadoop.mapred.TaskTracker$2.run(TaskTracker.java:699)
             [exec]     [junit] 	at java.security.AccessController.doPrivileged(Native Method)
             [exec]     [junit] 	at javax.security.auth.Subject.doAs(Subject.java:396)
             [exec]     [junit] 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1142)
             [exec]     [junit] 	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:695)
             [exec]     [junit] 	- locked <0xb75cd778> (a org.apache.hadoop.mapred.TaskTracker)
             [exec]     [junit] 	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1391)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.createTaskTracker(MiniMRCluster.java:219)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner$1.run(MiniMRCluster.java:203)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner$1.run(MiniMRCluster.java:201)
             [exec]     [junit] 	at java.security.AccessController.doPrivileged(Native Method)
             [exec]     [junit] 	at javax.security.auth.Subject.doAs(Subject.java:396)
             [exec]     [junit] 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1142)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.<init>(MiniMRCluster.java:201)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.startTaskTracker(MiniMRCluster.java:716)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:541)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:482)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:474)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:466)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:458)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:448)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:438)
             [exec]     [junit] 	at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:429)
             [exec]     [junit] 	at org.apache.hadoop.raid.TestBlockFixer.mySetup(TestBlockFixer.java:821)
             [exec]     [junit] 	at org.apache.hadoop.raid.TestBlockFixer.implBlockFix(TestBlockFixer.java:155)
             [exec]     [junit] 	at org.apache.hadoop.raid.TestBlockFixer.testBlockFixDist(TestBlockFixer.java:139)
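
        The trace above is what a client looks like while it sleeps between connection attempts: the main thread is in TIMED_WAITING inside Thread.sleep, called from the IPC client's connection-failure handling. As a minimal sketch (the class and method names are hypothetical, and the loop only stands in for a retry loop rather than reproducing Hadoop's code), the following Java program starts a thread that sleeps between simulated retries and reports the state a thread dump would show for it:

```java
public class SleepingThreadState {
    /** Starts a thread that sleeps between simulated retries and returns the state observed for it. */
    public static Thread.State observeRetryLoopState() {
        Thread retrier = new Thread(() -> {
            try {
                while (true) {
                    // Stand-in for a failed connection attempt followed by a pause.
                    Thread.sleep(1000);
                }
            } catch (InterruptedException e) {
                // Interrupted by the caller: stop retrying.
            }
        }, "retrier");
        retrier.setDaemon(true);
        retrier.start();

        Thread.State state = retrier.getState();
        long deadline = System.currentTimeMillis() + 5000;
        // Poll until the thread has entered sleep(), or give up after 5 seconds.
        while (state != Thread.State.TIMED_WAITING && System.currentTimeMillis() < deadline) {
            Thread.yield();
            state = retrier.getState();
        }
        retrier.interrupt();
        return state;
    }

    public static void main(String[] args) {
        System.out.println(observeRetryLoopState());
    }
}
```

        A thread in this state consumes essentially no CPU, so a process stuck this way looks idle rather than busy.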
        
        Todd Lipcon added a comment -

        You sure this is actually hanging and not just really slow? I ran testBlockFixDist here and it passed, but took 115 seconds. Given there are 14 test cases in TestBlockFixer, maybe the test runs for 25-30 minutes?

        Nigel Daley added a comment -

        Todd, the test has been hung there for over 6 hours.

        FWIW, here's the output of top:

        top - 19:11:36 up 384 days, 10:55,  2 users,  load average: 0.00, 0.00, 0.00
        Tasks: 166 total,   1 running, 165 sleeping,   0 stopped,   0 zombie
        Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
        Mem:  16257588k total,  9534092k used,  6723496k free,  3038084k buffers
        Swap: 24000984k total,        0k used, 24000984k free,  5331700k cached
        
          PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                        
        27308 hudson    18  -2 1261m 222m  23m S    0  1.4   0:57.10 java                                                                                                                            
        28926 root      18  -2 19116 1368  992 R    0  0.0   0:00.03 top                                                                                                                             
            1 root      20   0  4028  732  592 S    0  0.0   0:18.61 init                                                                                                                            
            2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd                                                                                                                        
            3 root      RT  -5     0    0    0 S    0  0.0   0:00.96 migration/0                                                                                                                     
        ...
        
        Ramkumar Vadali added a comment -

        I think this has something to do with the MR ports change. Is it possible that Hudson does not run `ant clean`? This test has not changed recently, but MiniMRCluster has.

        Konstantin Shvachko added a comment -

        Looks like it is really hanging.
        https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/18/console
        It's been 8.5 hours since Build/18 started, with no progress. It is just hanging after:

        [exec]     [junit] Running org.apache.hadoop.raid.TestBlockFixer
        

        Why is it not getting timed out by Hudson? Should we turn it off?

        Ramkumar Vadali added a comment -

        Please turn it off while I figure out what's happening.

        Nigel Daley added a comment -

        Ramkumar, any update on this? AFAICT, there is no way to turn it off in the precommit testing without committing a change to the script and build.xml of the project.

        Ramkumar Vadali added a comment -

        Update:

        If I run `ant clean` from the top level and then run `ant test -Dtestcase=TestBlockFixer`, it runs fine.
        But if I run `ant test-patch` from the top level and then run the test again, it gets stuck. I ran with test.output=yes to see what was going on, and found this:

            [junit] 11/01/27 09:21:24 INFO mapred.TaskTracker: TaskTracker up at: localhost.localdomain/127.0.0.1:50197
            [junit] 11/01/27 09:21:24 INFO mapred.TaskTracker: Starting tracker tracker_host0.foo.com:localhost.localdomain/127.0.0.1:50197
            [junit] 11/01/27 09:21:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 0 time(s).
            [junit] 11/01/27 09:21:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 1 time(s).
            [junit] 11/01/27 09:21:27 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 2 time(s).
            [junit] 11/01/27 09:21:28 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 3 time(s).
            [junit] 11/01/27 09:21:29 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 4 time(s).
            [junit] 11/01/27 09:21:30 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 5 time(s).
            [junit] 11/01/27 09:21:31 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 6 time(s).
            [junit] 11/01/27 09:21:32 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 7 time(s).
            [junit] 11/01/27 09:21:33 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 8 time(s).
            [junit] 11/01/27 09:21:34 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 9 time(s).
            [junit] 11/01/27 09:21:34 INFO ipc.RPC: Server at localhost/127.0.0.1:0 not available yet, Zzzzz...
        

        I think Hudson does something like this, and `ant test-patch` is somehow pulling in a jar that prevents MiniMRCluster from starting. To check, I wrote a simple test that only tries to start a MiniMRCluster:

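        import junit.framework.TestCase;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.hdfs.MiniDFSCluster;
        import org.apache.hadoop.mapred.MiniMRCluster;

        // Minimal reproduction: only starts a MiniDFSCluster and a MiniMRCluster.
        public class TestStuckMiniMR extends TestCase {
          public static final int NUM_DATANODES = 3;
          Configuration conf;
          String namenode = null;
          MiniDFSCluster dfs = null;
          MiniMRCluster mr = null;
          String jobTrackerName = null;
          FileSystem fileSys = null;

          protected void setUp() throws Exception {
            conf = new Configuration();

            dfs = new MiniDFSCluster(conf, NUM_DATANODES, true, null);
            dfs.waitActive();
            fileSys = dfs.getFileSystem();
            namenode = fileSys.getUri().toString();

            FileSystem.setDefaultUri(conf, namenode);
            // The hang occurs inside this constructor, while the TaskTracker starts.
            mr = new MiniMRCluster(4, namenode, 3);
            jobTrackerName = "localhost:" + mr.getJobTrackerPort();
          }

          protected void tearDown() {
            // Guard against a partially completed setUp(); stop MR before DFS.
            if (mr != null) { mr.shutdown(); }
            if (dfs != null) { dfs.shutdown(); }
          }

          public void testStuck() throws Exception {
            System.out.println("Done");
          }
        }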
        

        This also gets stuck in setUp(), so I think the problem is outside RAID. In fact, just after I tried this, I ran a test under contrib/streaming, and it gets stuck the same way.

        ant test -Dtestcase=TestFileArgs -Dtest.output=yes
        

        The output:

            [junit] 11/01/27 09:42:10 INFO mapred.TaskTracker: TaskTracker up at: localhost.localdomain/127.0.0.1:59339
            [junit] 11/01/27 09:42:10 INFO mapred.TaskTracker: Starting tracker tracker_host0.foo.com:localhost.localdomain/127.0.0.1:59339
            [junit] 11/01/27 09:42:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 0 time(s).
            [junit] 11/01/27 09:42:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 1 time(s).
            [junit] 11/01/27 09:42:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 2 time(s).
        

        Can someone try killing TestBlockFixer and running TestFileArgs on the machine that's running Hudson?
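
        Both logs above show every retry targeting localhost/127.0.0.1:0. Port 0 is only meaningful when binding, where it asks the operating system to pick a free port; it is not an address a client can connect to. That suggests, without confirming it, that the client is being handed the requested bind-time port instead of the port the server actually received. A minimal Java sketch of the distinction (the class and method names are hypothetical and are not part of Hadoop):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

public class PortZeroDemo {
    /** Binds to port 0 on the loopback address and returns the port the OS actually assigned. */
    public static int bindEphemeral() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            return server.getLocalPort();
        } catch (IOException e) {
            throw new IllegalStateException("could not bind to an ephemeral port", e);
        }
    }

    public static void main(String[] args) {
        int requested = 0;
        int actual = bindEphemeral();
        // Clients must be given the actual port; the requested value 0 is not connectable.
        System.out.println("requested port " + requested + ", bound to port " + actual);
    }
}
```

        A client that is told to connect to the requested value rather than the value returned by getLocalPort() would keep failing in the way the retry messages show.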

        Ramkumar Vadali added a comment -

        The attached patch enables a timeout for the RAID tests. It does not fix the MiniMRCluster problem, though. With it applied, the stuck test fails with a timeout instead of hanging:

        test-junit:
            [junit] WARNING: multiple versions of ant detected in path for junit
            [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
            [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
            [junit] Running org.apache.hadoop.raid.TestBlockFixer
            [junit] Running org.apache.hadoop.raid.TestBlockFixer
            [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
            [junit] Test org.apache.hadoop.raid.TestBlockFixer FAILED (timeout)
        
        BUILD FAILED
        /data/users/rvadali/apache/hadoop-mapred-trunk/build.xml:821: The following error occurred while executing this line:
        /data/users/rvadali/apache/hadoop-mapred-trunk/build.xml:805: The following error occurred while executing this line:
        /data/users/rvadali/apache/hadoop-mapred-trunk/src/contrib/build.xml:60: The following error occurred while executing this line:
        /data/users/rvadali/apache/hadoop-mapred-trunk/src/contrib/raid/build.xml:60: Tests failed!
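
        The patch itself is not reproduced in this thread. Ant's junit task does accept a timeout attribute for forked tests, but whether the patch uses it is an assumption. The effect seen in the output above, a stuck test reported as failed rather than blocking the build, can be sketched in plain Java as follows (the class and method names are hypothetical, and this in-process guard is only an analogue of a build-level timeout):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutGuard {
    /** Runs the task on a separate thread; returns "FAILED (timeout)" if it exceeds the limit. */
    public static String runWithTimeout(Callable<String> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-task");
            t.setDaemon(true); // do not keep the JVM alive for a stuck task
            return t;
        });
        Future<String> result = executor.submit(task);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // interrupt the stuck task
            return "FAILED (timeout)";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED (interrupted)";
        } catch (ExecutionException e) {
            return "FAILED (" + e.getCause() + ")";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A task that never finishes, standing in for the hung cluster start-up.
        Callable<String> hung = () -> {
            Thread.sleep(Long.MAX_VALUE);
            return "OK";
        };
        System.out.println(runWithTimeout(hung, 200));
        System.out.println(runWithTimeout(() -> "OK", 200));
    }
}
```

        As in the thread, the timeout only bounds how long the hang can last; the underlying start-up failure still has to be fixed separately.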
        
        
        Scott Chen added a comment -

        +1 OK. At least this will not hang and block Hudson.
        But someone should fix the problem caused by MiniMRCluster.

        Ramkumar Vadali added a comment -

        Jira for MiniMRCluster problem: MAPREDUCE-2285

        Scott Chen added a comment -

        I have committed the patch. At least now this will not block the Hudson build.
        But we still need to fix the MiniMRCluster-related tests.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #587 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/587/)
        MAPREDUCE-2283. Add timeout for Raid Tests (Ramkumar Vadali via schen)

        Konstantin Shvachko added a comment -

        Could you please also commit this to 0.22, and then resolve the issue?

        Scott Chen added a comment -

        I just committed this to 0.22.
        Thanks for the reminder, Konstantin.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #33 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/33/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)


          People

          • Assignee:
            Ramkumar Vadali
            Reporter:
            Nigel Daley
          • Votes:
            0
            Watchers:
            6
