Hadoop Map/Reduce
MAPREDUCE-2285

MiniMRCluster does not start after ant test-patch

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Any test using MiniMRCluster hangs in the MiniMRCluster constructor after running ant test-patch. Steps to reproduce:
      1. ant -Dpatch.file=<dummy patch to CHANGES.txt> -Dforrest.home=<path to forrest> -Dfindbugs.home=<path to findbugs> -Dscratch.dir=/tmp/testpatch -Djava5.home=<path to java5> test-patch
      2. Run any test that creates MiniMRCluster, say ant test -Dtestcase=TestFileArgs (contrib/streaming)

      Expected result: The test succeeds.
      Actual result: The test hangs in MiniMRCluster.<init>. This does not happen if we run ant clean after ant test-patch.

      Test output:

          [junit] 11/01/27 12:11:43 INFO ipc.Server: IPC Server handler 3 on 58675: starting
          [junit] 11/01/27 12:11:43 INFO mapred.TaskTracker: TaskTracker up at: localhost.localdomain/127.0.0.1:58675
          [junit] 11/01/27 12:11:43 INFO mapred.TaskTracker: Starting tracker tracker_host0.foo.com:localhost.localdomain/127.0.0.1:58675
          [junit] 11/01/27 12:11:44 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 0 time(s).
          [junit] 11/01/27 12:11:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 1 time(s).
          [junit] 11/01/27 12:11:46 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 2 time(s).
          [junit] 11/01/27 12:11:47 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 3 time(s).
          [junit] 11/01/27 12:11:48 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 4 time(s).
          [junit] 11/01/27 12:11:49 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 5 time(s).
          [junit] 11/01/27 12:11:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 6 time(s).
          [junit] 11/01/27 12:11:51 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 7 time(s).
          [junit] 11/01/27 12:11:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 8 time(s).
          [junit] 11/01/27 12:11:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:0. Already tried 9 time(s).
          [junit] 11/01/27 12:11:53 INFO ipc.RPC: Server at localhost/127.0.0.1:0 not available yet, Zzzzz...
      

      Stack trace:

              at java.lang.Thread.sleep(Native Method)
              at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:611)
              at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:429)
              - locked <0x00007f3b8dc08700> (a org.apache.hadoop.ipc.Client$Connection)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:504)
              - locked <0x00007f3b8dc08700> (a org.apache.hadoop.ipc.Client$Connection)
              at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:206)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1164)
              at org.apache.hadoop.ipc.Client.call(Client.java:1008)
              at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
              at org.apache.hadoop.mapred.$Proxy11.getProtocolVersion(Unknown Source)
              at org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:235)
              at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:275)
              at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:206)
              at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:185)
              at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:169)
              at org.apache.hadoop.mapred.TaskTracker$2.run(TaskTracker.java:699)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1142)
              at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:695)
              - locked <0x00007f3b8ccc3870> (a org.apache.hadoop.mapred.TaskTracker)
              at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1391)
              at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.createTaskTracker(MiniMRCluster.java:219)
              at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner$1.run(MiniMRCluster.java:203)
              at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner$1.run(MiniMRCluster.java:201)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1142)
              at org.apache.hadoop.mapred.MiniMRCluster$TaskTrackerRunner.<init>(MiniMRCluster.java:201)
              at org.apache.hadoop.mapred.MiniMRCluster.startTaskTracker(MiniMRCluster.java:716)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:541)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:482)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:474)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:466)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:458)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:448)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:438)
              at org.apache.hadoop.mapred.MiniMRCluster.<init>(MiniMRCluster.java:429)
              at org.apache.hadoop.streaming.TestFileArgs.<init>(TestFileArgs.java:59)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
              at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:202)
              at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:251)
              at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
              at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:248)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
              at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
              at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
              at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
              at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
              at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
              at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
              at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
              at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
              at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
              at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
      
      
      1. cp-bad
        63 kB
        Todd Lipcon
      2. cp-good
        55 kB
        Todd Lipcon
      3. fix-build.diff
        0.5 kB
        Todd Lipcon

        Activity

        Todd Lipcon added a comment -

        I think the difference is that, after we run test-patch, build/ivy/lib/Hadoop/javadoc gets populated and ends up on the classpath of the contrib tests. Attached are the classpaths for a bad invocation (post test-patch) and a good one (post clean).

        Maybe we need to straighten out our ivy confs?
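
To see which entries differ between the two attached classpaths, a small comparison helper can be enough. The following is a minimal sketch, assuming each classpath is available as a single string of entries joined by ":" (the actual layout of the cp-bad and cp-good attachments is not shown in this issue). The function names and the sample entries are hypothetical.

```python
def classpath_entries(classpath, sep=":"):
    """Split a classpath string into its non-empty entries, preserving order."""
    return [entry.strip() for entry in classpath.split(sep) if entry.strip()]


def classpath_diff(bad, good, sep=":"):
    """Return (only_in_bad, only_in_good) as two sorted lists of entries."""
    bad_set = set(classpath_entries(bad, sep))
    good_set = set(classpath_entries(good, sep))
    return sorted(bad_set - good_set), sorted(good_set - bad_set)


if __name__ == "__main__":
    # Hypothetical sample data; the real values would come from cp-bad and cp-good.
    bad = "build/classes:lib/a.jar:build/ivy/lib/Hadoop/javadoc/extra.jar"
    good = "build/classes:lib/a.jar"
    only_bad, only_good = classpath_diff(bad, good)
    for entry in only_bad:
        print("only in bad classpath:", entry)
    for entry in only_good:
        print("only in good classpath:", entry)
```

Entries that appear only in the bad classpath are the first candidates to inspect, since they are what test-patch added.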

        Todd Lipcon added a comment -

        Maybe something like this patch? Including **/*.jar from build/ivy/lib is going to pull in all the jars from javadoc, releaseaudit, and jdiff, which apparently have conflicting dependencies that cause the JT to crash and the tests to time out.

        After removing this we might be missing some dependencies, which we'll have to add to the specific projects' ivy.xml files, but that's more correct anyhow.
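
One way to spot the kind of conflict described above is to group jar files by artifact name and flag any artifact that shows up in more than one version. This is only a sketch: it assumes jar files follow a name-version.jar naming pattern, and the function name and sample paths are hypothetical, not taken from the attached classpaths.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed naming convention: <artifact>-<version>.jar, where the version starts with a digit.
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_version_conflicts(jar_paths):
    """Map each artifact name to its sorted versions, keeping only artifacts with more than one."""
    versions = defaultdict(set)
    for path in jar_paths:
        match = JAR_PATTERN.match(PurePosixPath(path).name)
        if match:
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    jars = [
        "build/ivy/lib/Hadoop/common/junit-4.8.1.jar",
        "build/ivy/lib/Hadoop/javadoc/junit-3.8.1.jar",
        "build/ivy/lib/Hadoop/common/commons-logging-1.1.1.jar",
    ]
    for name, found in find_version_conflicts(jars).items():
        print(name, "appears in versions", found)
```

Jars whose names do not match the assumed pattern are silently skipped, so a clean result from this check does not prove the classpath is free of conflicts.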

        Ramkumar Vadali added a comment -

        The patch fixes the problem. I am no ivy expert, but it looks good to me.

        Giridharan Kesavan added a comment -

        Patch looks good.

        "After removing this we might be missing some dependencies which we'll have to add to the specific projects ivy.xml files,"

        I agree with Todd's point about adding deps to the ivy.xml.

        Nigel Daley added a comment -

        Making the patch available; I will manually run it through Hudson to test MR precommit.

        Todd Lipcon added a comment -

        This problem also causes failures in compiling mrunit's tests, e.g. this build: https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/25/
        (it failed with weird errors about assert methods not existing in junit).

        The patch I attached last week fixes it. Giri: was your "looks good" a +1? (i.e., can I commit to branch and trunk?)
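
One plausible reading of the "assert methods not existing" errors is that a different junit jar earlier on the classpath shadowed the intended one. That is an assumption, not something confirmed in this thread. The sketch below models the first-match-wins lookup that Java applies to an ordered classpath. All jar names, class names, and method names in it are hypothetical.

```python
def find_class(class_name, classpath):
    """Return (entry_name, methods) for the first entry that provides class_name, else None.

    classpath is an ordered list of (entry_name, {class_name: set_of_method_names}).
    """
    for entry_name, classes in classpath:
        if class_name in classes:
            return entry_name, classes[class_name]
    return None


def has_method(class_name, method, classpath):
    """True if the class that wins the lookup declares the given method."""
    found = find_class(class_name, classpath)
    return found is not None and method in found[1]


if __name__ == "__main__":
    # Hypothetical jars: the older one lacks a method that the newer one provides.
    older = ("older-lib.jar", {"example.Assert": {"assertSame"}})
    newer = ("newer-lib.jar", {"example.Assert": {"assertSame", "assertNewer"}})

    print(has_method("example.Assert", "assertNewer", [newer, older]))
    print(has_method("example.Assert", "assertNewer", [older, newer]))
```

With the older jar listed first, the lookup finds the class but not the newer method, which is how the same sources could compile against one classpath and fail against another.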

        Nigel Daley added a comment -

        Yes Todd, please commit to trunk and 0.22. Thanks!

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk-Commit #607 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/607/)
        MAPREDUCE-2285. MiniMRCluster does not start after ant test-patch. Contributed by Todd Lipcon.

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-22-branch #33 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/33/)

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)


          People

          • Assignee:
            Todd Lipcon
            Reporter:
            Ramkumar Vadali
          • Votes:
            0
            Watchers:
            5
