Hadoop Common / HADOOP-5172

Chukwa : TestAgentConfig.testInitAdaptors_vs_Checkpoint regularly fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint regularly fails in Hudson builds. I am not sure which branches it affects. I will attach one of the failure logs.

      1. HADOOP-5172.patch
        0.5 kB
        Jerome Boulon

          Activity

          Transition                    Time In Source Status   Execution Times   Last Executer   Last Execution Date
          Open -> Patch Available       6d 17h 49m              1                 Jerome Boulon   12/Feb/09 00:49
          Patch Available -> Resolved   23h 55m                 1                 Nigel Daley     13/Feb/09 00:44
          Resolved -> Closed            557d 19h 50m            1                 Tom White       24/Aug/10 21:35
          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s contrib/chukwa [ 12312445 ]
          Hudson added a comment - Integrated in Hadoop-trunk #756 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/756/ )
          Nigel Daley made changes -
          Resolution Fixed [ 1 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.21.0 [ 12313563 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Nigel Daley added a comment -

          I just committed this. Thanks Jerome!

          Jerome Boulon made changes -
          Link This issue relates to HADOOP-5243 [ HADOOP-5243 ]
          Jerome Boulon added a comment -

          BTW, we can set the port to 0 and let the system decide.
          That way we don't have to provide a random port, which can still fail.
          This could be done in several ways, but an easy one is to access the AgentControlSocketListener from ChukwaAgent (field.setAccessible(true); from the test case).

          But before that: +1 to finding root cause of failures.
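
The port-0 suggestion above can be sketched with the standard java.net.ServerSocket class. This is only an illustration of the technique, not Chukwa code: the class and method names are hypothetical, and how ChukwaAgent or AgentControlSocketListener would actually receive the port is not shown in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Sketch only: bind to port 0 so the OS assigns a free port. Hypothetical names, not Chukwa code. */
final class EphemeralPortSketch {

    /** Opens a listener on an OS-assigned port; the caller is responsible for closing it. */
    static ServerSocket openOnAnyFreePort() {
        try {
            // Port 0 asks the operating system to pick any currently free port.
            return new ServerSocket(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the listener, converting the checked exception for brevity. */
    static void closeQuietly(ServerSocket socket) {
        try {
            socket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ServerSocket listener = openOnAnyFreePort();
        // getLocalPort() reports the port that was actually assigned.
        System.out.println("Listener bound to port " + listener.getLocalPort());
        closeQuietly(listener);
    }
}
```

A test would then read the assigned port back from the listener instead of assuming a fixed number, which is why access to the listener object is needed.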

          Mac Yang added a comment -

          Hi Nigel,

          We will continue to work on a fix for this. But in the meantime, could you
          commit this patch so the unit tests can run cleanly?

          Thanks,
          Mac

          Ari Rabkin added a comment -

          Failure appears unrelated to patch.

          Also, definitely +1 to finding root cause of failures.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12400067/HADOOP-5172.patch
          against trunk revision 743513.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/console

          This message is automatically generated.

          Jerome Boulon added a comment -

          I agree that we can set the port using "chukwaAgent.agent.control.port", but before doing that I would like to identify the real problem, since we start and stop the agent in almost all JUnit tests without any issue.

          Ari Rabkin added a comment -

          We could pass in a custom Configuration, and specify a new portno, including a random one.
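
A rough sketch of that idea follows. It uses java.util.Properties as a stand-in for Hadoop's Configuration so the example stays self-contained. The key is the one quoted elsewhere in this thread, while the class name, method names, and port range are assumptions made for illustration.

```java
import java.util.Properties;
import java.util.concurrent.ThreadLocalRandom;

/** Sketch only: put a randomly chosen control port into a per-test configuration. */
final class RandomPortConfigSketch {

    /** Property name as quoted in this thread. */
    static final String CONTROL_PORT_KEY = "chukwaAgent.agent.control.port";

    // Assumed range: the IANA dynamic/private port range.
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    /** Picks a port uniformly from [MIN_PORT, MAX_PORT]. */
    static int randomPort() {
        return ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
    }

    /** Builds a fresh configuration (Properties here, as a stand-in) with a random port. */
    static Properties configWithRandomPort() {
        Properties conf = new Properties();
        conf.setProperty(CONTROL_PORT_KEY, Integer.toString(randomPort()));
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = configWithRandomPort();
        System.out.println(CONTROL_PORT_KEY + " = " + conf.getProperty(CONTROL_PORT_KEY));
    }
}
```

A random port can still collide with one that is already in use, so a test using this approach would probably need to retry on a bind failure.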

          Jerome Boulon added a comment - - edited

          The JUnit test fails because the port is certainly already (or still) open.
          However, it's not clear why only one test case is failing, since we do the same kind of operation in almost all test cases.
          Since this happens only on Solaris, it may be caused by a delay before the same port can actually be reused, but to validate this I first need to be able to reproduce the problem.

          Also we don't have any code in place to randomize the port but that could be done.
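
The "port is already/still open" condition can be illustrated with plain java.net sockets. This sketch only shows that a second bind fails while a listener holds the port. It does not reproduce the Solaris-specific delay suspected above, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Sketch only: a port cannot be bound a second time while a listener still holds it. */
final class PortInUseSketch {

    /** Returns true if a listener can be bound to the given port right now. */
    static boolean canBind(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a java.net.BindException ("Address already in use").
            return false;
        }
    }

    /** Returns {bindable while the first listener is open, bindable after it is closed}. */
    static boolean[] demo() {
        try {
            ServerSocket first = new ServerSocket(0); // OS-assigned free port
            int port = first.getLocalPort();
            boolean whileOpen = canBind(port);
            first.close();
            boolean afterClose = canBind(port);
            return new boolean[] { whileOpen, afterClose };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("bindable while first listener is open: " + result[0]);
        System.out.println("bindable after first listener is closed: " + result[1]);
    }
}
```

If the first agent's listener has not been fully released when the test restarts the agent, the second bind would fail in the same way, which is consistent with the AlreadyRunningException in the attached log.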

          Nigel Daley added a comment -

          Jerome, I think you suspect this is due to the port being held on Solaris. Can you randomize the port you use for each test? This is what Hadoop does.

          Jerome Boulon made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Jerome Boulon added a comment -

          Exclude TestAgentConfig.java for now, since this test case is failing on Solaris.

          Jerome Boulon made changes -
          Attachment HADOOP-5172.patch [ 12400067 ]
          Jerome Boulon made changes -
          Field Original Value New Value
          Assignee Jerome Boulon [ jboulon ]
          Raghu Angadi added a comment -

          Log for one of the failures:

          Error Message
          ============
          
          org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          
          Stacktrace
          ========
          
          junit.framework.AssertionFailedError: org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          	at org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint(TestAgentConfig.java:73)
          
          Standard Output
          
          ---------------------done with first run, now stopping
          ---------------------restarting
          console connector started
          
          Standard Error
          ============
          
          log4j:WARN No appenders could be found for logger (org.apache.hadoop.conf.Configuration).
          log4j:WARN Please initialize the log4j system properly.
          org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          	at org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent.<init>(ChukwaAgent.java:257)
          	at org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint(TestAgentConfig.java:61)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at junit.framework.TestCase.runTest(TestCase.java:154)
          	at junit.framework.TestCase.runBare(TestCase.java:127)
          	at junit.framework.TestResult$1.protect(TestResult.java:106)
          	at junit.framework.TestResult.runProtected(TestResult.java:124)
          	at junit.framework.TestResult.run(TestResult.java:109)
          	at junit.framework.TestCase.run(TestCase.java:118)
          	at junit.framework.TestSuite.runTest(TestSuite.java:208)
          	at junit.framework.TestSuite.run(TestSuite.java:203)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)
          
          Raghu Angadi created issue -

            People

            • Assignee: Jerome Boulon
            • Reporter: Raghu Angadi
            • Votes: 0
            • Watchers: 4
