Hadoop Common / HADOOP-5172

Chukwa : TestAgentConfig.testInitAdaptors_vs_Checkpoint regularly fails

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint regularly fails in Hudson builds. I am not sure which branches it affects. I will attach one of the failure logs.

      1. HADOOP-5172.patch
        0.5 kB
        Jerome Boulon

          Activity

          Hudson added a comment -

          Integrated in Hadoop-trunk #756 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/756/ )

          Nigel Daley added a comment -

          I just committed this. Thanks Jerome!

          Jerome Boulon added a comment -

          BTW, we can set the port to 0 and let the system decide.
          That way we don't have to provide a random port, which can still fail.
          This could be done in several ways but an easy one is to access the AgentControlSocketListener from ChukwaAgent ( field.setAccessible(true); from the test case).

          But before that: +1 to finding root cause of failures.
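
The port-0 idea above can be sketched as follows. This is a minimal illustration, not the real Chukwa code: `SketchAgent` and its `controlSocket` field are hypothetical stand-ins for ChukwaAgent and its AgentControlSocketListener, and the only library behaviour relied on is that `new ServerSocket(0)` lets the operating system choose a free port. The test-side helper reads the private field reflectively, in the spirit of the `field.setAccessible(true)` suggestion.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import java.net.ServerSocket;

// Hypothetical stand-in for an agent that owns a private control socket.
// This is NOT the real ChukwaAgent; all names here are illustrative.
class SketchAgent implements AutoCloseable {
    private final ServerSocket controlSocket;

    SketchAgent(int requestedPort) throws IOException {
        // Port 0 asks the operating system to pick any free port.
        this.controlSocket = new ServerSocket(requestedPort);
    }

    @Override
    public void close() throws IOException {
        controlSocket.close();
    }
}

class EphemeralPortSketch {
    // Test-side helper: reads the private field via reflection and
    // returns the port the operating system actually assigned.
    static int boundPort(SketchAgent agent) throws ReflectiveOperationException {
        Field field = SketchAgent.class.getDeclaredField("controlSocket");
        field.setAccessible(true);
        ServerSocket socket = (ServerSocket) field.get(agent);
        return socket.getLocalPort();
    }

    public static void main(String[] args) throws Exception {
        // Two agents alive at the same time never collide on a port.
        try (SketchAgent first = new SketchAgent(0);
             SketchAgent second = new SketchAgent(0)) {
            System.out.println("first agent port:  " + boundPort(first));
            System.out.println("second agent port: " + boundPort(second));
        }
    }
}
```

Because the port is assigned at bind time, there is no window between choosing a port and binding it, which is the weakness of picking a random port up front.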

          Mac Yang added a comment -

          Hi Nigel,

          We will continue to work on a fix for this. But in the meantime, could you
          commit this patch so the unit tests can run cleanly?

          Thanks,
          Mac

          Ari Rabkin added a comment -

          Failure appears unrelated to patch.

          Also, definitely +1 to finding root cause of failures.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12400067/HADOOP-5172.patch
          against trunk revision 743513.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3837/console

          This message is automatically generated.

          Jerome Boulon added a comment -

          I agree that we can set the port using "chukwaAgent.agent.control.port", but before doing that I would like to identify the real problem, since almost all of the JUnit tests start and stop the agent without any issue.

          Ari Rabkin added a comment -

          We could pass in a custom Configuration, and specify a new portno, including a random one.
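
A rough sketch of the random-port approach, under stated assumptions: `java.util.Properties` stands in for Hadoop's `Configuration` so the example runs without Hadoop on the classpath, and the helper name `pickRandomFreePort` is hypothetical. The key "chukwaAgent.agent.control.port" is the one quoted elsewhere in this thread; how the agent reads it is not shown here.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.Properties;
import java.util.Random;

class RandomPortConfigSketch {
    // Key quoted in the discussion; everything else here is illustrative.
    static final String PORT_KEY = "chukwaAgent.agent.control.port";

    // Probes random ports in the range 49152-65535 and returns the first
    // one that could be bound. The probe socket is closed before returning,
    // so another process could still grab the port afterwards.
    static int pickRandomFreePort(Random random, int maxAttempts) throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int candidate = 49152 + random.nextInt(65536 - 49152);
            try (ServerSocket probe = new ServerSocket(candidate)) {
                return candidate; // free at the moment of the probe
            } catch (BindException inUse) {
                // Port is taken; try another candidate.
            }
        }
        throw new IOException("no free port found after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties(); // stand-in for a custom Configuration
        int port = pickRandomFreePort(new Random(), 20);
        conf.setProperty(PORT_KEY, Integer.toString(port));
        System.out.println(PORT_KEY + " = " + conf.getProperty(PORT_KEY));
    }
}
```

The gap between probing and the agent's own bind is why a randomly chosen port can still fail occasionally.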

          Jerome Boulon added a comment - - edited

          The JUnit test certainly fails because the port is already (or still) open.
          However, it's not clear why only one test case fails, since almost all test cases perform the same kind of operation.
          Since this happens only on Solaris, it may be caused by a delay before the same port can be reused, but to validate this I first need to be able to reproduce the problem.

          Also, we don't have any code in place to randomize the port, but that could be done.
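
If the cause really is a delay before the port can be reused (this thread does not confirm it), one standard socket-level option is SO_REUSEADDR. The sketch below is an illustration of that option only, not something proposed in this issue: the class and method names are hypothetical, and the exact effect of SO_REUSEADDR varies by operating system.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ReuseAddressSketch {
    // Opens a listener with SO_REUSEADDR enabled. The option has to be
    // set on an unbound socket, before bind(), to have a defined effect.
    static ServerSocket listen(int port) throws IOException {
        ServerSocket socket = new ServerSocket(); // created unbound
        try {
            socket.setReuseAddress(true);
            socket.bind(new InetSocketAddress(port));
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the socket if bind fails
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        int port;
        // First "run": let the OS choose a port, then stop.
        try (ServerSocket first = listen(0)) {
            port = first.getLocalPort();
        }
        // "Restart": bind the same port again right away.
        try (ServerSocket second = listen(port)) {
            System.out.println("rebound to port " + second.getLocalPort());
        }
    }
}
```

This mirrors the stop/restart sequence in the failing test, but it does not by itself show what differs on Solaris.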

          Nigel Daley added a comment -

          Jerome, I think you suspect this is due to the port being held on Solaris. Can you randomize the port you use for each test? This is what Hadoop does.

          Jerome Boulon added a comment -

          Exclude TestAgentConfig.java for now since this test case is failing on Solaris.

          Raghu Angadi added a comment -

          Log for one of the failures :

          Error Message
          ============
          
          org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          
          Stacktrace
          ========
          
          junit.framework.AssertionFailedError: org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          	at org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint(TestAgentConfig.java:73)
          
          Standard Output
          
          ---------------------done with first run, now stopping
          ---------------------restarting
          console connector started
          
          Standard Error
          ============
          
          log4j:WARN No appenders could be found for logger (org.apache.hadoop.conf.Configuration).
          log4j:WARN Please initialize the log4j system properly.
          org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent$AlreadyRunningException: Agent already running; aborting
          	at org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent.<init>(ChukwaAgent.java:257)
          	at org.apache.hadoop.chukwa.datacollection.agent.TestAgentConfig.testInitAdaptors_vs_Checkpoint(TestAgentConfig.java:61)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at junit.framework.TestCase.runTest(TestCase.java:154)
          	at junit.framework.TestCase.runBare(TestCase.java:127)
          	at junit.framework.TestResult$1.protect(TestResult.java:106)
          	at junit.framework.TestResult.runProtected(TestResult.java:124)
          	at junit.framework.TestResult.run(TestResult.java:109)
          	at junit.framework.TestCase.run(TestCase.java:118)
          	at junit.framework.TestSuite.runTest(TestSuite.java:208)
          	at junit.framework.TestSuite.run(TestSuite.java:203)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
          	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)
          
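The failure mode in the log can be imitated in isolation, assuming (as suggested elsewhere in this thread) that the exception comes from the control port still being bound. In this sketch the nested `AlreadyRunningException` and the `startControlListener` helper are hypothetical; they only borrow the name and message from the log and are not the real ChukwaAgent code.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

class PortConflictSketch {
    // Hypothetical exception, named after the one in the log for readability.
    static class AlreadyRunningException extends Exception {
        AlreadyRunningException(Throwable cause) {
            super("Agent already running; aborting", cause);
        }
    }

    // Assumed behaviour: a failed bind is reported as "already running".
    static ServerSocket startControlListener(int port)
            throws AlreadyRunningException, IOException {
        try {
            return new ServerSocket(port);
        } catch (BindException e) {
            throw new AlreadyRunningException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket first = startControlListener(0)) {
            // Starting a second listener while the first still holds the port fails.
            try (ServerSocket second = startControlListener(first.getLocalPort())) {
                System.out.println("unexpected: second listener started");
            } catch (AlreadyRunningException e) {
                System.out.println("second start failed: " + e.getMessage());
            }
        }
    }
}
```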

            People

            • Assignee: Jerome Boulon
            • Reporter: Raghu Angadi
            • Votes: 0
            • Watchers: 4
