Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.94.4
    • Component/s: None
    • Labels:
      None

      Description

      See discussion at the end of HBASE-5778.
      TestReplication failed in all recent 0.94 jenkins builds.

      1. 7417-0.94.txt
        5 kB
        Lars Hofhansl
      2. 7417-0.96.txt
        5 kB
        Lars Hofhansl
      3. 7417-test.txt
        1 kB
        Lars Hofhansl
      4. 7417-test-v2.txt
        4 kB
        Lars Hofhansl

        Activity

        Hudson added a comment -

        Integrated in HBase-0.94-security-on-Hadoop-23 #10 (See https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/10/)
        HBASE-7417 Test patch, hopefully fixes TestReplication (Revision 1425251)

        Result = FAILURE
        larsh :
        Files :

        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Lars Hofhansl added a comment -

        Closing this. Having branched 0.94 at a point shortly after 0.94.3 and run it on a separate Jenkins build, I find that I cannot get a stable build there either, so the recent issues are caused by ... no f*cking idea. Maybe some Jenkins environment issues.

        Hudson added a comment -

        Integrated in HBase-0.94-security #88 (See https://builds.apache.org/job/HBase-0.94-security/88/)
        HBASE-7417 Test patch, hopefully fixes TestReplication (Revision 1425251)

        Result = SUCCESS
        larsh :
        Files :

        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        stack added a comment -

        +1 on reverting till stable again.

        Lars Hofhansl added a comment -

        So that fixed the weird SplitLogManager messages, but the truncate still timed out.
        That part is actually scary too, because it means the deleted rows were not replicated in a reasonable time. That could be a test problem, or it could indicate that something is wrong with replication.

        I know you prefer #2 and #3, Stack. But I am beginning to feel that we should roll back these two changes from 0.94 to keep 0.94 stable. Maybe even more changes need to be reverted.

        Before Dec 12th or 13th or so, we had close to an 80% pass rate for the 0.94 tests. Something happened then; maybe it's the IPv4 change? Or the upgrade to ZK 3.4.5? It's very hard to pinpoint this locally.

        Hudson added a comment -

        Integrated in HBase-0.94 #656 (See https://builds.apache.org/job/HBase-0.94/656/)
        HBASE-7417 Test patch, hopefully fixes TestReplication (Revision 1425251)

        Result = FAILURE
        larsh :
        Files :

        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12562213/7417-test-v2.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3667//console

        This message is automatically generated.

        Lars Hofhansl added a comment - - edited

        This one makes the cleanup of the FS optional. Because the cleanup confuses TestReplication, that test can opt out of it.

        This is for #3. I think I should apply this first, to see what happens.

        In my tests I saw two different failures:

        1. Some parts in the test just took a bit too long.
        2. That weird message from the SplitLogManager, which prevents loadTable (called from queueFailover) from finishing

        #1 happens fairly rarely and was there before, I assume.
        #2 seems to go away with this patch.
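
The opt-out described at the top of this comment might be sketched as follows. This is only an illustration of the idea, not the contents of the attached patch: `CleanupOptOutSketch`, `setCleanFsBetweenTests` and the in-memory list standing in for the filesystem are all hypothetical names, and none of them are taken from HBaseTestingUtility.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared test utility; NOT the HBaseTestingUtility API.
class CleanupOptOutSketch {
    // In-memory list standing in for files the tests leave on the filesystem.
    private final List<String> fsEntries = new ArrayList<>();

    // Cleanup stays on by default, so existing tests keep their behaviour.
    private boolean cleanFsBetweenTests = true;

    // A test that is confused by the cleanup calls this with false to opt out.
    void setCleanFsBetweenTests(boolean enabled) {
        this.cleanFsBetweenTests = enabled;
    }

    void createEntry(String path) {
        fsEntries.add(path);
    }

    // Invoked between test methods; wipes the entries unless the test opted out.
    void cleanupBetweenTests() {
        if (cleanFsBetweenTests) {
            fsEntries.clear();
        }
    }

    int entryCount() {
        return fsEntries.size();
    }
}
```

Keeping the default at "clean" means only the test that needs the old behaviour has to change, which matches the intent stated above of letting TestReplication opt out while everything else is unaffected.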

        stack added a comment -

        In order:

        Do 2.

        Do 2. and 3.

        Lars Hofhansl added a comment -

        I have spent a significant amount of time on this now, and I have no more time to spend on it.
        It seems we have three options:

        1. Roll back both HBASE-7283 and HBASE-5778
        2. Commit the first patch I attached (the one that starts/stops a cluster before/after each test). This adds about 1-2 mins to the test run
        3. Commit the -test patch with an addition to make TestTableDeleteFamilyHandler pass

        Comments?

        Lars Hofhansl added a comment -

        I think we'll get good mileage if we apply the -test patch and make TestTableDeleteFamilyHandler pass (which is not too hard).
        (I would like to understand why this is, though. The changes basically look like they rip a region's storage directory out from under the regionserver, but in this case it should not cause a problem.)

        Andrew Purtell added a comment - - edited

        No, not 100% over 10 reps with just the -test patch applied, nor with only HBASE-7283 reverted; I have to revert both HBASE-7283 and HBASE-5778. (Last night I was testing with a 0.94 checked out just prior to your reapplication of HBASE-5778.) Perhaps we should accept the current state of the 0.94 branch and debug the failure you are seeing as a new problem.

        Lars Hofhansl added a comment -

        Yep, pretty reliably I get queueFailover to fail and see the SplitLogManager messages. When I apply the -test patch that has not happened so far.

        So I would like to work out how to make TestTableDeleteFamilyHandler pass, and then commit the -test patch. Please let me know if you have objections. (Maybe this is not the full story, but it is definitely better)

        Lars Hofhansl added a comment -

        Two observations:

        1. I see the above SplitLogManager messages only during failed runs
        2. I'm having good luck with the -test patch
        Lars Hofhansl added a comment -

        Andrew Purtell And with HBASE-7283 backed out you get 100% passing rate?

        Andrew Purtell added a comment -

        No the -test change wasn't enough. Failed on run 8.

        Lars Hofhansl added a comment -

        I see interesting logs during a failed run:
        2012-12-21 12:55:37,103 DEBUG [localhost,60156,1356123125892.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor(970): total tasks = 24 unassigned = 24

        Apparently never making any progress (this is with an unchanged 0.94 checkout).

        Ted Yu added a comment -

        Looks like the test runs better with -test patch:

        Failed tests:   queueFailover(org.apache.hadoop.hbase.replication.TestReplication): Waited too much time for queueFailover replication. Waited 42980ms.
        ...
        TestReplication failed, iteration: 3
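
A "waited too much time" failure like the one above is the shape of error a bounded polling loop produces: the test keeps checking a condition and gives up once a deadline passes. The sketch below shows that general pattern only. It assumes nothing about how TestReplication is actually written; `WaitForSketch` and `waitFor` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

// Generic bounded-polling helper; a sketch of the pattern, not code from TestReplication.
class WaitForSketch {
    // Polls `condition` every `intervalMs` until it holds or `timeoutMs` has elapsed.
    // Returns true if the condition held in time, false on timeout or interruption.
    static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // caller decides how to report the timeout
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

With a loop like this, a timeout cannot by itself distinguish "replication is broken" from "replication is slow on this machine", which is why the same failure can be read either as a test problem or as a real bug.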
        
        Andrew Purtell added a comment -

        Unfortunately removing that part causes TestTableDeleteFamilyHandler to fail.

        Sure, but if this works we've at least identified the test changes for HBASE-7283 aren't fully baked.

        Lars Hofhansl added a comment -

        Unfortunately removing that part causes TestTableDeleteFamilyHandler to fail.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12562129/7417-test.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3658//console

        This message is automatically generated.

        Andrew Purtell added a comment -

        Trying the -test patch locally.

        Lars Hofhansl added a comment -

        Going to apply the -test patch.

        Lars Hofhansl added a comment -

        Hah, we came to the same conclusion at the same time

        Lars Hofhansl added a comment -

        Might be worth applying this.

        Lars Hofhansl added a comment -

        Ok... Here's a guess. HBASE-7283 has a (what appears to be superfluous) change in HBaseTestingUtility.createMultiRegions.

        Andrew Purtell added a comment -

        I wonder if backing out just the changes to HBaseTestingUtility would "fix".

        Lars Hofhansl added a comment -

        Still fails sometimes locally... But it looks like that is indeed the culprit.
        Looking at HBASE-7283, though, I can't see how it can possibly cause this problem.

        OK. So now the question is: Did HBASE-7283 introduce a bug? Or is it just causing a weird interaction in TestReplication?

        Lars Hofhansl added a comment -

        Trying locally with HBASE-7283 reverted.

        Lars Hofhansl added a comment -

        OK... Before I commit this test change here, let's remove that backport and see what that does.
        Thanks Andrew Purtell!!

        stack added a comment -

        +1 on purging that backport if it is causing the failure. Nice diving, Andrew.

        Andrew Purtell added a comment -

        +1 on doing what's necessary to get a green build before moving this out to hbase-it

        Andrew Purtell added a comment - - edited

        Carrying over from the tail of HBASE-5778, bisecting finished, this is what I get with checking the commits with (up to) 10 repetitions of TestReplication:

        91ca402 is the first bad commit
        91ca402...
            HBASE-7283 Backport HBASE-6564 + HBASE-7202 to 0.94
            git-svn-id: https://svn.apache.org/repos/asf/hbase/branches/0.94@1423774
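
When bisecting on a flaky test, the number of repetitions per commit determines how often a bad commit is mistaken for a good one. The sketch below works through that arithmetic under the assumption that runs are independent and that a bad commit still passes each run with some fixed probability. The 0.8 used in the note after the code is purely illustrative and is not a measured rate for TestReplication; `FlakyBisectSketch` and its methods are hypothetical names.

```java
// Arithmetic behind "N repetitions per commit" when bisecting a flaky test.
// Assumes independent runs with a fixed per-run pass probability on a bad commit.
class FlakyBisectSketch {
    // Chance that a bad commit passes all `reps` runs and is wrongly marked good.
    static double chanceAllPass(double passRate, int reps) {
        if (passRate < 0.0 || passRate > 1.0 || reps < 1) {
            throw new IllegalArgumentException("passRate must be in [0,1] and reps >= 1");
        }
        return Math.pow(passRate, reps);
    }

    // Smallest repetition count that keeps the mislabel chance at or below maxMissChance.
    static int repsNeeded(double passRate, double maxMissChance) {
        if (passRate < 0.0 || passRate >= 1.0 || maxMissChance <= 0.0) {
            throw new IllegalArgumentException("need 0 <= passRate < 1 and maxMissChance > 0");
        }
        int reps = 1;
        while (chanceAllPass(passRate, reps) > maxMissChance) {
            reps++;
        }
        return reps;
    }
}
```

For example, if a bad commit hypothetically passed 80% of individual runs, `chanceAllPass(0.8, 10)` is about 0.107, so roughly one bisect step in nine would be mislabelled at 10 repetitions; `repsNeeded(0.8, 0.05)` gives 14.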
        
        stack added a comment -

        +1 on commit. +1 too on moving this out to hbase-it because it runs too long.

        Lars Hofhansl added a comment -

        The test now runs for over 8 mins in the latest HadoopQA run, but it passed. I did not see TestReplicationWithCompression.

        Nicolas Liochon added a comment -

        A slow non-flaky test is better than a fast flaky test.
        But the problem with launching a cluster in each test method is that the test methods then tend to grow to include multiple small tests. TestMasterFailover, for example, is very difficult to understand because of this: the test methods are 100 lines each. And trying to fix it later is difficult. I tried and finally failed on TestMasterFailover.

        So in the general case, I think the best pattern is a single cluster start/stop per test class; this makes having small and clear test methods natural.

        Again, I'm quite happy to learn that TestReplication is now non-flaky; that's lovely progress. It's just that for new tests we should not do that.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12562056/7417-0.96.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        -1 core zombie tests. There are zombie tests. See build logs for details.

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3651//console

        This message is automatically generated.

        stack added a comment -

        It's long. This is probably the longest-running test, it or the MR ones. I still think it should be an IT test (spinning up two clusters, etc., etc.), but this could do till we move it over.

        Lars Hofhansl added a comment -

        Let's get a HadoopQA run

        Lars Hofhansl added a comment -

        And here's the 0.96 patch.
        For some reason the mini cluster was started with 3 RSs in 0.96 but only 2 in 0.94.
        Changed it to 2 in 0.96 as well.
        Test runs for 420s on my machine.

        Lars Hofhansl added a comment -

        Here's a 0.94 patch

        Lars Hofhansl added a comment -

        TestReplication for 0.94 runs about 7m on my laptop. Is that too long for a single test?
        There seems to be some other issue in trunk... Looking.

        stack added a comment -

        +1 on breaking it up then. Would it be too much to do a separate test class, with all the setup, per test in the current class? (We have parallel building, so it may not add much to overall time.)

        Lars Hofhansl added a comment -

        Jean-Daniel Cryans, Andrew Purtell: FYI

        Lars Hofhansl added a comment -

        After looking at the test, it seems a bunch of its instability comes from all the cleaning-up-between-tests hoo-ha. This was done to avoid starting/stopping a cluster for each test.
        Since then, a bunch of work went into making cluster start/stop faster.

        If I change that to start/stop the minicluster before/after each test, I do get a test that runs longer but which doesn't fail (at least locally).
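
The difference between the two lifecycles can be shown with a toy model. It assumes a cluster is nothing more than a bag of leftover rows; `ClusterLifecycleSketch`, `FakeCluster` and `TestBody` are hypothetical stand-ins and do not use the HBase mini cluster or JUnit. The sketch records how many leftover rows each test finds when it starts.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of per-class versus per-test cluster lifecycle; not HBase or JUnit code.
class ClusterLifecycleSketch {
    // Hypothetical stand-in for a mini cluster: it only remembers rows written to it.
    static class FakeCluster {
        final List<String> rows = new ArrayList<>();
    }

    // A test body that runs against whichever cluster it is handed.
    interface TestBody {
        void run(FakeCluster cluster);
    }

    // Runs the tests in order and records the number of rows each one sees at start.
    static List<Integer> rowsSeenAtStart(List<TestBody> tests, boolean freshClusterPerTest) {
        List<Integer> seen = new ArrayList<>();
        FakeCluster cluster = new FakeCluster(); // started once for the whole class
        for (TestBody test : tests) {
            if (freshClusterPerTest) {
                cluster = new FakeCluster(); // models stopping and restarting the cluster
            }
            seen.add(cluster.rows.size());
            test.run(cluster);
        }
        return seen;
    }
}
```

With a shared cluster, three tests that each write one row start with 0, 1 and 2 leftover rows, so each test depends on cleanup having worked; with a fresh cluster per test, every test starts from 0. The price is one start/stop per test method instead of one per class.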


          People

          • Assignee:
            Lars Hofhansl
            Reporter:
            Lars Hofhansl
          • Votes:
            0
            Watchers:
            8
