|
[
Permalink
| « Hide
]
Dag H. Wanvik added a comment - 22/Apr/09 12:15 AM
Patch which solves the issue. The Thread.sleep is still there, meaning that in most cases, the XRE11 would be seen, which seems the right. I feel this is a bit of a kludge, so please chime in on who we should solve this brittleness..
Dag H. Wanvik made changes - 22/Apr/09 12:15 AM
Uploading a second version of this patch, we shows that there are actually
two intermediate states we could encounter before we reach the end state in step 3. I don't this this behavior is well documented if at all. The state labelled b) in the code comment is a bit murky.. The new code uses a loop to wait until the final expected end state is reached. Uncomment the printlns inside the loop to see what happens. This patch should make this instability go away, running regressions.
Dag H. Wanvik made changes - 23/Apr/09 03:29 AM
Dag H. Wanvik made changes - 23/Apr/09 03:29 AM
Dag H. Wanvik made changes - 23/Apr/09 03:30 AM
The approach with a loop looks fine to me. One small issue:
If the SQLState is REPLICATION_DB_NOT_BOOTED, gotEx will be null, so this code is unnecessarily complex and somewhat misleading: + if (gotEx != null) { + // always fails + BaseJDBCTestCase.assertSQLState( + connectionURL + " failed: ", + REPLICATION_DB_NOT_BOOTED, + gotEx); } Perhaps it should be replaced with if (gotEx != null) throw gotEx; ?
Dag H. Wanvik made changes - 24/Apr/09 03:43 PM
Dag H. Wanvik made changes - 24/Apr/09 03:43 PM
Thanks for looking at this Knut; I agree, I had done the same
simplification in another version on the waiting loop logic (new in spin #3 of this patch) when I saw your comment :) Uploading version # of this patch, which adds another instance of a a missing wait; which made the test ReplicationRun_Local_StateTest_part1_2 fail if the machine was loaded. The patch also increases wait from 0.1s to 1 seconds in two instances, which did have waiting loops already, but waitined no longer than 2 seconds in all; which was too little under load on my box. I generally make the waiting loops wait do 20 attempts, waiting 1 second between each attempt now. That should be sufficient even for a reasonably loaded machine :) I also replaced the commented out "System.err.println"s with the debug util call that's used in these tests to keep things uniform, although such calls should probably in time be replaced by the normal BaseTestCase.println (outside scope here).
Dag H. Wanvik made changes - 24/Apr/09 04:02 PM
Dag H. Wanvik made changes - 24/Apr/09 04:09 PM
Dag H. Wanvik made changes - 24/Apr/09 06:47 PM
#3 looks fine to me. +1 to commit.
Dag H. Wanvik made changes - 27/Apr/09 11:18 PM
Dag H. Wanvik made changes - 27/Apr/09 11:30 PM
Committed #3 as svn 769602, resolving. Not closing yet, may want to backport.
Dag H. Wanvik made changes - 29/Apr/09 12:47 AM
Dag H. Wanvik made changes - 14/May/09 11:54 PM
Dag H. Wanvik made changes - 11/Jun/09 12:35 PM
Backported to 10.5 branch as svn 798700, closing.
Dag H. Wanvik made changes - 28/Jul/09 08:30 PM
Dag H. Wanvik made changes - 28/Jul/09 08:31 PM
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||