Dag H. Wanvik added a comment - 10/Jul/09 06:02 PM
In view of the newly discovered DERBY-4299, I am bumping this to Urgent.
Dag H. Wanvik made changes - 10/Jul/09 06:03 PM
Dag H. Wanvik made changes - 10/Jul/09 06:04 PM
Dag H. Wanvik made changes - 10/Jul/09 06:04 PM
I am wondering if DERBY-4299 doesn't really call into question the soundness of the "freeze database" approach.
In DERBY-4299, the ASSERT fires because LogToFile.appendLogRecord does not write a log record that is later accessed during the boot phase in "SLAVE_PRE_MODE" (log records are not written during this "pre" boot, which is used for authentication, to avoid getting out of sync with the master).
Kim Haase made changes - 13/Jul/09 05:28 PM
I'm working on this, but a little more is needed than just condensing the steps. Step 5 says,
"A successful use of the startMaster=true attribute will also unfreeze the database."
Will it also start a database that's been shut down? We're now telling users to shut down the database, not just freeze it. I think the startMaster=true attribute description in the Reference Manual needs changing too. Thanks in advance.
Thanks Kim!
startMaster=true will boot the database if it's not already booted. Since we don't want to mention the freeze step anymore, we could probably skip the sentence about startMaster=true unfreezing the database. The paragraph starting with "If any unlogged operations are running" will not make much sense either if we make these changes, and can be removed (no unlogged operations can be running in a database that's not booted).
In the reference manual, I think this change must be made:
Before you specify this attribute, you must boot the database on the master system, freeze it, perform a file system copy (...)
->
(...) this attribute, you must cleanly shut down the database on the master system, perform a file system copy (...)
I don't know if we should remove the paragraph about unlogged operations from the reference manual, or if we should just make it less prominent. It is still the behaviour of startMaster=true if the database is already booted when it's invoked, so it might still make sense to mention it. Perhaps just change the first sentence in that paragraph to "If the master database is already booted and any unlogged operations are running"?
Thanks, Knut, I was wondering about those sentences. Those are great suggestions.
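For reference, the revised sequence discussed above (cleanly shut down the database on the master, copy the files, then connect with startMaster=true) could be sketched with connection URLs like the ones below. This is only an illustration: the class name, the database name "masterDB", the host "slave.example.com" and the port value are hypothetical, and the sketch only builds the URL strings. Actually connecting would require Derby on the classpath and a call such as DriverManager.getConnection(url).

```java
public class ReplicationUrls {

    // URL that requests a clean shutdown of a single database.
    static String shutdownUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";shutdown=true";
    }

    // URL that starts replication on the master. Per the discussion,
    // startMaster=true boots the database if it is not already booted.
    static String startMasterUrl(String dbName, String slaveHost, int slavePort) {
        return "jdbc:derby:" + dbName
                + ";startMaster=true"
                + ";slaveHost=" + slaveHost
                + ";slavePort=" + slavePort;
    }

    public static void main(String[] args) {
        // Hypothetical names and values, for illustration only.
        String db = "masterDB";

        // Step 1: cleanly shut down the database on the master.
        System.out.println(shutdownUrl(db));

        // Step 2 (outside JDBC): file system copy of the database to the slave.

        // Step 3: start the master; no freeze/unfreeze step is involved.
        System.out.println(startMasterUrl(db, "slave.example.com", 4851));
    }
}
```

The point of the sketch is that the freeze step no longer appears anywhere in the sequence, which is why the sentence about startMaster=true unfreezing the database can be dropped from the documentation.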
I am wondering about the later part of the paragraph on unlogged operations -- it describes exactly what the error message says: "The message instructs the user to unfreeze the database to allow the operations to complete, and then to specify startMaster=true again." I think I should just remove that sentence and have the paragraph end with "an error message appears."
That leaves the issue of the actual error message text. XRE23 says, "Replication master cannot be started since unlogged operations are in progress, unfreeze to allow unlogged operations to complete and restart replication." Should it be left as it is, since the only way it's likely to come up is if someone did freeze the database instead of shutting it down?
I'll post a patch that includes your suggestions.
Attaching
M src/adminguide/cadminreplicstartrun.dita
M src/ref/rrefattribstartmaster.dita
Please let me know if further changes are needed.
Kim Haase made changes - 13/Jul/09 07:08 PM
Kim Haase made changes - 13/Jul/09 07:09 PM
Thanks for the patch, Kim. The changes look very good to me. I agree that it's best to remove the description of the exact error message from the paragraph. It's probably OK to leave the actual message as it is, since the freeze/unfreeze approach is very likely to have been used if the situation occurs.
+1 to commit.
Thanks very much, Knut!
While I was in the middle of merging the patch to the branch, I did a final check and realized that the documentation of the startSlave attribute has another reference to freezing the database, so I'll need to do another patch after this one. Sorry, I should have checked more thoroughly before.
Committed patch
Merged to 10.5 doc branch at revision 793905. A second patch will follow.
Attaching
Kim Haase made changes - 14/Jul/09 02:37 PM
Good catch, Kim. The patch looks fine.
I took a quick look at the other replication-related attributes (failover, stopSlave, stopMaster, slavePort, slaveHost) and it looks like we're covered now. Thanks.
Thanks again, Knut.
Committed patch Merged to 10.5 doc branch at revision 794044.
Kim Haase made changes - 14/Jul/09 08:06 PM
Kathey Marsden made changes - 16/Jul/09 09:24 PM
Verified in the latest alpha manuals. Closing.
Knut Anders Hatlen made changes - 02/Feb/10 10:04 AM