Key: DERBY-4196
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Kim Haase
Reporter: Knut Anders Hatlen
Votes: 0
Watchers: 0
Derby

Document initiation of replication from cleanly shut down database

Created: 29/Apr/09 12:19 PM   Updated: 02/Feb/10 10:04 AM
Component/s: Documentation, Replication
Affects Version/s: 10.4.1.3
Fix Version/s: 10.5.2.0, 10.6.0.0

Time Tracking:
Not Specified

File Attachments (all licensed for inclusion in ASF works):
DERBY-4196-2.diff, 0.6 kB, 2009-07-14 02:37 PM, Kim Haase
DERBY-4196.diff, 4 kB, 2009-07-13 07:08 PM, Kim Haase
DERBY-4196.zip, 5 kB, 2009-07-13 07:08 PM, Kim Haase
rrefattribstartslave.html, 6 kB, 2009-07-14 02:37 PM, Kim Haase
Issue Links:
Reference
 

Urgency: Urgent
Resolution Date: 14/Jul/09 08:06 PM
Labels:


Description
The admin guide describes how to start replication.
http://db.apache.org/derby/docs/dev/adminguide/cadminreplicstartrun.html

It describes two steps that must be performed before the database is copied from the master to the slave:

1. Boot the database on the master system
2. Freeze the database (CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE())

Those two steps could be replaced with a single step:

1-2) Make sure the database on the master system is shut down cleanly

This works because a cleanly shut down database needs no recovery when it is later booted in master mode. Neither the log nor the database is modified during boot, so the master database stays completely in sync with the copy on the slave.

Advantages of the alternative procedure are:

- no need to keep a process running with the database booted and frozen while copying the database from the master system to the slave system

- uncommitted transactions that are active at the time of the copying won't cause any problems (DERBY-3896)
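
The alternative procedure can be sketched with plain JDBC connection URLs. This is a minimal illustration, not the documented procedure itself: the class and method names, the database name `salesdb`, the host `slavehost.example.com`, and the port value are all hypothetical. The attribute names (`shutdown`, `startSlave`, `startMaster`, `slaveHost`, `slavePort`) are Derby connection URL attributes. The sketch assumes that a successful single-database shutdown is reported as an SQLException with SQLState 08006, and it only builds the URLs, so it does not need a Derby installation to run.

```java
import java.sql.SQLException;

public class ReplicationUrls {
    // Step 1-2: URL that shuts the master database down cleanly.
    static String shutdownUrl(String db) {
        return "jdbc:derby:" + db + ";shutdown=true";
    }

    // After the file system copy: URL that starts slave mode on the slave system.
    static String startSlaveUrl(String db) {
        return "jdbc:derby:" + db + ";startSlave=true";
    }

    // Finally: URL that boots the database in master mode on the master system.
    static String startMasterUrl(String db, String slaveHost, int slavePort) {
        return "jdbc:derby:" + db + ";startMaster=true;slaveHost=" + slaveHost
                + ";slavePort=" + slavePort;
    }

    // Assumption: Derby signals a successful single-database shutdown by
    // throwing an SQLException whose SQLState is 08006.
    static boolean isCleanShutdown(SQLException e) {
        return "08006".equals(e.getSQLState());
    }

    public static void main(String[] args) {
        String db = "salesdb";                    // hypothetical database name
        String host = "slavehost.example.com";    // hypothetical slave host
        System.out.println("1. On the master:  " + shutdownUrl(db));
        System.out.println("2. Copy the database directory to the slave system");
        System.out.println("3. On the slave:   " + startSlaveUrl(db));
        System.out.println("4. On the master:  " + startMasterUrl(db, host, 4851));
    }
}
```

In a real setup each URL would be passed to `DriverManager.getConnection`, with the exception thrown by the shutdown URL checked by `isCleanShutdown` before the copy begins.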

Dag H. Wanvik added a comment - 10/Jul/09 06:02 PM
In view of the newly discovered DERBY-4299, I am bumping this to Urgent.

Dag H. Wanvik made changes - 10/Jul/09 06:03 PM
Field Original Value New Value
Urgency Urgent
Dag H. Wanvik made changes - 10/Jul/09 06:04 PM
Link This issue is related to DERBY-4299 [ DERBY-4299 ]
Dag H. Wanvik made changes - 10/Jul/09 06:04 PM
Link This issue is related to DERBY-3896 [ DERBY-3896 ]
Dag H. Wanvik added a comment - 10/Jul/09 06:19 PM
I am wondering whether DERBY-4299 calls into question the soundness of the "freeze database" approach. In DERBY-4299, the ASSERT happens because LogToFile.appendLogRecord does not write a log record that is later accessed in the boot phase in "SLAVE_PRE_MODE". (Log records are not written in this "pre" boot, which is used to authenticate, so that the slave does not get out of sync with the master.)

Kim Haase made changes - 13/Jul/09 05:28 PM
Assignee Kim Haase [ chaase3 ]
Kim Haase added a comment - 13/Jul/09 05:33 PM
I'm working on this, but a little more is needed than just condensing the steps. Step 5 says,

A successful use of the startMaster=true attribute will also unfreeze the database.

Will it also start a database that's been shut down? We're now telling users to shut down the database, not just freeze it.

I think the startMaster=true attribute description in the Reference Manual needs changing too.

Thanks in advance.

Knut Anders Hatlen added a comment - 13/Jul/09 06:09 PM
Thanks Kim!

startMaster=true will boot the database if it's not already booted. Since we don't want to mention the freeze step anymore, we could probably skip the sentence about startMaster=true unfreezing the database. The paragraph starting with "If any unlogged operations are running" will not make much sense either if we make these changes, and can be removed (no unlogged operations can be running in a database that's not booted).

In the reference manual, I think this change must be made:

Before you specify this attribute, you must boot the database on the master system, freeze it, perform a file system copy (...)
-> (...) this attribute, you must cleanly shut down the database on the master system, perform a file system copy (...)

I don't know if we should remove the paragraph about unlogged operations from the reference manual, or if we should just make it less prominent. It is still the behaviour of startMaster=true if the database is already booted when it's invoked, so it might still make sense to mention it. Perhaps just change the start of the first sentence in that paragraph to "If the master database is already booted and any unlogged operations are running"?

Kim Haase added a comment - 13/Jul/09 06:30 PM
Thanks, Knut, I was wondering about those sentences. Those are great suggestions.

I am wondering about the later part of the paragraph on unlogged operations -- it describes exactly what the error message says:

"The message instructs the user to unfreeze the database to allow the operations to complete, and then to specify startMaster=true again."

I think I should just remove that sentence and have the paragraph end with "an error message appears."

That leaves the issue of the actual error message text. XRE23 says, "Replication master cannot be started since unlogged operations are in progress, unfreeze to allow unlogged operations to complete and restart replication." Should it be left as it is, since the only way it's likely to come up is if someone did freeze the database instead of shutting it down?
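
For readers who want to detect this condition in code, the following is a minimal sketch of recognising the XRE23 case when a startMaster=true connection attempt fails. The class and method names are hypothetical, and the sketch assumes the error is reported with SQLState XRE23, matching the message identifier quoted above. It walks the exception chain because the relevant exception may be nested.

```java
import java.sql.SQLException;

public class StartMasterErrors {
    // Assumption: the SQLState equals the message identifier XRE23.
    static final String UNLOGGED_OPS_IN_PROGRESS = "XRE23";

    // Returns true if any exception in the chain carries SQLState XRE23.
    static boolean blockedByUnloggedOperations(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if (UNLOGGED_OPS_IN_PROGRESS.equals(cur.getSQLState())) {
                return true;
            }
        }
        return false;
    }
}
```

A caller that sees `true` here most likely froze the database instead of shutting it down, which is the situation the existing message text addresses.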

I'll post a patch that includes your suggestions.

Kim Haase added a comment - 13/Jul/09 07:08 PM
Attaching DERBY-4196.diff and DERBY-4196.zip, with changes to two files:

M src/adminguide/cadminreplicstartrun.dita
M src/ref/rrefattribstartmaster.dita

Please let me know if further changes are needed.

Kim Haase made changes - 13/Jul/09 07:08 PM
Attachment DERBY-4196.diff [ 12413332 ]
Attachment DERBY-4196.zip [ 12413333 ]
Kim Haase made changes - 13/Jul/09 07:09 PM
Issue & fix info [Patch Available]
Knut Anders Hatlen added a comment - 14/Jul/09 10:43 AM
Thanks for the patch, Kim. The changes look very good to me. I agree that it's best to remove the description of the exact error message from the paragraph. It's probably OK to leave the actual message as it is, since the freeze/unfreeze approach is very likely to have been used if the situation occurs.

+1 to commit.

Repository Revision Date User Message
ASF #793902 Tue Jul 14 13:52:21 UTC 2009 chaase3 DERBY-4196: Document initiation of replication from cleanly shut down database

Corrected instructions in two topics.

Patch: DERBY-4196.diff
Files Changed
MODIFY /db/derby/docs/trunk/src/adminguide/cadminreplicstartrun.dita
MODIFY /db/derby/docs/trunk/src/ref/rrefattribstartmaster.dita

Repository Revision Date User Message
ASF #793905 Tue Jul 14 14:10:56 UTC 2009 chaase3 DERBY-4196: Document initiation of replication from cleanly shut down database

Merged DERBY-4196.diff to 10.5 docs branch from trunk revision 793902.
Files Changed
MODIFY /db/derby/docs/branches/10.5/src/adminguide/cadminreplicstartrun.dita
MODIFY /db/derby/docs/branches/10.5/src/ref/rrefattribstartmaster.dita

Kim Haase added a comment - 14/Jul/09 02:15 PM
Thanks very much, Knut!

While I was in the middle of merging the patch to the branch, I did a final check and realized that the documentation of the startSlave attribute has another reference to freezing the database, so I'll need to do another patch after this one. Sorry, I should have checked more thoroughly before.

Committed patch DERBY-4196.diff to documentation trunk at revision 793902.
Merged to 10.5 doc branch at revision 793905.

A second patch will follow.

Kim Haase added a comment - 14/Jul/09 02:37 PM
Attaching DERBY-4196-2.diff and rrefattribstartslave.html, a one-line change that corrects this additional reference topic to refer to a shutdown rather than a freeze of the master database.

Kim Haase made changes - 14/Jul/09 02:37 PM
Attachment DERBY-4196-2.diff [ 12413432 ]
Attachment rrefattribstartslave.html [ 12413433 ]
Knut Anders Hatlen added a comment - 14/Jul/09 05:16 PM
Good catch, Kim. The patch looks fine.

I took a quick look at the other replication-related attributes (failover, stopSlave, stopMaster, slavePort, slaveHost) and it looks like we're covered now. Thanks.

Repository Revision Date User Message
ASF #794028 Tue Jul 14 19:27:23 UTC 2009 chaase3 DERBY-4196: Document initiation of replication from cleanly shut down database

Modified startSlave topic in addition to startMaster.

Patch: DERBY-4196-2.diff
Files Changed
MODIFY /db/derby/docs/trunk/src/ref/rrefattribstartslave.dita

Repository Revision Date User Message
ASF #794044 Tue Jul 14 20:04:25 UTC 2009 chaase3 DERBY-4196: Document initiation of replication from cleanly shut down database

Merged DERBY-4196-2.diff to 10.5 docs branch from trunk revision 794028.
Files Changed
MODIFY /db/derby/docs/branches/10.5/src/ref/rrefattribstartslave.dita

Kim Haase added a comment - 14/Jul/09 08:06 PM
Thanks again, Knut.

Committed patch DERBY-4196-2.diff to documentation trunk at revision 794028.
Merged to 10.5 doc branch at revision 794044.

Kim Haase made changes - 14/Jul/09 08:06 PM
Status Open [ 1 ] Resolved [ 5 ]
Issue & fix info [Patch Available]
Fix Version/s 10.5.1.2 [ 12313870 ]
Fix Version/s 10.6.0.0 [ 12313727 ]
Resolution Fixed [ 1 ]
Kathey Marsden made changes - 16/Jul/09 09:24 PM
Fix Version/s 10.5.2.0 [ 12314116 ]
Fix Version/s 10.5.1.2 [ 12313870 ]
Knut Anders Hatlen added a comment - 02/Feb/10 10:04 AM
Verified in the latest alpha manuals. Closing.

Knut Anders Hatlen made changes - 02/Feb/10 10:04 AM
Status Resolved [ 5 ] Closed [ 6 ]