|
functional/design spec based on the input I got on the derby-dev list.
This patch adds some of the code necessary to support real-time online backup that does not block writes while a database backup is in progress. All the current functional tests passed with these changes. It would be great if someone can review and commit this patch.

This patch changes the way the data segment and the log are backed up, so that writes are not blocked.

Data Segment Backup:
o The containers to be backed up are found by scanning the files in seg0.
o Each container is backed up by reading all its pages through the page cache and then writing them to the backup container. Pages are latched while they are written into the backup container.
o It is not necessary to back up containers in any particular order. All updates that happen after a container is backed up will be redone using the transaction log on restore.

MT cases:
1) Each page is latched when it is written to the backup, to prevent partially written pages from sneaking into the backup.
2) The thread that is backing up the container will stop if another thread requests removal of the container while it is being backed up.
3) Truncate of the container blocks if the container is being backed up.
4) Partially created containers will not be backed up. The container cache will not return the container items until the creation is complete. (No changes are needed for this case; that is how it currently works.)

Transaction Log Backup:
The transaction log files are backed up in two phases:
1) First, the checkpoint info and the log files are backed up, before the data segment.
2) After the data segment is backed up, all the log files that were generated after the backup started are also copied into the backup.

MT cases:
1) If there is a checkpoint in progress, backup will wait for the checkpoint to complete before copying checkpoint control information into the backup.

Testing: All functional tests (derbyall suite) passed on jdk142/Windows XP.
svn status:
M java\engine\org\apache\derby\impl\store\raw\log\ReadOnly.java
M java\engine\org\apache\derby\impl\store\raw\log\LogToFile.java
M java\engine\org\apache\derby\impl\store\raw\RawStore.java
M java\engine\org\apache\derby\impl\store\raw\data\BasePage.java
M java\engine\org\apache\derby\impl\store\raw\data\InputStreamContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\BaseDataFileFactory.java
M java\engine\org\apache\derby\impl\store\raw\data\CachedPage.java
M java\engine\org\apache\derby\impl\store\raw\data\FileContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\BaseContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\BaseContainerHandle.java
M java\engine\org\apache\derby\impl\store\raw\data\RAFContainer.java
M java\engine\org\apache\derby\iapi\store\raw\log\LogFactory.java
M java\engine\org\apache\derby\iapi\store\raw\data\DataFactory.java
M java\engine\org\apache\derby\iapi\store\raw\ContainerHandle.java

I have looked at the onlinebackup_1.diff patch, and overall the patch
looks good. However, I think there is one severe bug in RAFContainer.java. I guess that, due to the cut and paste from writePage(), you are seeking in another file than the one you are writing to (fileData.seek(), but backupRaf.write()).

I have a few other comments and questions:

General:
* If I run backup with this patch, it seems like I will run the new code. Does one not need to change the restore code to be able to restore such a backup, or does the ordinary recovery handle that?

RawStore.java:
* The BACKUP_FILTER now contains so much that it would be useful to have a comment that says what is really left to copy.
* Intuitively, it seems wrong to hard-code "seg0", but I see that this is done all over the code. Will there always be only one segment? What is then the purpose of the segment concept?
* backup(File ...) seems like it would create an endless recursion if called. Fortunately, it seems like it never will be called. Why do we need the methods with a File parameter instead of a String? The system procedures use the String variant. Maybe we could just remove the File variant?
* I think it would be helpful to have a comment that explains why disableLogArchiveMode() was made synchronized.

BaseDataFileFactory.java:
* I do not think basing which files to back up on the contents of the seg0 directory is very robust. What if someone by accident has written a file with a name that matches the pattern you are looking for? Then I would think you may get a very strange error message that may not be easy to resolve. Could this not be based on some system catalog? Another scenario is if someone by accident deletes a file for a table that is not accessed very often. A later backup will then not detect that this file is missing. Since the backup is believed to be successful, the latest backup of this file may be deleted.

FileContainer.java:
* I cannot find anything backup-specific about getPageForBackup(), so I think a more general name would be better (e.g., getLatchedPage).
RAFContainer.java:
* The changes to this file seem not to be quite in line with some of the original design philosophies. I am not sure that is necessarily bad, but it would be nice to hear the arguments for doing it this way. More specifically:
  - While RAFContainer so far has used the StorageRandomAccessFile/StorageFile abstractions, backup uses RandomAccessFile/File directly. Is there a particular reason for that?
  - In order to be able to back up a page, new methods have been added to FileContainer and BasePage/CachedPage, and the RAFContainer is doing latching of pages. This increases coupling to other modules. Have alternative designs been considered?
* privBackupContainer():
  - Setting 'done=true' at the start of the method is a bit confusing. I would think another name for this variable would be better.
  - If an exception is thrown while holding a latch, does one not need to release the latch?
* writeToBackup():
  - Copies a lot of code from writePage(). One should consider factoring out common code.
  - Due to the cut and paste, you are seeking in another file than the one you are writing to (fileData.seek(), but backupRaf.write()).

LogToFile.java:
* Why did you need to change getFirstLogNeeded() to handle a null checkpoint? Is it in case you do a backup before any checkpoint has been performed?
* The javadoc for startLogBackup() says that 'log files are copied after all the data files are backed up', but at the end of the method log files are copied.

ReadOnly.java/InputStreamContainer.java:
* I do not think javadoc should just be a copy of the interface/superclass; it should say something about what is particular to this implementation. In this case, the javadoc should say that nothing is done.

Typos:
* several occurrences of 'backedup'
* LogToFile.java: 'eventhough'
* RAFContainer.java: 'pahe cache', 'conatiner'
* 'Standard Cloudscape error policy'. Should this be changed to Derby?

Pet peeves:
* Should have Javadoc on all methods, parameters and return values.
* Lines longer than 80 chars.

What to do if backup is started in a transaction that already has unlogged operations executed?
In previous discussions about online backup, it was concluded that the existing backup procedure calls will WAIT for transactions with unlogged operations to commit before proceeding with the backup.

One issue that was missing from the discussion was what to do if a user starts a backup in the same transaction that has executed unlogged operations before the backup call. WAIT is not an acceptable option here, because the backup call would wait forever.

I can think of two ways this issue can be addressed:
1) Add a restriction that backup procedures can only be called in a brand NEW transaction, and implicitly commit the backup transaction at the end of the backup. The commit is not required as such to solve this problem, but it would be cleaner because backup itself is not a rollback-able operation.
2) Make backup procedures fail if the transaction they are started in contains unlogged operations.

I am inclined towards implementing the first option. Any comments/suggestions will be appreciated.

Thanks
-suresht

Fix to the problem found by Øystein while reviewing the previous online backup patch (online_backup1.diff).
Backup of a container code was doing a seek incorrectly on the file container instead of the backup file. Tests: All tests passed on jdk142 Windows XP. It would be great if some one can commit this patch. Thanks -suresht This patch adds code to support real-time online backup with unlogged
operations. A consistent backup can not be made if there are pending transactions with unlogged operations or if unlogged operations occur when backup is in progress. Because container files can be copied to the backup before the transaction is committed and the data pages are flushed as part of the commit. As there is no transaction log for unlogged operations, while restoring from the backup database can not be restored to a consistent state. To make a consistent online backup in this scenario, this patch: 1) blocks online backup until all the transactions with unlogged operation are committed/aborted. 2) implicitly converts all unlogged operations to logged mode for the duration of the online backup, if they are started when backup is in progress. This patch also adds a test to test the online backup in parallel with some DML, DDL and unlogged operations. TESTS : derbyall test suite passed on Windows XP/JDK142 It would be great if some can review and commit this patch. svn stat: M java\engine\org\apache\derby\impl\store\raw\xact\Xact.java M java\engine\org\apache\derby\impl\store\raw\xact\XactFactory.java M java\engine\org\apache\derby\impl\store\raw\RawStore.java M java\engine\org\apache\derby\impl\store\raw\data\BaseDataFileFactory.java M java\engine\org\apache\derby\iapi\store\raw\xact\RawTransaction.java M java\engine\org\apache\derby\iapi\store\raw\xact\TransactionFactory.java M java\testing\org\apache\derbyTesting\functionTests\tests\storetests\st_1.sql A java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest1_app.properties A java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackup.java M java\testing\org\apache\derbyTesting\functionTests\tests\store\copyfiles.ant A java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest1.java M java\testing\org\apache\derbyTesting\functionTests\master\st_1.out A java\testing\org\apache\derbyTesting\functionTests\master\OnlineBackupTest1.out M 
java\testing\org\apache\derbyTesting\functionTests\suites\storemore.runall minor updates to the spec:
- Procedures can not be overloaded with a different number of arguments in Derby. Added the keyword "ONLINE" to the new backup procedures to differentiate them from the old procedures.
-- jar file handling.

This patch makes the online backup call wait/fail when unlogged operations like create index are pending. It also fixes the derby-523 bug by making the existing log archive backup procedure wait for the unlogged operation to complete.
-- Two new procedures are added to allow users to make backup wait/fail when unlogged operations are pending.
-- Prevents users from starting a backup in a non-idle transaction, to avoid the backup blocking forever if a user starts the backup in the same transaction as an unlogged operation.
-- Backup is not really transactional. To avoid any locking issues in the future, backup procedures end the transaction by implicitly doing a commit when successful, or a rollback on any error.

A new backup test is added to the store suite to test the above scenarios.

TESTS: derbyall test suite passed on Windows XP/JDK142.

It would be great if someone can review and commit this patch.

$ svn status
M java\engine\org\apache\derby\impl\sql\catalog\DataDictionaryImpl.java
M java\engine\org\apache\derby\impl\sql\catalog\DD_Version.java
M java\engine\org\apache\derby\impl\db\BasicDatabase.java
M java\engine\org\apache\derby\impl\store\access\RAMAccessManager.java
M java\engine\org\apache\derby\impl\store\raw\xact\XactFactory.java
M java\engine\org\apache\derby\impl\store\raw\RawStore.java
M java\engine\org\apache\derby\iapi\store\access\AccessFactory.java
M java\engine\org\apache\derby\iapi\store\raw\RawStoreFactory.java
M java\engine\org\apache\derby\iapi\reference\SQLState.java
M java\engine\org\apache\derby\database\Database.java
M java\engine\org\apache\derby\catalog\SystemProcedures.java
M java\engine\org\apache\derby\loc\messages_en.properties
M java\testing\org\apache\derbyTesting\functionTests\tests\storetests\st_1.sql
A java\testing\org\apache\derbyTesting\functionTests\tests\store\onlineBackupTest2_app.properties
M java\testing\org\apache\derbyTesting\functionTests\tests\store\copyfiles.ant
A java\testing\org\apache\derbyTesting\functionTests\tests\store\onlineBackupTest2.sql
M java\testing\org\apache\derbyTesting\functionTests\master\st_1.out
A java\testing\org\apache\derbyTesting\functionTests\master\onlineBackupTest2.out
M java\testing\org\apache\derbyTesting\functionTests\suites\storemore.runall
M java\testing\org\apache\derbyTesting\functionTests\util\FTFileUtil.java

This patch fixes the store/onlineBackupTest1.java failure on non-Windows environments. The problem was that the unlogged operations thread and the insert thread were working on the same connection. The test was failing because the insert thread was committing the unlogged operation that was supposed to block the backup. This patch modifies the test so that these threads work on different connections. The test passed on Windows XP and Linux.

It would be great if someone can commit this patch.

This patch adds code to support online backup when jar operations
are running parallel to the backup. Jar files are not logged, but the system catalogs updates are logged when a jar file is added/replaced. If the jar file operations are allowed during the backup, system catalog (sys.sysfiles) table in the backup database can have a reference to a jar file that does not exist in the backup database. And also backup can contain partial written jar files. To make a consistent online backup, this patch: 1) Makes Backup operation wait/fail for all the jar operations activity in progress to complete. 2) Blocks jar file operations when a backup is in progress. This patch also adds a new test to test the online backup with jar operations. TESTS : derbyall test suite passed on Windows XP/JDK142 It would be great if some can review and commit this patch. svn stat: M java\engine\org\apache\derby\impl\store\raw\xact\Xact.java M java\engine\org\apache\derby\impl\store\raw\xact\XactFactory.java M java\engine\org\apache\derby\impl\store\raw\data\BaseDataFileFactory.java M java\engine\org\apache\derby\impl\store\raw\data\RFResource.java M java\engine\org\apache\derby\iapi\store\raw\xact\RawTransaction.java A java\testing\org\apache\derbyTesting\functionTests\tests\store\obtest_customer.jar M java\testing\org\apache\derbyTesting\functionTests\tests\store\copyfiles.ant A java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest3.java A java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest3_app.properties A java\testing\org\apache\derbyTesting\functionTests\master\OnlineBackupTest3.out M java\testing\org\apache\derbyTesting\functionTests\suites\storemore.runall test jar file was missing from the earlier onlinebackup_6.diff patch. This jar file should be added to :
org/apache/derbyTesting/functionTests/tests/store/ directory along with the onlinebackup_6.diff patch. I have reviewed the latest patches (3 through6). (A bit late I must admit). Here are my comments:
java/engine/org/apache/derby/catalog/SystemProcedures.java * SYSCS_BACKUP_DATABASE() - "By default this procedure will wait ..." Is it possible to change this behavior for this particular procedure? If not, "by default" is a bit misleading - "wait for the backup blocking unlogged operations to complete ..." is a bit heavy. I suggest just "wait for any unlogged operations to complete ..." * SYSCS_ONLINE_BACKUP_DATABASE() - Since both backup procedures are ONLINE, it is a bit misleading to use this word to distinguish between the two backup procedures. I guess the main reason for choosing a new name is the extra parameter. In that case, I think it would be better to name the new procedure, SYSCS_BACKUP_DATABASE_NOWAIT and leave out the parameter. - The javadoc does not say what will happen if wait is 0. Will one get an exception if there is unlogged operations? * backupDatabase() - Is this the right layer for checking that the transaction is idle and for doing rollback/commit the transaction? Since this is a requirement for the logic at lower layers to work correctly, not something that is done because it is the desirable behavior of the system procedure, I feel that this should be done at a lower layer. - I know when we discussed this isssue earlier, I agreed that checking that the transaction is idle was a good solution. However, thinking a bit more about this, I think it would be better to fail the transaction when unlogged operations have been performed by the same transaction. That would limit it to those who actually need to be affected, and it would significantly reduce the probability of someone ever experiencing this problem. - I am not very fond of automatic commits like this. If this is necessary, I think it would be better to require that backup is performed in autocommit mode. Then the implications would be more evident to the users and not catch someone by surprise. 
- The javadoc for the system procedures that use this function should state the requirement imposed here (idle transaction) and that the transaction will be committed if backup is succesful. * SYSCS_ONLINE_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE() - Same comment on ONLINE as above - Could not boolean parameters be used now? * backupDatabaseAndEnableLogArchiveMode() - Most of the code here is the same as in backupDatabase(). (Another argument for pushing this code down to a lower layer.) To avoid code duplication, whether to enable archiving could have been a flag to backupDatabase. * SYSCS_DISABLE_LOG_ARCHIVE_MODE() - Is checkBackupTransactionIsIdle() strictly necessary here? This seems like an operation where failures could be handle at statement level. java/engine/org/apache/derby/iapi/store/access/AccessFactory.java * backup(String ...) - "Please see cloudscape on line ..." Derby? * backup(File ...) - I will just remind you of I am not sure it is a good idea for other people change the backup code while you are working on it. (May create merge conflicts for you.) java/engine/org/apache/derby/iapi/store/raw/xact/RawTransaction.java * setBackupBlockingState() - I do not like the name for this method. I suggest calling it blockBackup() or something like that. At least, the javadoc should explain what is meant by "backup blocking state". java/engine/org/apache/derby/iapi/store/raw/xact/TransactionFactory.java * stopBackupBlockingOperations() - Name indicates that backup blocking operations are stopped, but javadoc says that only new ones are blocked. I think the name is misleading. - Javadoc should be revisited for typos. java/engine/org/apache/derby/impl/db/BasicDatabase.java * backupAndEnableLogArchiveMode() - Non-standard indentation for parameters? java/engine/org/apache/derby/impl/store/raw/RawStore.java * backup(String, boolean) - Typo in comment: "Check if there any backup ..." Remove "there"? 
* backupAndEnableLogArchiveMode(String, boolean) - Why do you need a finally clause? Would not a catch clause be sufficient? Then, you could eliminate the local 'error' variable. java/engine/org/apache/derby/impl/store/raw/data/RFResource.java * add()/remove() - Are the casts to RawTransaction safe? Does this assumption have any impact on the modularity of the code? * serviceImmediately() - How is this change related to backup? java/engine/org/apache/derby/impl/store/raw/xact/Xact.java * setBackupBlockingState()/setUnblockBackupState() - Names should be symmetric. (e.g., blockBackup/unblockBackup) - Why do you have to wait for commit to unblock? Would it not be sufficient to have completed the unlogged operations before backup is started? java/engine/org/apache/derby/impl/store/raw/xact/XactFactory.java * canStartBackupBlockingOperation() - Since this method now may wait for backup to complete, I do not feel canStartBackupBlockingOperation() is a good name. - Symmetric naming with backupBlockingOperationFinished() would make it more evident that these two functions should be called in pairs. java/engine/org/apache/derby/loc/messages_en.properties * XSRSA.S - Suggest the following change: "Cannot backup the database when unlogged operations are uncommitted. Please commit the transactions with backup blocking operations or use the backup procedure with option to wait for them to complete." - Generally, I am not very fond of these long error messages. I think it would be better with just a single sentence, and then the user should be able to look up an explanation in a manual. * XSRSB.S - Suggest the following change: "Backup operation can not be performed in an active transaction. Please start a new transaction to execute backup procedures." java/testing/org/apache/derbyTesting/functionTests/tests/store/OnlineBackupTest1.java * runTest() - Should have more than one unlogged operation in parallel in order to fully test that the counting of unlogged operations work. 
- Suggest to add test that when doing backup in a non-idle transaction, previous work has not been rolled back when backup fails, and that one can continue with more operations within the same transaction. * runConsistencyChecker() - This does only check consistnecy of internal structures. Should also check consistency of application data. Maybe you could execute a select? * performDmlActions() - I assume the intention here is to do "while (!stopActivity)" * endUnloggedAction() - What is the role of the insert? Is is unlogged? It is not evident from the method name or Javadoc why you are doing inserts here. * select() - What is the point of doing consistency checks on columns that are not updated? If id and name does not match, that must be caused by errors in code that is not particular to backup. java/testing/org/apache/derbyTesting/functionTests/tests/store/OnlineBackupTest3.java * installJarTest() - Typos in comment: "// followng backup call should because jar operation is pending". Should what? - Comment say: "//Now commit the jar operation in connection1 for backup to proceed." The next statement does an insert. This is confusing. - It would be nice if the test checked that jar operation is still waiting for backup when create index has completed, but I guess this is a bit difficult to achieve. * A mix of tab and spaces for indentation. For new files that should not be necessary! * removeJarTest() - Comment copied from installJarTest? "// wait for customer app jar installation to finish now. " java/testing/org/apache/derbyTesting/functionTests/tests/store/onlineBackupTest2.sql * Do you have idea of how frequently it happens that backup thread has not been blocked yet when backupdir is created? How long a sleep would you need to be certain? inplace-compress with online backup problem:
I was scanning through the code to find any issues with online backup and in-place compress, and came across the following code that does a checkpoint before truncating the container.

FileContainer.java:
    protected void compressContainer(
    ....
    // make sure we don't execute redo recovery on any page
    // which is getting truncated. At this point we have an exclusive
    // table lock on the table, so after checkpoint no page change
    // can happen between checkpoint log record and compress of space.
    dataFactory.getRawStoreFactory().checkpoint();

The above code assumes that redo will only start after the checkpoint done by the compress. That is true in crash recovery, but restore from backup can start redo from a checkpoint that was taken when the backup started, which can be before the checkpoint done by compress. If compress is run in parallel with the backup, restore from the backup can FAIL because it can not find the pages needed by the redo, if a container gets backed up after it is truncated by the compress.

I could not think of an easy way to avoid the need for the compress to perform a checkpoint while truncating a container. One way to make a good online backup when in-place compress is in progress seems to be to enforce the following restrictions, similar to the way unlogged operations are handled:
1) Block the in-place compress operation if a backup is in progress, and
2) Make the backup operation wait/fail until compress is done.

I don't like to add restrictions, but I guess compress is an infrequent operation, so it may be ok.

Any comments/suggestions?

Thanks
-suresht

This patch addresses the issues raised by Øystein in his review of previous
online backup patches 3-6. - changed the backup procedures names with ONLINE to NOWAIT - removed the transaction Idle restriction to run backup procedures. - removed implicit commit/rollbacks. - Added a new lesser impact restriction, which only disallows backup call only if there are unlogged operations executed in the same transaction before the backup. - Removed casting to RawTransaction. - fixed Names and Comments. - Enhanced the tests with addional test cases suggested by Øystein. TESTS : derbyall test suite passed on Windows XP/JDK142 It would be great if some can review and commit this patch. svn stat: M java\engine\org\apache\derby\impl\sql\catalog\DataDictionaryImpl.java M java\engine\org\apache\derby\impl\db\BasicDatabase.java M java\engine\org\apache\derby\impl\store\raw\xact\Xact.java M java\engine\org\apache\derby\impl\store\raw\xact\XactFactory.java M java\engine\org\apache\derby\impl\store\raw\RawStore.java M java\engine\org\apache\derby\impl\store\raw\data\BaseDataFileFactory.java M java\engine\org\apache\derby\impl\store\raw\data\RFResource.java M java\engine\org\apache\derby\iapi\store\access\AccessFactory.java M java\engine\org\apache\derby\iapi\store\raw\xact\RawTransaction.java M java\engine\org\apache\derby\iapi\store\raw\xact\TransactionFactory.java M java\engine\org\apache\derby\iapi\store\raw\RawStoreFactory.java M java\engine\org\apache\derby\iapi\reference\SQLState.java M java\engine\org\apache\derby\catalog\SystemProcedures.java M java\engine\org\apache\derby\loc\messages_en.properties M java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest1.java M java\testing\org\apache\derbyTesting\functionTests\tests\store\onlineBackupTest2.sql M java\testing\org\apache\derbyTesting\functionTests\tests\store\OnlineBackupTest3.java M java\testing\org\apache\derbyTesting\functionTests\master\OnlineBackupTest1.out M java\testing\org\apache\derbyTesting\functionTests\master\onlineBackupTest2.out M 
java\testing\org\apache\derbyTesting\functionTests\master\OnlineBackupTest3.out Is the functional spec updated to reflect the new names of the procedures?
I'm actually confused as to why the name now includes NOWAIT, not sure what the NOWAIT is meant to imply to the user of the procedures. To me it might seem that this would imply the procedure returns right away and the backup continues in the background. Hopefully an updated functional spec will make it clear why NOWAIT is part of the name. For Derby to be easy to use the names of the system procedures should clearly indicate what they do. This patch addresses the improvements suggested by Oystein in his review of the
first online backup patch (onlinebackup_1.diff) and also resolves some more online backup issues.
-- Fixed comments and moved the code duplicated between write page and backup of a container into a separate method.

Fixes with this patch:
-- Backup of a container was using the same encryption buffer as container reads/writes; that would require backup of the container and reads/writes to be synchronized to avoid corrupting the encryption buffer. This patch modifies the container backup code to use its own temporary encryption buffer to avoid the buffer corruption.
-- Added code so that truncation of the log during checkpoint and disabling of log archival do not delete log files that are yet to be copied into the backup, if a backup is running in parallel.
-- In-place compress is blocked during backup and vice versa, until the compress nested transaction commits. This change is needed because compress does a special checkpoint to avoid redo on the truncated pages. If backup is running in parallel, the checkpoint the backup is based on can be earlier than the checkpoint done by the truncate operation. Without the blocking, recovery during restore from backup will fail if it needs to redo any log records on the truncated pages.

TESTS: derbyall test suite passed on Windows XP/JDK142.

It would be great if someone can review and commit this patch.
svn stat:
M java\engine\org\apache\derby\impl\store\raw\log\ReadOnly.java
M java\engine\org\apache\derby\impl\store\raw\log\LogToFile.java
M java\engine\org\apache\derby\impl\store\raw\RawStore.java
M java\engine\org\apache\derby\impl\store\raw\data\InputStreamContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\FileContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\BaseContainer.java
M java\engine\org\apache\derby\impl\store\raw\data\BaseContainerHandle.java
M java\engine\org\apache\derby\impl\store\raw\data\RAFContainer.java
M java\engine\org\apache\derby\iapi\store\raw\log\LogFactory.java
M java\engine\org\apache\derby\loc\messages_en.properties
M java\shared\org\apache\derby\shared\common\reference\SQLState.java
M java\testing\org\apache\derbyTesting\functionTests\tests\store\copyfiles.ant
A java\testing\org\apache\derbyTesting\functionTests\tests\store\onlineBackupTest4.sql
A java\testing\org\apache\derbyTesting\functionTests\tests\store\onlineBackupTest4_app.properties
A java\testing\org\apache\derbyTesting\functionTests\master\onlineBackupTest4.out
M java\testing\org\apache\derbyTesting\functionTests\suites\storemore.runall

Thanks
-suresh

I committed the onlinebackup_8.diff patch to trunk as svn 373380
Minor update to the spec. Changed the names of the new procedures to:
SYSCS_BACKUP_DATABASE_NOWAIT SYSCS_BACKUP_DATABASE_AND_ENABLE_LOG_ARCHIVE_MODE_NOWAIT |
to the database when the backup is in progress will be a useful feature to Derby users,
especially in the client/server environment. This backup mechanism might take more time
than the current online backup because of the synchronization overhead required to allow changes to
the database while the backup is in progress. At this point I am not sure how much more
time it will take, but I think it should not be more than 50% in the worst-case scenario.
The current online backup mechanism (which blocks changes to the database) is
supported through system procedures (e.g. SYSCS_UTIL.SYSCS_BACKUP_DATABASE). My
plan is to make the existing backup procedures work without blocking
changes to the database; no new system procedures are required. If the community thinks
both blocking and non-blocking backups are useful, new procedures can
be added.

Currently a backup contains mainly the data files (seg0/*) and the transaction log
files (log/*) that are there when the backup started. On restore from the
backup, transactions are replayed, similar to crash recovery, to bring the database
to a consistent state. The new online backup will work the same way, except
that all of the transaction log must be copied to the backup only after all the data
files are backed up.
I think the current implementation freezes the database (no changes allowed)
during backup for the following reasons:
1) Data files will be in a stable state; the backup will not contain partially updated
pages on the disk.
2) No data files will be added or deleted on the disk,
because create/drop operations are blocked.
3) No transaction will be committed after the backup starts, so all
unlogged operations will be rolled back.
If the database is not frozen, the above conditions will not hold, which might
lead to backups that are in a corrupted/inconsistent state. I think it is
not necessary to freeze the whole database to make a stable backup copy; by
blocking operations that modify the files on disk for small amounts of time,
a stable backup can be made.
The following sections explain some of the issues and possible ways to address them to
provide a real online backup that does not block changes to the database for the whole
duration of the backup.
1) Corrupt pages in the backup database:
Backup reads and page cache writes can be interleaved if the database is
not frozen, i.e. it is possible to end up with a page in the backup that has
one portion that is more up to date than the rest of the page, if
page cache writes are not blocked while a page is being read for the backup.
To keep the backup process from reading partially written pages, some kind of
synchronization mechanism is needed that does not allow a page to be read for the
backup while the same page is being written to the disk. This can be
implemented by one of the following approaches:
a) Latch on a page key (container id, page number) while writing
the page from the cache to disk and while reading the page from the
disk/cache to write it to the backup. This approach has the small overhead of
acquiring an extra latch during page cache writes while the backup is in progress.
or
b) Read each page into the page cache first and then latch the
page in the cache until a temporary copy of it is made. This approach
does not have the overhead of extra latches on the page keys during writes, but
it will pollute the page cache with pages that are only required by the
backup; this might have an impact on user operations because active user pages may
be replaced by the backup pages in the page cache.
or
c) Read pages into the buffer pool and latch them while making a copy, similar to
the above approach, but somehow make sure that user pages are not kicked out
of the buffer pool.
One optimization that may be made is to copy the file on the disk as it
is to the backup, but keep track of the pages that get modified while the file is
being copied and rewrite those pages using one of the above latching
mechanisms.
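
Approach (a) can be sketched roughly as follows. This is a minimal illustration, not
Derby code: PageLatchSketch, its byte array standing in for a container file, and the
string form of the page key are all assumptions made for the example. The same latch
is taken by the cache-to-disk write and by the backup read, so the backup never copies
a half-written page.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of latching on a page key (container id, page number).
class PageLatchSketch {
    static final int PAGE_SIZE = 4096;

    // One latch per page key; created lazily on first use.
    private final ConcurrentHashMap<String, ReentrantLock> latches = new ConcurrentHashMap<>();
    // Stand-in for a single container file on disk.
    private final byte[] disk;

    PageLatchSketch(int pages) {
        disk = new byte[pages * PAGE_SIZE];
    }

    private ReentrantLock latch(long containerId, int pageNumber) {
        return latches.computeIfAbsent(containerId + ":" + pageNumber, k -> new ReentrantLock());
    }

    // Page cache write: done byte by byte so that an unlatched reader could see a torn page.
    void writePage(long containerId, int pageNumber, byte fill) {
        ReentrantLock l = latch(containerId, pageNumber);
        l.lock();
        try {
            int off = pageNumber * PAGE_SIZE;
            for (int i = 0; i < PAGE_SIZE; i++) {
                disk[off + i] = fill;
            }
        } finally {
            l.unlock();
        }
    }

    // Backup read: takes the same latch, then makes a private copy of the page.
    byte[] readPageForBackup(long containerId, int pageNumber) {
        ReentrantLock l = latch(containerId, pageNumber);
        l.lock();
        try {
            int off = pageNumber * PAGE_SIZE;
            return Arrays.copyOfRange(disk, off, off + PAGE_SIZE);
        } finally {
            l.unlock();
        }
    }

    // A page is torn if it mixes bytes from two different writes.
    static boolean isTorn(byte[] page) {
        for (byte b : page) {
            if (b != page[0]) {
                return true;
            }
        }
        return false;
    }

    // A writer thread rewrites page 0 while the backup keeps reading it;
    // returns how many torn pages the backup saw.
    static int demo() {
        PageLatchSketch s = new PageLatchSketch(1);
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 2000; i++) {
                s.writePage(1L, 0, (byte) (i % 100));
            }
        });
        writer.start();
        int torn = 0;
        for (int i = 0; i < 2000; i++) {
            if (isTorn(s.readPageForBackup(1L, 0))) {
                torn++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return torn;
    }
}
```

The cost noted for approach (a) shows up here as the extra lock()/unlock() pair on
every writePage call; a real implementation would presumably take the latch only while
a backup is running.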
2) Committed non-logged operations:
The basic requirement for a consistent database backup is that, after the checkpoint
for the backup, all changes to the database are available in the
transaction log. But Derby provides some non-logged operations for
performance reasons, for example CREATE INDEX, IMPORT into an empty table,
etc.
This was not an issue in the old backup mechanism because no operations are
committed once the backup starts, so any non-logged operations are rolled
back, similar to regular crash recovery.
I can think of three ways to address this issue:
a) Block non-logged operations while the backup is in progress, and also make the backup
wait before copying until in-flight non-logged operations are complete.
b) Make the backup always wait for the non-logged operations to complete and
retake the backup of those files that were affected by the non-logged
operations, if they were already backed up.
c) Somehow trigger logging for all operations after the checkpoint for
the backup until the backup is complete. This is easy to implement
for non-logged operations that are started after the backup, but the
tricky case is to trigger logging for those non-logged operations that
started before the backup but are committed during the backup.
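
Option (a) amounts to a small gate between the backup and the non-logged operations.
The sketch below is one possible shape for it, under assumptions: UnloggedOpGate and
all of its method names are invented for illustration, and the try* variants (which
return at once instead of waiting) are an addition of this example, not part of the
proposal above.

```java
// Hypothetical sketch of option (a): non-logged operations are blocked while a
// backup runs, and the backup waits for in-flight non-logged operations.
class UnloggedOpGate {
    private int activeUnlogged = 0;
    private boolean backupInProgress = false;

    // Called before a non-logged operation starts; waits while a backup is running.
    synchronized void beginUnloggedOp() throws InterruptedException {
        while (backupInProgress) {
            wait();
        }
        activeUnlogged++;
    }

    // Non-waiting variant: refuses to start if a backup is running.
    synchronized boolean tryBeginUnloggedOp() {
        if (backupInProgress) {
            return false;
        }
        activeUnlogged++;
        return true;
    }

    // Called when a non-logged operation commits or aborts.
    synchronized void endUnloggedOp() {
        activeUnlogged--;
        notifyAll();
    }

    // Called before the backup copies anything. The flag is set first so that no
    // new non-logged operation can start while the backup waits for old ones.
    synchronized void beginBackup() throws InterruptedException {
        while (backupInProgress) {
            wait();
        }
        backupInProgress = true;
        while (activeUnlogged > 0) {
            wait();
        }
    }

    // Non-waiting variant: refuses to start if any non-logged operation is active.
    synchronized boolean tryBeginBackup() {
        if (backupInProgress || activeUnlogged > 0) {
            return false;
        }
        backupInProgress = true;
        return true;
    }

    // Called when the backup is complete; releases blocked non-logged operations.
    synchronized void endBackup() {
        backupInProgress = false;
        notifyAll();
    }
}
```

With this gate, a CREATE INDEX that began before the backup would finish before any
file is copied, and one that arrives during the backup would wait (or be refused)
until endBackup() is called.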
3) Drop of a table when the file on the disk is being backed up. Dropping
a table results in deletion of the file on the disk, but the deletion will fail
if the file is open for backup.
Some form of synchronization is required to make sure that users do not see
weird errors in this case.
4) Creating a table/index after the data files are backed up. Basically, the
recovery system expects that the file exists on the disk before the log records
that refer to it are written to the transaction log.
I think roll-forward recovery already handles this case, but it should be
tested.
5) Data file growth because of inserts while the file (table/index) is being backed up.
The recovery system expects that a page is allocated on the disk
before log records about the page are written to the transaction log, to
avoid recovery errors because of space issues, except in the case of roll-forward
recovery.
I think roll-forward recovery handles this case already, but I have to make
sure it will work in this case also. Test cases should be added.
Some form of synchronization is required to make a stable snapshot of the
file if the file is growing while the backup is in progress.
6) Checkpoints when the backup is in progress.
I think it is not necessary to allow checkpoints while the backup is in
progress. But if someone thinks otherwise, the following should
be addressed:
1) make copy of the log control file for the backup before copying any
2) Operations that rely on a checkpoint to become consistent
should not be allowed, because the backup might have
already copied some files when the checkpoint happens.
Any comments/suggestions will be appreciated.
Thanks
-suresh