|
[
Permlink
| « Hide
]
Kathey Marsden added a comment - 25/Feb/09 10:53 PM
Attaching files for reproduction. See description for reproduction information.
Kathey Marsden made changes - 25/Feb/09 10:53 PM
Kathey Marsden made changes - 25/Feb/09 11:07 PM
So the exception we are not reporting is:
java.io.FileNotFoundException: /home/kmarsden/repro/10602/wombat/log/log28.dat (Permission denied) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:218) at org.apache.derby.impl.io.DirRandomAccessFile.<init>(DirRandomAccessFile.java:57) at org.apache.derby.impl.io.DirFile4.getRandomAccessFile(DirFile4.java:250) at org.apache.derby.impl.store.raw.log.LogToFile.run(LogToFile.java:5689) at java.security.AccessController.doPrivileged(AccessController.java:251) at org.apache.derby.impl.store.raw.log.LogToFile.privRandomAccessFile(LogToFile.java:5579) at org.apache.derby.impl.store.raw.log.LogToFile.checkJvmSyncError(LogToFile.java:5522) at org.apache.derby.impl.store.raw.log.LogToFile.openLogFileInWriteMode(LogToFile.java:5471) at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:1074) at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019) at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291) at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573) at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427) at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019) at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291) at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573) at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427) at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780) at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196) at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019) at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291) at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1856) at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1722) at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602) at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021) at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550) at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581) at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374) at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68) at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238) at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119) at java.sql.DriverManager.getConnection(DriverManager.java:316) at java.sql.DriverManager.getConnection(DriverManager.java:273) at ConnectAndShutdown.main(ConnectAndShutdown.java:9)java.io.FileNotFoundException: and the correspoinding code in LogToFile is ~1067 if (!ReadOnlyDB) { // if datafactory doesn't think it is readonly, we can // do some futher test of our own try { if(isWriteSynced) theLog = openLogFileInWriteMode(logFile); else theLog = privRandomAccessFile(logFile, "rw"); } catch (IOException ioe) { theLog = null; } if (theLog == null || !privCanWrite(logFile)) { if (theLog != null) theLog.close(); theLog = null; ReadOnlyDB = true; } } We catch the exception and silently change the database to ReadOnly. Later ~~`1172 we fail to initialize logOut because theLog is null if (theLog != null) logOut = new LogAccessFile(this, theLog, logBufferSize); There is another area of similar code ~986 finally during flush logOut is null so we get the NPE. I was thinking we should 1) Log the IOExceptions. 2) Log whenever we change a database to ReadOnly. 3) In flush() if the database is ReadOnly print a warning to the log that we cannot flush the log to a ReadOnlyDB and return. but I don't know if this is overly verbose. Are there normal instances of the IOException we might want to silently ignore? I am not sure best way to do the following, but thought I would at least first comment on what the goal should be. The whole "readonly" code I think is weak. It was mostly put in to allow a derby db to be put on and run directly from something like a cd, and to be zero-admin, ie. not force people to somehow createdb a readonly db. We expect to determine a readonly db when we get the lock on the database lock file which requires writing to the file. it never really was
designed for these "half-readonly" cases. The problem as this case shows is that catching exceptions and interpreting them may lead to the wrong conclusions. So for instance I would say that we should not boot if we think the db is readonly but we find work to do during restart recovery. This is effectively what the code does but delays the error to sometime in the future and a NPE. The possible problems with allowing boot to continue are: o if redo recovery has anything to do then it will fill up the cache with dirty pages, and on shutdown or earlier we may try to write those pages out and if the db is read only get errors. o if redo recovery has anything to abort then it will have to write extra stuff to the log file and if existing log file is not writable or new logfiles are not writeable then we may get wierd errors during boot. Both the above may be existing problems in the code with a readonly cd version, I don't know. I think in both these cases we should fail the boot and make sure somehow errors get printed saying exactly why we thought the db was readonly and that recovery code not run as it found necessary writable work to do. Unfortunately I think we still want to support finding some log files in a readonly db as a very normal situation would be for a customer to load up a db, then do a clean shutdown and then copy the full db to the cd. This will still leave some log files, but no work to do. so, if nothing else is done I think it would be worthwhile to always log why we are booting a db as readonly, especially if it is not the standard place where we expect in getting the db lock. It would be nicer if this was just part of the boot message rather than an additional message but not sure that fits in the code flow.
I would be concerned about the last part suggested where we just ignore the flush error. It would be good to know if the problems with this wierd installation are also problems with the expected read only medium. To me the following are bugs: o any attempt to add a writable page to the cache for a non-temp table should just get an early error, rather than fail later at flush. o any writable log action to a readonly db would be nice to get a reasonable error, rather npe. this just makes support easier.
Kathey Marsden made changes - 27/Feb/09 07:11 PM
I think for starters I will just log the places and the reason where we change to ReadOnly in LogToFile where we weren't ReadOnly before and leave the NPE. If we add the ReadOnly change logging that will come first in the log and will give sufficient information as to what the problem is. I think this will not affect existing read only users because their database will have been set ReadOnly earlier when we were accessing the lock. I will test a full read only database just to make sure. Attached is a proposed path to improve the error logging when an error gaining write access to the log files causes the database status to change to read only (derby-4072_diff.txt)
Attached also is a sample derby.log showing the new warning. I confirmed that for a totally read only database we just get the READ ONLY boot message and these warnings don't show up. I am running test now. Sorry for originally accidentally attaching this to DERBY-4077. Can't seem to keep my Jira issues straight today.
Kathey Marsden made changes - 27/Feb/09 09:27 PM
Kathey Marsden made changes - 28/Feb/09 01:28 AM
I looked at the code and it seems reasonable to me.
I reviewed the proposed changes, they look good to me. It will be nice to get more info to debug this on remote user sites.
Thanks Myrna and Mike for looking at the patch. I submitted to trunk and will backport to 10.4,10.3,10.2, and 10.1 before closing this issue.
Kathey Marsden made changes - 06/Mar/09 05:34 PM
Myrna van Lunteren made changes - 04/May/09 06:22 PM
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||