Issue Details (XML | Word | Printable)

Key: DERBY-4072
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Kathey Marsden
Reporter: Kathey Marsden
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

shutdown with incorrect permission on log files shows java.lang.NullPointerException at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964). Should give bettter message.

Created: 25/Feb/09 10:52 PM   Updated: 04/May/09 06:22 PM
Component/s: Store
Affects Version/s: 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.2.0, 10.5.1.1
Fix Version/s: 10.1.3.3, 10.2.2.1, 10.3.3.1, 10.4.2.1, 10.5.1.1

Time Tracking:
Not Specified

File Attachments:
  Size
Java Source File Licensed for inclusion in ASF works ConnectAndShutdown.java 2009-02-25 10:53 PM Kathey Marsden 0.6 kB
Text File Licensed for inclusion in ASF works derby-4072_diff.txt 2009-02-27 09:27 PM Kathey Marsden 3 kB
Java Source File Licensed for inclusion in ASF works InsertALot.java 2009-02-25 10:53 PM Kathey Marsden 1 kB
Java Source File Licensed for inclusion in ASF works MakeDB.java 2009-02-25 10:53 PM Kathey Marsden 0.7 kB
Text File Licensed for inclusion in ASF works sample.derby.log 2009-02-27 09:27 PM Kathey Marsden 6 kB

Issue & fix info: Patch Available
Resolution Date: 06/Mar/09 05:34 PM


 Description  « Hide
I recently saw case where a user was seeing the following error in the derby.log when trying to shutdown their database.
New exception raised during cleanup null
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:1781)
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(BaseDataFileFa
        at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:761
        at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:610)
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Conc
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanCache(ConcurrentCac
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAll(ConcurrentCache
        at org.apache.derby.impl.services.cache.ConcurrentCache.shutdown(ConcurrentCache
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.stop(BaseDataFileFac
        at org.apache.derby.impl.services.monitor.TopService.stop(TopService.java:405)
        at org.apache.derby.impl.services.monitor.TopService.shutdown(TopService.java:34
        at org.apache.derby.impl.services.monitor.BaseMonitor.shutdown(BaseMonitor.java:
        at org.apache.derby.impl.db.DatabaseContextImpl.cleanupOnError(DatabaseContextIm
        at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextM
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(Transaction
        at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:584)
        at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
        at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
        at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
        at java.sql.DriverManager.getConnection(DriverManager.java:316)
        at java.sql.DriverManager.getConnection(DriverManager.java:273)

It ended up that some of the log files did not have proper write permissions because some operation on the database had been performed by root. They had subsequently deleted their db.lck file so the database did not boot READ ONLY as it would if the root owned db.lck file still existed and the symptom was that they got this error on shutdown.

Clearly this was user error, but it would have been good if we gave a better error message. To reproduce on Linux:
As a user with umask 0022, run the program
java MakeDB
this will make the databases wombat and create a table.
su root
with umask 0022, run the program to insert data and remove the db.lck file:
java InsertALot
rm wombat/db.lck
go back to the original user
run the program:
java ConnectAndShutdown

The application gets the normal shutdown exception but if you look in derby.log you will see the exception.
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        ...
I will attach the files.


 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Kathey Marsden added a comment - 25/Feb/09 10:53 PM
Attaching files for reproduction. See description for reproduction information.

Kathey Marsden made changes - 25/Feb/09 10:53 PM
Field Original Value New Value
Attachment InsertALot.java [ 12400971 ]
Attachment ConnectAndShutdown.java [ 12400972 ]
Attachment MakeDB.java [ 12400970 ]
Kathey Marsden made changes - 25/Feb/09 11:07 PM
Description I recently saw case where a user was seeing the following error in the derby.log when trying to shutdown their database.
New exception raised during cleanup null
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:1781)
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(BaseDataFileFa
        at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:761
        at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:610)
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Conc
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanCache(ConcurrentCac
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAll(ConcurrentCache
        at org.apache.derby.impl.services.cache.ConcurrentCache.shutdown(ConcurrentCache
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.stop(BaseDataFileFac
        at org.apache.derby.impl.services.monitor.TopService.stop(TopService.java:405)
        at org.apache.derby.impl.services.monitor.TopService.shutdown(TopService.java:34
        at org.apache.derby.impl.services.monitor.BaseMonitor.shutdown(BaseMonitor.java:
        at org.apache.derby.impl.db.DatabaseContextImpl.cleanupOnError(DatabaseContextIm
        at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextM
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(Transaction
        at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:584)
        at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
        at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
        at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
        at java.sql.DriverManager.getConnection(DriverManager.java:316)
        at java.sql.DriverManager.getConnection(DriverManager.java:273)

It ended up that some of the log files did not have proper write permissions because some operation on the database had been performed by root. They had subsequently deleted their db.lck file so the database did not boot READ ONLY as it would if the root owned db.lck file still existed and the symptom was that they got this error on shutdown.

Clearly this was user error, but it would have been good if we gave a better error message. To reproduce on Linux:
As a user with umask 0022, run the program
java MakeDB
this will make the databases wombat and create a table.
su root
run the program to insert data and remove the db.lck file:
java InsertALot
rm wombat/db.lck
go back to the original user
run the program:
java ConnectAndShutdown

The application gets the normal shutdown exception but if you look in derby.log you will see the exception.
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        ...
I will attach the files.
I recently saw case where a user was seeing the following error in the derby.log when trying to shutdown their database.
New exception raised during cleanup null
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:1781)
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.flush(BaseDataFileFa
        at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:761
        at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:610)
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Conc
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanCache(ConcurrentCac
        at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAll(ConcurrentCache
        at org.apache.derby.impl.services.cache.ConcurrentCache.shutdown(ConcurrentCache
        at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.stop(BaseDataFileFac
        at org.apache.derby.impl.services.monitor.TopService.stop(TopService.java:405)
        at org.apache.derby.impl.services.monitor.TopService.shutdown(TopService.java:34
        at org.apache.derby.impl.services.monitor.BaseMonitor.shutdown(BaseMonitor.java:
        at org.apache.derby.impl.db.DatabaseContextImpl.cleanupOnError(DatabaseContextIm
        at org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(ContextM
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.cleanupOnError(Transaction
        at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:584)
        at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
        at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
        at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
        at java.sql.DriverManager.getConnection(DriverManager.java:316)
        at java.sql.DriverManager.getConnection(DriverManager.java:273)

It ended up that some of the log files did not have proper write permissions because some operation on the database had been performed by root. They had subsequently deleted their db.lck file so the database did not boot READ ONLY as it would if the root owned db.lck file still existed and the symptom was that they got this error on shutdown.

Clearly this was user error, but it would have been good if we gave a better error message. To reproduce on Linux:
As a user with umask 0022, run the program
java MakeDB
this will make the databases wombat and create a table.
su root
with umask 0022, run the program to insert data and remove the db.lck file:
java InsertALot
rm wombat/db.lck
go back to the original user
run the program:
java ConnectAndShutdown

The application gets the normal shutdown exception but if you look in derby.log you will see the exception.
java.lang.NullPointerException
        at org.apache.derby.impl.store.raw.log.LogToFile.flush(LogToFile.java:3964)
        ...
I will attach the files.
Kathey Marsden added a comment - 27/Feb/09 06:04 PM
So the exception we are not reporting is:
java.io.FileNotFoundException: /home/kmarsden/repro/10602/wombat/log/log28.dat (Permission denied)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:218)
at org.apache.derby.impl.io.DirRandomAccessFile.<init>(DirRandomAccessFile.java:57)
at org.apache.derby.impl.io.DirFile4.getRandomAccessFile(DirFile4.java:250)
at org.apache.derby.impl.store.raw.log.LogToFile.run(LogToFile.java:5689)
at java.security.AccessController.doPrivileged(AccessController.java:251)
at org.apache.derby.impl.store.raw.log.LogToFile.privRandomAccessFile(LogToFile.java:5579)
at org.apache.derby.impl.store.raw.log.LogToFile.checkJvmSyncError(LogToFile.java:5522)
at org.apache.derby.impl.store.raw.log.LogToFile.openLogFileInWriteMode(LogToFile.java:5471)
at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:1074)
at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:339)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1856)
at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1722)
at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2581)
at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:374)
at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
at java.sql.DriverManager.getConnection(DriverManager.java:316)
at java.sql.DriverManager.getConnection(DriverManager.java:273)
at ConnectAndShutdown.main(ConnectAndShutdown.java:9)java.io.FileNotFoundException:

and the correspoinding code in LogToFile is ~1067
if (!ReadOnlyDB)
{
// if datafactory doesn't think it is readonly, we can
// do some futher test of our own
try
{
if(isWriteSynced)
theLog = openLogFileInWriteMode(logFile);
else
theLog = privRandomAccessFile(logFile, "rw");
}
catch (IOException ioe)
{
theLog = null;
}
                        if (theLog == null || !privCanWrite(logFile))
{
if (theLog != null)
theLog.close();
theLog = null;

ReadOnlyDB = true;
}
}
We catch the exception and silently change the database to ReadOnly.
Later ~~`1172 we fail to initialize logOut because theLog is null
if (theLog != null)
logOut = new LogAccessFile(this, theLog, logBufferSize);

There is another area of similar code ~986

finally during flush logOut is null so we get the NPE.

I was thinking we should
1) Log the IOExceptions.
2) Log whenever we change a database to ReadOnly.
3) In flush() if the database is ReadOnly print a warning to the log that we cannot flush the log to a ReadOnlyDB and return.

but I don't know if this is overly verbose. Are there normal instances of the IOException we might want to silently ignore?


Mike Matrigali added a comment - 27/Feb/09 06:40 PM
I am not sure best way to do the following, but thought I would at least first comment on what the goal should be. The whole "readonly" code I think is weak. It was mostly put in to allow a derby db to be put on and run directly from something like a cd, and to be zero-admin, ie. not force people to somehow createdb a readonly db. We expect to determine a readonly db when we get the lock on the database lock file which requires writing to the file. it never really was
designed for these "half-readonly" cases.

The problem as this case shows is that catching exceptions and interpreting them may lead to the wrong conclusions.

So for instance I would say that we should not boot if we think the db is readonly but we find work to do during restart recovery. This is effectively what the code does but delays the error to sometime in the future and a NPE. The possible problems with allowing boot to continue are:
o if redo recovery has anything to do then it will fill up the cache with dirty pages, and on shutdown
   or earlier we may try to write those pages out and if the db is read only get errors.
o if redo recovery has anything to abort then it will have to write extra stuff to the log file and if
   existing log file is not writable or new logfiles are not writeable then we may get wierd errors
   during boot.
Both the above may be existing problems in the code with a readonly cd version, I don't know.
I think in both these cases we should fail the boot and make sure somehow errors get printed saying exactly why we thought the db was readonly and that recovery code not run as it found
necessary writable work to do.

Unfortunately I think we still want to support finding some log files in a readonly db as a very normal situation would be for a customer to load up a db, then do a clean shutdown and then copy the full db to the cd. This will still leave some log files, but no work to do.


Mike Matrigali added a comment - 27/Feb/09 06:50 PM
so, if nothing else is done I think it would be worthwhile to always log why we are booting a db as readonly, especially if it is not the standard place where we expect in getting the db lock. It would be nicer if this was just part of the boot message rather than an additional message but not sure that fits in the code flow.

I would be concerned about the last part suggested where we just ignore the flush error.

It would be good to know if the problems with this wierd installation are also problems with the expected read only medium. To me the following are bugs:
o any attempt to add a writable page to the cache for a non-temp table should just get an early
    error, rather than fail later at flush.
o any writable log action to a readonly db would be nice to get a reasonable error, rather npe.
   this just makes support easier.

Kathey Marsden made changes - 27/Feb/09 07:11 PM
Assignee Kathey Marsden [ kmarsden ]
Kathey Marsden added a comment - 27/Feb/09 07:13 PM

I think for starters I will just log the places and the reason where we change to ReadOnly in LogToFile where we weren't ReadOnly before and leave the NPE. If we add the ReadOnly change logging that will come first in the log and will give sufficient information as to what the problem is.

I think this will not affect existing read only users because their database will have been set ReadOnly earlier when we were accessing the lock. I will test a full read only database just to make sure.

Kathey Marsden added a comment - 27/Feb/09 09:27 PM
Attached is a proposed path to improve the error logging when an error gaining write access to the log files causes the database status to change to read only (derby-4072_diff.txt)

Attached also is a sample derby.log showing the new warning.

I confirmed that for a totally read only database we just get the READ ONLY boot message and these warnings don't show up.

I am running test now.

Sorry for originally accidentally attaching this to DERBY-4077. Can't seem to keep my Jira issues straight today.

Kathey Marsden made changes - 27/Feb/09 09:27 PM
Attachment derby-4072_diff.txt [ 12401138 ]
Attachment sample.derby.log [ 12401137 ]
Kathey Marsden added a comment - 28/Feb/09 01:28 AM
Tests passed except for known failure DERBY-3993. Please review.


Kathey Marsden made changes - 28/Feb/09 01:28 AM
Derby Info [Patch Available]
Myrna van Lunteren added a comment - 02/Mar/09 06:37 PM
I looked at the code and it seems reasonable to me.

Mike Matrigali added a comment - 02/Mar/09 06:45 PM
I reviewed the proposed changes, they look good to me. It will be nice to get more info to debug this on remote user sites.

Repository Revision Date User Message
ASF #749410 Mon Mar 02 18:49:51 UTC 2009 kmarsden DERBY-4072 Improve error message if log files are not writable.
Files Changed
MODIFY /db/derby/code/trunk/java/shared/org/apache/derby/shared/common/reference/MessageId.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/loc/messages.xml

Kathey Marsden added a comment - 02/Mar/09 07:02 PM
Thanks Myrna and Mike for looking at the patch. I submitted to trunk and will backport to 10.4,10.3,10.2, and 10.1 before closing this issue.

Repository Revision Date User Message
ASF #749656 Tue Mar 03 17:00:33 UTC 2009 kmarsden DERBY-4072 Improve error logging if we can't write to log files
Files Changed
MODIFY /db/derby/code/branches/10.4/java/shared/org/apache/derby/shared/common/reference/MessageId.java
MODIFY /db/derby/code/branches/10.4/java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java
MODIFY /db/derby/code/branches/10.4/java/engine/org/apache/derby/loc/messages.xml

Repository Revision Date User Message
ASF #749657 Tue Mar 03 17:01:22 UTC 2009 kmarsden DERBY-4072 Improve error logging if we can't write to log files
Files Changed
MODIFY /db/derby/code/branches/10.3/java/shared/org/apache/derby/shared/common/reference/MessageId.java
MODIFY /db/derby/code/branches/10.3/java/engine/org/apache/derby/loc/messages.xml
MODIFY /db/derby/code/branches/10.3/java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java

Repository Revision Date User Message
ASF #749658 Tue Mar 03 17:02:04 UTC 2009 kmarsden DERBY-4072 Improve error logging if we can't write to log files.
Files Changed
MODIFY /db/derby/code/branches/10.2/java/engine/org/apache/derby/iapi/services/monitor/Monitor.java
MODIFY /db/derby/code/branches/10.2/java/shared/org/apache/derby/shared/common/reference/MessageId.java
MODIFY /db/derby/code/branches/10.2/java/engine/org/apache/derby/loc/messages_en.properties
MODIFY /db/derby/code/branches/10.2/java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java

Repository Revision Date User Message
ASF #749718 Tue Mar 03 19:57:18 UTC 2009 kmarsden DERBY-4072 improve message if log files cannot be accessed.
Files Changed
MODIFY /db/derby/code/branches/10.1/java/engine/org/apache/derby/iapi/services/monitor/Monitor.java
MODIFY /db/derby/code/branches/10.1/java/engine/org/apache/derby/iapi/reference/MessageId.java
MODIFY /db/derby/code/branches/10.1/java/engine/org/apache/derby/loc/messages_en.properties
MODIFY /db/derby/code/branches/10.1/java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java

Kathey Marsden made changes - 06/Mar/09 05:34 PM
Fix Version/s 10.5.0.0 [ 12313010 ]
Resolution Fixed [ 1 ]
Fix Version/s 10.4.2.1 [ 12313401 ]
Fix Version/s 10.2.2.1 [ 12312251 ]
Fix Version/s 10.3.3.1 [ 12313143 ]
Status Open [ 1 ] Closed [ 6 ]
Fix Version/s 10.1.3.3 [ 12313478 ]
Myrna van Lunteren made changes - 04/May/09 06:22 PM
Fix Version/s 10.5.0.0 [ 12313010 ]
Fix Version/s 10.5.1.1 [ 12313771 ]
Affects Version/s 10.5.0.0 [ 12313010 ]
Affects Version/s 10.5.1.1 [ 12313771 ]