Derby
DERBY-3347

ERROR XSDB3: Container information cannot change once written

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.3.1.4, 10.3.2.1
    • Fix Version/s: 10.3.3.0, 10.4.1.3, 10.5.1.1
    • Component/s: Store
    • Labels:
      None
    • Environment:
      Windows 2003 Server
      Sun Java 1.6.0_03
    • Issue & fix info:
      Release Note Needed
    • Bug behavior facts:
      Regression

      Description

      We are using derby as an embedded DB for our data collection server. During an endurance test in which we perform around 270 inserts and 9 updates per second for about a week, I occasionally see the error below in the derby log (and nothing else besides it).

      This is a vanilla installation, we run derby embedded with no extra configuration. I can confirm that there is no memory problem, the heap usage seems constant over time.

      Can somebody provide some more information regarding the effects of this error? By looking at the stacktrace, it looks like a checkpoint operation is aborted due to some inconsistency in the internal data structure. If the error does not repeat immediately, does it mean that the next checkpoint is successful and there is no data loss?

      I can't provide a test case for this; the error happens after about 1-2 days of running our software. I will rerun the test with the debug jars to capture the line numbers in the stacktrace. Also, I'm starting another test with 10.2.2.0 to see if this problem was introduced in the latest version.

      There are two other bugs referring to this error (https://issues.apache.org/jira/browse/DERBY-2284 and https://issues.apache.org/jira/browse/DERBY-3087), but they seem to happen in response to some client action. This use case is a bit different: the client keeps inserting and updating records for several days in a steady manner, and at some point the error pops up.

      And lastly, here is the exception:

      Checkpoint Daemon caught standard exception

      ------------ BEGIN ERROR STACK -------------

      ERROR XSDB3: Container information cannot change once written: was 0, now 80
      at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
      at org.apache.derby.impl.store.raw.data.AllocPage.WriteContainerInfo(Unknown Source)
      at org.apache.derby.impl.store.raw.data.FileContainer.writeHeader(Unknown Source)
      at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
      at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
      at org.apache.derby.impl.services.cache.CachedItem.clean(Unknown Source)
      at org.apache.derby.impl.services.cache.Clock.cleanCache(Unknown Source)
      at org.apache.derby.impl.services.cache.Clock.cleanAll(Unknown Source)
      at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.checkpoint(Unknown Source)
      at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(Unknown Source)
      at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(Unknown Source)
      at org.apache.derby.impl.store.raw.RawStore.checkpoint(Unknown Source)
      at org.apache.derby.impl.store.raw.log.LogToFile.performWork(Unknown Source)
      at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
      at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
      at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:619)

      ------------ END ERROR STACK -------------

      1. d3347-preview.diff
        14 kB
        Knut Anders Hatlen
      2. d3347-preview.diff
        14 kB
        Knut Anders Hatlen
      3. d3347-1a.diff
        18 kB
        Knut Anders Hatlen
      4. d3347-1a.stat
        0.3 kB
        Knut Anders Hatlen
      5. d3347-2a.diff
        3 kB
        Knut Anders Hatlen
      6. d3347-1b.diff
        18 kB
        Knut Anders Hatlen
      7. MiniStress.zip
        3 kB
        Kathey Marsden
      8. FileOffset.java
        2 kB
        Knut Anders Hatlen
      9. releaseNote.html
        5 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Bogdan Calmac added a comment -

          All right, I was able to reproduce the problem using the debug jars for 10.3.2.1; the error happened 4 times overnight. The 10.2.2.0 test has gone fine so far, so this might be a problem introduced in 10.3. Another thing I want to mention is that the test runs on a quad-core CPU, so this could be a concurrency issue. And now here is the stacktrace with line numbers:

          Checkpoint Daemon caught standard exception

          ------------ BEGIN ERROR STACK -------------

          ERROR XSDB3: Container information cannot change once written: was 0, now 80
          at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:301)
          at org.apache.derby.impl.store.raw.data.AllocPage.WriteContainerInfo(AllocPage.java:575)
          at org.apache.derby.impl.store.raw.data.FileContainer.writeHeader(FileContainer.java:900)
          at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(RAFContainer.java:690)
          at org.apache.derby.impl.store.raw.data.RAFContainer.clean(RAFContainer.java:534)
          at org.apache.derby.impl.services.cache.CachedItem.clean(CachedItem.java:173)
          at org.apache.derby.impl.services.cache.Clock.cleanCache(Clock.java:1381)
          at org.apache.derby.impl.services.cache.Clock.cleanAll(Clock.java:619)
          at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.checkpoint(BaseDataFileFactory.java:1202)
          at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(LogToFile.java:1521)
          at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(LogToFile.java:1360)
          at org.apache.derby.impl.store.raw.RawStore.checkpoint(RawStore.java:440)
          at org.apache.derby.impl.store.raw.log.LogToFile.performWork(LogToFile.java:3411)
          at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
          at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
          at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
          at java.lang.Thread.run(Thread.java:619)

          ------------ END ERROR STACK -------------

          Knut Anders Hatlen added a comment -

          Just guessing here... The numbers 0 and 80 make me suspect that this is an issue with borrowedSpace. I think those are the only valid values for borrowedSpace. DERBY-3099 made some changes to the space calculation in AllocPage and may be the culprit. It was introduced in 10.3.2.1. Though, DERBY-2284 is reported against 10.3.1.4, so it might be something else that's causing it. Bogdan, is it possible for you to run with Derby 10.3.1.4 and see if the same happens there?

          There is also another bug, DERBY-3116, which mentions incorrect space calculation involving borrowedSpace. It has a patch attached. If it's not too much trouble for you, it would be very interesting if you could download the 10.3.2.1 sources, apply the DERBY-3116 patch and see if that fixes your problem.

          Bogdan Calmac added a comment -

          Sure, I'll go ahead with your suggestions and let you know how it goes.

          Bogdan Calmac added a comment -

          All right, I've run the endurance test for 10.3.1.4 and 10.3.2.1 + 3116 and the error is still there, numerous times.

          This is a pretty serious issue in terms of reliability. Another thing is that after a bunch of these errors the server blows up with JDBC exceptions (probably because of the failed checkpoints).

          Knut Anders Hatlen added a comment -

          Thanks for taking the time to run these experiments, Bogdan!

          At least, now we know the problem is not caused by DERBY-3099 or DERBY-3116, and we know that it must have been introduced some time after 10.2.2.0 and before 10.3.1.4.

          One thing I know was changed in that area of the code in that period is that RAFContainer started using java.nio.* to allow parallel reads and writes on JVMs that support NIO. It is possible to force the use of the old mechanism by deleting or commenting out the following lines in java/engine/org/apache/derby/modules.properties and rebuilding derby.jar:

          # store data using a StorageFactory
          # Enhanced version using NIO API; requires Java 1.4
          derby.module.rawStore.data.genericJ4=org.apache.derby.impl.store.raw.data.BaseDataFileFactoryJ4
          derby.env.jdk.rawStore.data.genericJ4=4
          derby.env.classes.rawStore.data.genericJ4=java.nio.Buffer
          cloudscape.config.rawStore.data.genericJ4=derby

          It would be great if you could make that change and rerun your test.

          Knut Anders Hatlen added a comment -

          Marking the issue as a regression since it is reported to work on Derby 10.2.2.0.

          Bogdan Calmac added a comment -

          I've started the test for 10.3.2.1 without NIO and in the first 8 hours I got the exception below. I will have to stop the test tonight and restart it on Monday. I'll keep you posted how it goes.

          2008-02-01 20:31:59.903 GMT Thread[derby.rawStoreDaemon,5,derby.daemons] Cleanup action starting
          java.lang.NullPointerException
          at org.apache.derby.impl.store.access.btree.ControlRow.getControlRowForPage(ControlRow.java:878)
          at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:844)
          at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)
          at org.apache.derby.impl.store.access.btree.BTreePostCommit.purgeRowLevelCommittedDeletes(BTreePostCommit.java:462)
          at org.apache.derby.impl.store.access.btree.BTreePostCommit.performWork(BTreePostCommit.java:278)
          at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
          at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
          at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
          at java.lang.Thread.run(Thread.java:619)
          Cleanup action completed

          Bogdan Calmac added a comment -

          Since 10.3.2.1 without NIO is not a tested combination, as illustrated by the exception above (so I wouldn't use it in production), I continued testing 10.3.2.1 and 10.2.2.0 side by side.

          One interesting thing that I noticed is that the test is successful if the maximum heap memory is small. With only 128M, 10.3.2.1 did not fail for a week. But with 800M (or 512M in the original tests) the error comes up consistently. Could it be related to weak/soft references?

          Is there somebody familiar enough with the code to try to look into this at code level? This seems like a pretty important reliability bug to me, so I will try to debug the code, but it will be a while before I become comfortable with the code base.

          Dyre Tjeldvoll added a comment -

          The call stack in your comment from 2008-02-01 looks like DERBY-3362.

          Kathey Marsden added a comment -

          Bogdan, can you try the latest on the 10.3 branch, now that DERBY-3362 has been fixed, and see if you still see the problem?

          Bogdan Calmac added a comment -

          Sure, I can try it, but DERBY-3362 probably has nothing to do with the original problem. My comment from Feb 1st refers to another issue that prevented me from running the test.

          Bogdan Calmac added a comment -

          Would it be valuable to still disable NIO at code level or should I use the unmodified 10.3 branch?

          Kathey Marsden added a comment -

          I think it would be valuable to try with NIO disabled. As I understand it, you were blocked from doing that experiment because of DERBY-3362.

          Bogdan Calmac added a comment -

          Yes, that is correct. OK.

          Bogdan Calmac added a comment -

          I have run my test for a week with 10.3.2.1 from SVN and NIO disabled, and the error did not occur. So we can guess that the issue is related to concurrency in the new NIO-based code.

          Kathey Marsden added a comment -

          There is another user that may be hitting this same issue. The trace is
          ERROR XSDB3: Container information cannot change once written: was 0, now 80
          at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
          at org.apache.derby.impl.store.raw.data.AllocPage.WriteContainerInfo(Unknown Source)
          at org.apache.derby.impl.store.raw.data.FileContainer.writeHeader(Unknown Source)
          at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown Source)
          at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
          at org.apache.derby.impl.services.cache.CachedItem.clean(Unknown Source)
          at org.apache.derby.impl.services.cache.Clock.cleanCache(Unknown Source)
          at org.apache.derby.impl.services.cache.Clock.cleanAll(Unknown Source)
          at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.checkpoint(Unknown Source)
          at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(Unknown Source)
          at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(Unknown Source)
          at org.apache.derby.impl.store.raw.RawStore.checkpoint(Unknown Source)
          at org.apache.derby.impl.store.raw.log.LogToFile.performWork(Unknown Source)
          at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown Source)
          at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
          at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
          at java.lang.Thread.run(Thread.java:801)

          Would it be worthwhile to have a runtime option (something you can put in the derby.properties) to disable NIO or does anyone have ideas on how to proceed with this issue?

          Myrna van Lunteren added a comment -

          I'd think an option to disable NIO as a workaround, until something better can be found, may be helpful to the users hitting this. (Although possibly not so helpful to whoever picks this up to fix.)

          Knut Anders Hatlen added a comment -

          I think I see how this can happen.

          When we read or write through the page cache, we use FileChannels to
          allow multiple threads to access the same data file in parallel (after
          DERBY-801). The first alloc page in a file can also be accessed
          directly by the container object (via the container cache), in which
          case RandomAccessFile.seek() + read/write is used instead of
          FileChannel.

          Theoretically, this should work because the FileChannel operations use
          absolute positions and should therefore not influence the positioning
          of the read/write performed by the container object, and
          synchronization ensures that RAF.seek+read/write only happens in one
          thread at a time. However, it seems like on some platforms[1] the
          position of the RandomAccessFile object can be changed when operations
          using absolute positioning are performed on the FileChannel object
          returned by getChannel() on the RAF. I guess this is to be expected,
          since FileChannel's javadoc only guarantees that the operations on a
          FileChannel object are thread safe, not that a mix of operations on
          FileChannel objects and RandomAccessFile objects is thread safe.

          So what I think is happening is that the checkpoint code is cleaning
          the container cache. At the same time, the page cache initiates the
          cleaning of a page in the same container as the container cache is
          cleaning, resulting in a FileChannel operation which changes the
          position of the RandomAccessFile used by the container cache. The
          subsequent call to RandomAccessFile.readFully() therefore reads data
          from the wrong position in the file. When the container cache later
          attempts to write the data back to the file, the inconsistency is
          detected and the XSDB3 error is raised.

          I believe that we can fix this issue by rewriting the parts that don't
          yet use FileChannel, so that FileChannel is used consistently and the
          thread-safety guarantees in FileChannel's javadoc apply.

          [1] Some platforms == Windows. I've also tested it on Linux and
          Solaris, and there the position on the RandomAccessFile doesn't seem
          to be affected by operations using absolute positions.
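          The fix Knut proposes relies on the thread-safety guarantee that FileChannel's positional read/write methods never consult or mutate a shared file position. A minimal sketch of that pattern (hypothetical helper class, not Derby's actual code):

          ```java
          import java.io.EOFException;
          import java.io.File;
          import java.io.RandomAccessFile;
          import java.nio.ByteBuffer;
          import java.nio.channels.FileChannel;

          public class PositionalIO {
              // Read `length` bytes at absolute `offset` without touching any
              // shared file position. FileChannel.read(ByteBuffer, long) takes an
              // explicit position, unlike RandomAccessFile.seek() + readFully(),
              // which mutates the position shared by every user of the RAF.
              static byte[] readAt(FileChannel ch, long offset, int length)
                      throws Exception {
                  ByteBuffer buf = ByteBuffer.allocate(length);
                  while (buf.hasRemaining()) {
                      int n = ch.read(buf, offset + buf.position());
                      if (n < 0) throw new EOFException();
                  }
                  return buf.array();
              }

              public static void main(String[] args) throws Exception {
                  File f = File.createTempFile("demo", ".dat");
                  f.deleteOnExit();
                  try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                      raf.write("0123456789".getBytes("US-ASCII"));
                      FileChannel ch = raf.getChannel();
                      // Two readers at different offsets can run in parallel,
                      // since neither call moves a shared position.
                      System.out.println(new String(readAt(ch, 2, 3), "US-ASCII")); // prints 234
                      System.out.println(new String(readAt(ch, 5, 4), "US-ASCII")); // prints 5678
                  }
              }
          }
          ```

          As long as every access to the file goes through such positional calls on the same FileChannel, the javadoc's thread-safety guarantee applies; the bug arises precisely because some accesses bypassed the channel.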

          Thomas Nielsen added a comment - edited

          Knut,

          Do you think that this, or a rather similar scenario, could also be the reason for some of the checksum inconsistencies that have been reported lately?
          I'm envisioning a scenario where, instead of a parallel cache clean, there's a parallel flush and/or checksum calculation performed by the checkpoint and file cache, ultimately causing the wrong checksum to be flushed to disk.

          Most inconsistencies seem to be related to Windows, and different RandomAccessFile behaviour may be the culprit.

          Just an idea...

          Dyre Tjeldvoll added a comment -

          Knut, I see that you have assigned yourself to this issue. Does that mean you have a fix in the pipeline? If it could be merged to 10.4 fairly soon, I think we should consider spinning another RC with this fix...

          Knut Anders Hatlen added a comment -

          Thomas:
          > Do you think that this, or a rather similar scenario, could also be the reason for some of the checksum inconsistencies that have been reported lately?

          Yes, that could be the reason, at least if the checksum inconsistencies happen on page 0 (which is the first alloc page). There are also a lot of reports about EOFException thrown by RandomAccessFile.readFully() in FileContainer.getEmbryonicPage() (all seen on Windows only), which I think are likely caused by the same problem.

          Dyre: Yes, I have a patch in the pipeline. I haven't tested it yet, but I'll do so and post it tomorrow. I'll leave it up to you to decide whether it is worth a new RC. If it fixes these problems, I think it is worth it, but it is probably difficult to verify the fix that quickly.

          Knut Anders Hatlen added a comment -

          Here's a preview patch for those who are interested. It is not ready for commit (the regression tests haven't completed yet, the code needs more comments, and some old comments need to be updated), but it should give an impression of how I'm planning to solve it. Basically, it factors out the occurrences of RandomAccessFile.seek()+read/write in FileChannel into separate methods that can be overridden by the RAFContainer4 class and replaced with FileChannel calls. I also needed to change the way RAFContainer4 initialized the field ourChannel, since with these changes it is too late to initialize it at the end of createContainer() and openContainer().
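To make the refactoring described above concrete, here is a minimal standalone sketch of the pattern: the seek()+read sequence is pulled into an overridable method, and an NIO subclass replaces it with a positional FileChannel read. All class and method names here (Container, NioContainer, readAt) are invented for illustration and are not Derby's actual code.

```java
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Base class: old-style I/O through the shared file pointer.
class Container {
    protected final RandomAccessFile raf;
    Container(RandomAccessFile raf) { this.raf = raf; }

    // Before the refactoring, callers did raf.seek(pos); raf.readFully(buf)
    // inline. Factoring it into one method lets a subclass swap in NIO.
    protected void readAt(long pos, byte[] buf) throws IOException {
        synchronized (this) {       // seek + read must not interleave
            raf.seek(pos);
            raf.readFully(buf);
        }
    }
}

// NIO subclass: positional reads, no shared file pointer, no lock needed.
class NioContainer extends Container {
    private final FileChannel channel;
    NioContainer(RandomAccessFile raf) {
        super(raf);
        this.channel = raf.getChannel();
    }
    @Override
    protected void readAt(long pos, byte[] buf) throws IOException {
        ByteBuffer bb = ByteBuffer.wrap(buf);
        while (bb.hasRemaining()) {
            if (channel.read(bb, pos + bb.position()) < 0)
                throw new EOFException();
        }
    }
}

public class RefactorSketch {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("page", ".dat");
        f.deleteOnExit();
        try (RandomAccessFile w = new RandomAccessFile(f, "rw")) {
            w.write("0123456789".getBytes());
        }
        byte[] a = new byte[4], b = new byte[4];
        try (RandomAccessFile r = new RandomAccessFile(f, "r")) {
            new Container(r).readAt(3, a);     // old-style path
            new NioContainer(r).readAt(3, b);  // overridden NIO path
        }
        System.out.println(new String(a) + " " + new String(b));
    }
}
```

Both paths read the same bytes; the difference is that the NIO override never touches the RandomAccessFile's shared file pointer.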

          Knut Anders Hatlen added a comment -

          Another related problem:

          Before DERBY-801, there would be only one thread reading/writing the same data file at a time. After DERBY-801, there could be many threads accessing the same data file concurrently, but this should be OK since they would access different pages (read/write to the same page will be serialized by BasePage.setExclusive()). There is however no mechanism, as far as I can see, that prevents concurrent updates of the container information in page 0 and normal page updates that touch page 0. So I think it is possible that updates to the container information section of page 0 are lost because concurrent updates of AllocPage 0 overwrite the new information with the old value. We probably need to find a way to prevent AllocPage 0 from being cleaned while the container is being cleaned (that is, RAFContainer4.writePage(pageId=0) and RAFContainer.writeRAFHeader() must be serialized somehow).

          Knut Anders Hatlen added a comment -

          The preview patch caused two failures in derbyall on Solaris 10 with JDK 6. unit/recoveryTest.unit and store/dropcrash2.java failed with an assert failure in AllocPage.ReadContainerInfo(): ASSERT FAILED N not what is expected : 0

          Knut Anders Hatlen added a comment -

          Attaching an updated preview patch, still not ready for commit. The updated patch fixes the assert failures seen in the recovery tests. The problem was that the new methods in RAFContainer4 used the FileChannel associated with RAFContainer.fileData instead of the FileChannel associated with the file argument. The fileData field and the file argument are normally the same object, but not when a container is stubbified, so redo of DROP TABLE operations would fail. Now the recovery tests pass.

          Knut Anders Hatlen added a comment -

          I've added more comments and attached a new patch (d3347-1a) which is ready for review. It only addresses the problem with the mix of old-style I/O and NIO on some platforms. The problem with multiple threads accessing page 0 concurrently will have to be fixed in a follow-up patch.

          Knut Anders Hatlen added a comment -

          I'm also attaching a patch (d3347-2a) for the problem with concurrent I/O operations on page 0.

          This patch adds a wrapper around the methods readPage() and writePage() in RAFContainer4. If the page that is accessed is FileContainer.FIRST_ALLOC_PAGE_NUMBER, the method calls are synchronized on the container object, and therefore behave just like their non-NIO counterparts in RAFContainer. Since the code that accesses the container information on the first alloc page also is synchronized on the container object, this means that we never have two threads accessing the first alloc page concurrently. For the rest of the pages, the synchronization is not necessary, since they can only be accessed through the page cache and therefore no two threads access the same page concurrently.

          I haven't run the regression tests on this patch yet.
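The synchronization scheme described in this comment can be sketched as below. All names here are hypothetical stand-ins for the real patch (which wraps RAFContainer4.readPage()/writePage()): I/O on the first alloc page takes the container monitor, which the header-writing code also holds, while other pages stay unsynchronized because the page cache already serializes access to each page.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PageZeroSerialization {
    static final long FIRST_ALLOC_PAGE_NUMBER = 0L;
    final AtomicLong headerWrites = new AtomicLong();

    void writePage(long pageNumber, byte[] data) {
        if (pageNumber == FIRST_ALLOC_PAGE_NUMBER) {
            // Page 0 shares the container monitor with header writes, so the
            // two can never interleave (mirrors the non-NIO RAFContainer).
            synchronized (this) { doWrite(pageNumber, data); }
        } else {
            // Other pages are reached only through the page cache, which
            // already prevents two threads from touching the same page.
            doWrite(pageNumber, data);
        }
    }

    // Stand-in for writeRAFHeader(): runs synchronized on the container.
    synchronized void writeHeader(byte[] header) {
        headerWrites.incrementAndGet();
    }

    private void doWrite(long pageNumber, byte[] data) {
        // a real implementation would issue a positional FileChannel write
    }

    public static void main(String[] args) throws InterruptedException {
        PageZeroSerialization c = new PageZeroSerialization();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) c.writePage(0, new byte[0]);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) c.writeHeader(new byte[0]);
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.headerWrites.get());
    }
}
```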

          Dag H. Wanvik added a comment -

          I don't know this code area very well, so caveats apply.

          • Patch 1: Analysis that some read/writes use FileChannel to access
            first page while others use the RandomAccessFile methods directly
            (and hence no guarantee for serializability) seems correct to
            me. Refactoring to remedy that looks good.

          The cleanup on use/initialization of FileChannel is good, too.

          RAFContainer4#write has lost a check for getCommittedDropState(). Is
          this intentional?

          Naming: FileContainer#readHeader doesn't do any reading any longer,
          since it no longer calls getEmbryonicPage. This is slightly
          confusing.

          BTW: (not introduced by this patch, but I noticed so I thought I'd
          mention it) RAFContainer4 (in contrast to RAFContainer) allocates an
          encryption buffer for every page write. Guess it needs to since the
          write method is not synchronized (and can thus not use the dedicated
          buffer safely). Since such objects can be large, maybe a pool would be good?
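Dag's pool suggestion could be sketched roughly as follows; this is an illustrative design, not Derby code. A lock-free queue hands out reusable direct buffers so that unsynchronized writers do not have to allocate a fresh encryption buffer for every encrypted page write.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// A minimal buffer pool: safe for concurrent use, grows on demand,
// never blocks. Buffers are recycled instead of re-allocated per write.
class BufferPool {
    private final ConcurrentLinkedQueue<ByteBuffer> pool =
            new ConcurrentLinkedQueue<>();
    private final int bufferSize;

    BufferPool(int bufferSize) { this.bufferSize = bufferSize; }

    ByteBuffer acquire() {
        ByteBuffer b = pool.poll();
        return (b != null) ? b : ByteBuffer.allocateDirect(bufferSize);
    }

    void release(ByteBuffer b) {
        b.clear();          // reset position/limit before reuse
        pool.offer(b);
    }
}

public class BufferPoolDemo {
    public static void main(String[] args) {
        BufferPool p = new BufferPool(8192);
        ByteBuffer b1 = p.acquire();
        p.release(b1);
        ByteBuffer b2 = p.acquire();  // the released buffer is reused
        System.out.println(b1 == b2);
    }
}
```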

          • Patch 2:
            Although after patch 1, writeRAFHeader always uses FileChannel I/O
            now, I agree there is a chance for the race you suggest. The patch
            is simple and looks good to me.
            +1

          Good detective work, Knut!

          Knut Anders Hatlen added a comment -

          Thanks for reviewing the patch, Dag!

          > RAFContainer4#write has lost a check for getCommittedDropState(). Is
          > this intentional?

          Good catch! No, it wasn't intentional. The new patch (1b) reintroduces the check, and also reintroduces the assert for !getCommittedDropState() in readPage().

          > Naming: FileContainer#readHeader doesn't do any reading any longer,
          > since it no longer calls getEmbryonicPage. This is slightly
          > confusing.

          Agreed, I think. Although the two methods it still invokes are called ReadContainerInfo() and readHeaderFromArray(), so I think there is some justification for keeping "read" in its name. If we come up with a better name, we can always rename it later.

          > BTW: (not introduced by this patch, but I noticed so I thought I'd
          > mention it) RAFContainer4 (in contrast to RAFContainer) allocates an
          > encryption buffer for every page write. Guess it needs to since the
          > write method is not synchronized (and can thus not use the dedicated
          > buffer safely). Since such objects can be large, maybe a pool would be good?

          (I guess you meant "every encrypted page write", not "every page write"...)

          Yes, that was also commented in the review of DERBY-801, and DERBY-2086 was logged for it.

          Knut Anders Hatlen added a comment -

          The 1b patch replaces the 1a patch. It only contains one change: The checks/asserts for getCommittedDropState() are reintroduced in readPage()/writePage().

          Knut Anders Hatlen added a comment -

          I committed the 2a patch to trunk with revision 647091.

          I have started the regression tests on the 1b patch as well. The changes from 1a are minimal, so I plan to commit it if the tests pass.

          Dag H. Wanvik added a comment -

          > Agreed, I think. Although the two methods it still invokes are called
          > ReadContainerInfo() and readHeaderFromArray(), so I think there is
          > some justification for keeping "read" in its name. If we come up with
          > a better name, we can always rename it later.

          Yes, I noticed there are several existing methods in there that are
          called read<Something> and write<Something> that do no I/O, but just
          analyze or format byte arrays. It would be good to clean up this code
          some, but I am OK with deferring for now.

          This comment in the StorageRandomAccessFile abstraction is also less
          precise/correct after the advent of RAFContainer4 as a user of
          StorageRandomAccessFile:

          "An implementation of StorageRandomAccessFile need not be thread
          safe. The database engine single-threads access to each
          StorageRandomAccessFile instance. Two threads will not access the
          same StorageRandomAccessFile instance at the same time."

          After you reintroduced the getCommittedDropState checks,
          +1 to the first patch as well!

          Knut Anders Hatlen added a comment -

          Thanks again, Dag. Patch 1b has been committed with revision 647139.

          I agree that the code and comments in this area would benefit from an overhaul. The abstractions in the original code were obviously chosen to match the RandomAccessFile way of accessing the data files, and the NIO code doesn't fit very well into that model. Some of the comments added as part of DERBY-801 also mention that a restructuring of the interfaces would be beneficial.

          Regarding the comment in StorageRandomAccessFile, I think it's correct, actually. The only parts of a StorageRandomAccessFile that need to be thread safe, are the methods on the FileChannel object returned by the getChannel() method. However, the SRAF interface doesn't know anything about FileChannel or getChannel(), it's just RAFContainer4 that knows that some SRAF implementations are subclasses of java.io.RAF, and therefore manages to retrieve FileChannels from them. None of the SRAF methods are ever called concurrently on the same object.
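The thread-safety property relied on here, that positional reads and writes on a single FileChannel may be issued from multiple threads without external synchronization (FileChannel is specified to be safe for use by multiple concurrent threads), can be seen in a small standalone example; the file name and byte values are arbitrary:

```java
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PositionalIoDemo {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("chan", ".dat");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            FileChannel ch = raf.getChannel();
            // Four threads write to disjoint offsets of the same channel,
            // with no locking: positional I/O has no shared file pointer.
            Thread[] ts = new Thread[4];
            for (int i = 0; i < 4; i++) {
                final int n = i;
                ts[i] = new Thread(() -> {
                    try {
                        ch.write(ByteBuffer.wrap(
                                new byte[] { (byte) ('A' + n) }), n);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
                ts[i].start();
            }
            for (Thread t : ts) t.join();

            // Read the whole region back with a positional read.
            byte[] out = new byte[4];
            ByteBuffer bb = ByteBuffer.wrap(out);
            while (bb.hasRemaining() && ch.read(bb, bb.position()) >= 0) { }
            System.out.println(new String(out));
        }
    }
}
```

Note this guarantee covers the channel's own methods only; the surrounding RandomAccessFile methods still require the single-threaded access that the SRAF comment describes.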

          Knut Anders Hatlen added a comment -

          I guess we should try to back-port the fix at least to 10.4. It would be great if we could get the fix verified first, but lacking a reliable repro that may be difficult.

          Dyre Tjeldvoll added a comment -

          +1 to merging to 10.4 (I added 10.4.1.2 as a fix-version, and have canceled the ongoing vote on 10.4.1.1 on derby-dev).

          Knut Anders Hatlen added a comment -

          I have merged the fixes to the 10.4 branch (revisions 647310 and 647312).
          I'm leaving the issue open for now, since the fix hasn't been verified.

          Bogdan Calmac added a comment -

          I can rerun my endurance test next week using 10.4.

          Knut Anders Hatlen added a comment -

          Thanks Bogdan, that would be great!

          Kathey Marsden added a comment -

          In working on another issue I hit this bug and thought I would post my program in case it can be expanded to become a more reliable reproduction. You run it with the number of threads as an argument, and you might want to bump up memory, e.g.

          java -Xmx512M MiniStress 50

          I ran the program once to the end without error and then reran on the same db and got the error. I haven't yet updated to get the fix.
          I ran it on my dual-core Windows XP box with Sun JDK 1.6. Maybe on a box with more CPUs it will pop the bug more reliably.

          Here is the trace from the derby.log

          ------------ BEGIN SHUTDOWN ERROR STACK -------------
          ERROR XSDG1: Page Page(1039,Container(0, 5856)) could not be written to disk, please check if disk is full.
              at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296)
              at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:824)
              at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:605)
              at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(ConcurrentCache.java:551)
              at org.apache.derby.impl.services.cache.ClockPolicy.rotateClock(ClockPolicy.java:476)
              at org.apache.derby.impl.services.cache.ClockPolicy.insertEntry(ClockPolicy.java:176)
              at org.apache.derby.impl.services.cache.ConcurrentCache.insertIntoFreeSlot(ConcurrentCache.java:208)
              at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:284)
              at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2412)
              at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2462)
              at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)
              at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)
              at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)
              at org.apache.derby.impl.store.access.btree.BTreePostCommit.doShrink(BTreePostCommit.java:123)
              at org.apache.derby.impl.store.access.btree.BTreePostCommit.performWork(BTreePostCommit.java:249)
              at org.apache.derby.impl.store.raw.xact.Xact.postTermination(Xact.java:2045)
              at org.apache.derby.impl.store.raw.xact.Xact.completeCommit(Xact.java:818)
              at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:854)
              at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:649)
              at org.apache.derby.impl.store.access.RAMTransaction.commit(RAMTransaction.java:1964)
              at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.doCommit(GenericLanguageConnectionContext.java:1211)
              at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.userCommit(GenericLanguageConnectionContext.java:1031)
              at org.apache.derby.impl.jdbc.TransactionResourceImpl.commit(TransactionResourceImpl.java:237)
              at org.apache.derby.impl.jdbc.EmbedConnection.commit(EmbedConnection.java:1661)
              at WorkerThread.run(WorkerThread.java:24)
          Caused by: java.nio.channels.ClosedChannelException
              at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:91)
              at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:642)
              at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:397)
              at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291)
              at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:782)
              ... 23 more
          ============= begin nested exception, level (1) ===========
          java.nio.channels.ClosedChannelException
              at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:91)
              at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:642)
              at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:397)
              at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291)
              at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:782)
              at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:605)
              at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(ConcurrentCache.java:551)
              at org.apache.derby.impl.services.cache.ClockPolicy.rotateClock(ClockPolicy.java:476)
              at org.apache.derby.impl.services.cache.ClockPolicy.insertEntry(ClockPolicy.java:176)
              at org.apache.derby.impl.services.cache.ConcurrentCache.insertIntoFreeSlot(ConcurrentCache.java:208)
              at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:284)
              at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2412)
              at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2462)
              at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)
              at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)
              at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)
              at org.apache.derby.impl.store.access.btree.BTreePostCommit.doShrink(BTreePostCommit.java:123)
              at org.apache.derby.impl.store.access.btree.BTreePostCommit.performWork(BTreePostCommit.java:249)
              at org.apache.derby.impl.store.raw.xact.Xact.postTermination(Xact.java:2045)
              at org.apache.derby.impl.store.raw.xact.Xact.completeCommit(Xact.java:818)
              at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:854)
              at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:649)
              at org.apache.derby.impl.store.access.RAMTransaction.commit(RAMTransaction.java:1964)
              at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.doCommit(GenericLanguageConnectionContext.java:1211)
              at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.userCommit(GenericLanguageConnectionContext.java:1031)
              at org.apache.derby.impl.jdbc.TransactionResourceImpl.commit(TransactionResourceImpl.java:237)
              at org.apache.derby.impl.jdbc.EmbedConnection.commit(EmbedConnection.java:1661)
              at WorkerThread.run(WorkerThread.java:24)
          ============= end nested exception, level (1) ===========
          ------------ END SHUTDOWN ERROR STACK -------------

          Show
          Kathey Marsden added a comment - In working on another issue I hit this bug and thought I would post my program in case it can be expanded to become a more reliable reproduction. You run it with the number of threads as an argument and might want to bump up memory e.g. java -Xmx512M MiniStress 50 I ran the program once to the end without error and then reran on the same db and got the error. I haven't yet updated up to get the fix. I ran it on my dual core Windows XP box with Sun jdk 1.6. Maybe on a box with more CPU's it will pop the bug more reliably. Here is the trace from the derby.log ------------ BEGIN SHUTDOWN ERROR STACK ------------- ERROR XSDG1: Page Page(1039,Container(0, 5856)) could not be written to disk, please check if disk is full. at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:296) at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:824) at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:605) at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(ConcurrentCache.java:551) at org.apache.derby.impl.services.cache.ClockPolicy.rotateClock(ClockPolicy.java:476) at org.apache.derby.impl.services.cache.ClockPolicy.insertEntry(ClockPolicy.java:176) at org.apache.derby.impl.services.cache.ConcurrentCache.insertIntoFreeSlot(ConcurrentCache.java:208) at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:284) at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2412) at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2462) at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319) at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833) at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820) at 
org.apache.derby.impl.store.access.btree.BTreePostCommit.doShrink(BTreePostCommit.java:123)
	at org.apache.derby.impl.store.access.btree.BTreePostCommit.performWork(BTreePostCommit.java:249)
	at org.apache.derby.impl.store.raw.xact.Xact.postTermination(Xact.java:2045)
	at org.apache.derby.impl.store.raw.xact.Xact.completeCommit(Xact.java:818)
	at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:854)
	at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:649)
	at org.apache.derby.impl.store.access.RAMTransaction.commit(RAMTransaction.java:1964)
	at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.doCommit(GenericLanguageConnectionContext.java:1211)
	at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.userCommit(GenericLanguageConnectionContext.java:1031)
	at org.apache.derby.impl.jdbc.TransactionResourceImpl.commit(TransactionResourceImpl.java:237)
	at org.apache.derby.impl.jdbc.EmbedConnection.commit(EmbedConnection.java:1661)
	at WorkerThread.run(WorkerThread.java:24)
Caused by: java.nio.channels.ClosedChannelException
	at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:91)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:642)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:397)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291)
	at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:782)
	... 23 more
============= begin nested exception, level (1) ===========
java.nio.channels.ClosedChannelException
	at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:91)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:642)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.writeFull(RAFContainer4.java:397)
	at org.apache.derby.impl.store.raw.data.RAFContainer4.writePage(RAFContainer4.java:291)
	at org.apache.derby.impl.store.raw.data.CachedPage.writePage(CachedPage.java:782)
	at org.apache.derby.impl.store.raw.data.CachedPage.clean(CachedPage.java:605)
	at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(ConcurrentCache.java:551)
	at org.apache.derby.impl.services.cache.ClockPolicy.rotateClock(ClockPolicy.java:476)
	at org.apache.derby.impl.services.cache.ClockPolicy.insertEntry(ClockPolicy.java:176)
	at org.apache.derby.impl.services.cache.ConcurrentCache.insertIntoFreeSlot(ConcurrentCache.java:208)
	at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:284)
	at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2412)
	at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2462)
	at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)
	at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)
	at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)
	at org.apache.derby.impl.store.access.btree.BTreePostCommit.doShrink(BTreePostCommit.java:123)
	at org.apache.derby.impl.store.access.btree.BTreePostCommit.performWork(BTreePostCommit.java:249)
	at org.apache.derby.impl.store.raw.xact.Xact.postTermination(Xact.java:2045)
	at org.apache.derby.impl.store.raw.xact.Xact.completeCommit(Xact.java:818)
	at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:854)
	at org.apache.derby.impl.store.raw.xact.Xact.commit(Xact.java:649)
	at org.apache.derby.impl.store.access.RAMTransaction.commit(RAMTransaction.java:1964)
	at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.doCommit(GenericLanguageConnectionContext.java:1211)
	at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.userCommit(GenericLanguageConnectionContext.java:1031)
	at org.apache.derby.impl.jdbc.TransactionResourceImpl.commit(TransactionResourceImpl.java:237)
	at org.apache.derby.impl.jdbc.EmbedConnection.commit(EmbedConnection.java:1661)
	at WorkerThread.run(WorkerThread.java:24)
============= end nested exception, level (1) ===========
------------ END SHUTDOWN ERROR STACK -------------
          Mike Matrigali added a comment -

          The most likely way this bug will show up is through concurrent writes to page 0. But the bug may not be recognized for a long time, as it only corrupts the
          on-disk page, while the copy of the page in the cache is likely fine (I am not sure of this, but it seems possible). This bug in particular affects page 0,
          which happens to be accessed a lot and is therefore in the cache a lot. So the best way to catch it is, as Kathey observed, to run your concurrent test,
          shut down completely, start up again, and check. The easiest way to check is to run the consistency checker against any tables you are testing against.
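
          For reference, Derby's consistency checker can be invoked through the SYSCS_UTIL.SYSCS_CHECK_TABLE system function. A sketch of the check described above (the query over the system catalogs is an illustration, not part of this issue's attachments) might look like:

          ```sql
          -- Check every user table (and its indexes) after a clean restart.
          -- SYSCS_CHECK_TABLE returns 1 if the table is consistent and raises
          -- an error (such as XSDB3 here) if corruption is found.
          SELECT schemaname, tablename,
                 SYSCS_UTIL.SYSCS_CHECK_TABLE(schemaname, tablename)
          FROM sys.sysschemas s, sys.systables t
          WHERE s.schemaid = t.schemaid
            AND t.tabletype = 'T';
          ```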

          Kathey Marsden added a comment -

          Thanks Mike for the info. I updated the program to shutdown the db when done and check the tables. I can reproduce corruption in page 0 pretty reliably with 100 threads. I am going to update my client now and see if the problem goes away with the fix.

          Kathey Marsden added a comment -

          I ran the program more than a dozen times without error with the fix, after getting corruption 3/3 times without the fix. So the fix looks good from this case's perspective. Attached is the updated program, MiniStress.zip, which does the shutdown and table check. Maybe someone who runs stress tests could incorporate something similar.

          Knut Anders Hatlen added a comment -

          Thanks for taking the time to test the fix, Kathey!
          I haven't been able to reproduce the problem myself, but I'm marking the issue as resolved based on your testing. If Bogdan's testing shows that it's not fixed, we can reopen it.

          Kathey Marsden added a comment -

          Knut, do you plan to backport this fix to 10.3?

          Knut Anders Hatlen added a comment -

          Merged the fixes to the 10.3 branch and committed revision 648210.

          Mike Matrigali added a comment -

          Trying to figure out what to tell 10.3 users who may have hit this. Does anyone have a db that they can boot that has a table that gets this error on access? Can you drop the index or table if it gets this error? If you get this error during recovery you are going to have to go to backup, but I am hoping that if the error is on an index we can figure out some way to just drop and recreate the index.

          How likely is this error to have caused more than page 0 to be corrupted? Does the unintended i/o interaction have a chance to push a write that was supposed to start at offset 0 to somewhere later, thus causing it to overwrite page 1?

          Knut Anders Hatlen added a comment -

          I'm not sure (since I haven't been able to reproduce the problem myself) but I don't think you'll be able to drop the table/index. I think DROP TABLE will open the container it's about to drop, and then it'll also read the container information stored in page 0.

          I also think it is possible that the problem with mixing FileChannel operations and RandomAccessFile.seek() can cause corruption on any page in the container, although based on the reports, it seems like page 0 is the most likely one to be hit.

          I'm attaching the program I used to track down the problem (no Derby code, only pure file operations), FileOffset.java. It runs 10 threads that write at random positions in a file using a FileChannel, while the main thread runs a loop with RAF.seek(0L) + sleep() + RAF.getFilePointer(). On Solaris and Linux, getFilePointer() always returns 0, but on Windows it frequently returns other values pointing at random positions in the file (only tested with Sun's JVMs). So it is quite possible that a write occurs at the wrong position and corrupts another page than page 0.

          The other problem (concurrent access to page 0 by container cache and page cache) can only hit page 0, I believe.
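
          The experiment described above can be reconstructed roughly as follows. This is a sketch based on the description, not the actual attached FileOffset.java; the class name, method names, and parameters are invented. Writer threads issue positional writes through the FileChannel obtained from a shared RandomAccessFile, while the main thread repeatedly checks whether seek(0L) on that RandomAccessFile "sticks":

          ```java
          import java.io.File;
          import java.io.RandomAccessFile;
          import java.nio.ByteBuffer;
          import java.nio.channels.FileChannel;
          import java.util.Random;
          import java.util.concurrent.atomic.AtomicBoolean;

          public class FileOffsetSketch {
              static final int FILE_SIZE = 1024 * 1024;

              // Runs the experiment for durationMillis and returns how many times
              // getFilePointer() disagreed with the preceding seek(0L).
              static int run(long durationMillis) throws Exception {
                  File f = File.createTempFile("offset", ".dat");
                  f.deleteOnExit();
                  final RandomAccessFile raf = new RandomAccessFile(f, "rw");
                  raf.setLength(FILE_SIZE);
                  final FileChannel ch = raf.getChannel();
                  final AtomicBoolean stop = new AtomicBoolean(false);

                  // 10 threads writing at random positions via the FileChannel.
                  Thread[] writers = new Thread[10];
                  for (int i = 0; i < writers.length; i++) {
                      writers[i] = new Thread(() -> {
                          Random rnd = new Random();
                          ByteBuffer buf = ByteBuffer.allocate(128);
                          try {
                              while (!stop.get()) {
                                  buf.rewind();
                                  // Positional write; per the FileChannel contract this
                                  // should not move the RandomAccessFile's pointer.
                                  ch.write(buf, rnd.nextInt(FILE_SIZE - 128));
                              }
                          } catch (Exception e) {
                              // channel closed during shutdown; ignore
                          }
                      });
                      writers[i].start();
                  }

                  // Main thread: seek(0L) + sleep() + getFilePointer().
                  int mismatches = 0;
                  long end = System.currentTimeMillis() + durationMillis;
                  while (System.currentTimeMillis() < end) {
                      raf.seek(0L);
                      Thread.sleep(10);
                      if (raf.getFilePointer() != 0L) {
                          mismatches++;
                      }
                  }
                  stop.set(true);
                  for (Thread t : writers) {
                      t.join();
                  }
                  raf.close();
                  return mismatches;
              }

              public static void main(String[] args) throws Exception {
                  System.out.println("mismatches: " + run(2000));
              }
          }
          ```

          Per the observations above, the mismatch count would stay at 0 on Solaris and Linux but frequently be non-zero on Windows with Sun's JVMs.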

          Knut Anders Hatlen added a comment -

          Attaching a release note.

          Bryan Pendleton added a comment -

          Is it true that this problem only affected Windows platforms? If so, would this be worth including in the note?

          Knut Anders Hatlen added a comment -

          It is true that parts of the problem didn't seem to happen on Solaris and Linux, whereas they did on Windows (see the attached FileOffset.java). But I only tried it on a limited set of JVMs, so I can't say definitively that it's not a problem on any JVM on any OS other than Windows.

          Also, the possibility of overwriting parts of page 0 with old data, due to a lack of synchronization between page writes and container meta-data writes, is not limited to a single platform.

          Bogdan Calmac added a comment -

          I've run our endurance test for about a week using 10.4 svn 648059 and it was solid as a rock.

          Knut Anders Hatlen added a comment -

          Thanks for testing it, Bogdan! I'm glad it solved your problems.


  People

    • Assignee: Knut Anders Hatlen
    • Reporter: Bogdan Calmac