Derby / DERBY-4239

Possible corruption if SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE is called during checkpoint

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • 10.1.3.3, 10.2.2.1, 10.3.2.1, 10.4.2.0, 10.5.1.1, 10.6.1.0
    • Store
    • None
    • Data corruption, Regression Test Failure

    Description

      Corruption with the storerecovery oc_rec? tests: "ERROR XSLA7: Cannot redo operation null in the log" is raised when a compress occurs during a checkpoint and the JVM then exits.
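
For readers unfamiliar with the two operations involved, here is a minimal JDBC sketch of how they are invoked. It is not the attached reproduction. The procedure names and argument lists are the documented Derby system procedures; the class name, the helper names, and the choice to enable all three compress options are assumptions made for illustration.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper class; only the two CALL statements come from Derby itself.
final class CompressAndCheckpoint {
    static final String COMPRESS_SQL =
        "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE(?, ?, ?, ?, ?)";
    static final String CHECKPOINT_SQL =
        "CALL SYSCS_UTIL.SYSCS_CHECKPOINT_DATABASE()";

    // Runs an in-place compress on schema.table with all three phases enabled.
    static void inplaceCompress(Connection conn, String schema, String table)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(COMPRESS_SQL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            cs.setShort(3, (short) 1); // PURGE_ROWS: non-zero enables the phase
            cs.setShort(4, (short) 1); // DEFRAGMENT_ROWS
            cs.setShort(5, (short) 1); // TRUNCATE_END
            cs.execute();
        }
    }

    // Requests an explicit checkpoint of the database behind conn.
    static void checkpoint(Connection conn) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(CHECKPOINT_SQL)) {
            cs.execute();
        }
    }
}
```

To overlap the two operations, a test would presumably call these helpers on separate connections from separate threads; whether that alone is enough to hit the window described in this issue is not established here.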

      I saw corruption on z/OS with the storerecovery tests and 10.5.1.1. The failure surfaces in oc_rec3 when it tries to connect to the database, but the actual problem seems to have occurred in the prior test, oc_rec2. The problem is intermittent, happening in roughly one run out of four. I extracted the case from the harness and will attach the reproduction, which is run with the script repro.ksh. The script loops up to 50 times until it gets the failure, which looks like this:

      ERROR XSLA7: Cannot redo operation null in the log.
      at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
      at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
      at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
      at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
      at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
      at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
      at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
      at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
      at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
      at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
      at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
      at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
      at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
      at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
      at java.sql.DriverManager.getConnection(DriverManager.java:311)
      at java.sql.DriverManager.getConnection(DriverManager.java:268)
      at CheckTables.main(CheckTables.java:8)
      Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
      00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
      00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
      <snip lots of 000's>
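
The loop-until-failure idea behind repro.ksh can also be sketched in Java. This is an illustration only, not the attached script: the class and method names are hypothetical, and it assumes the failure can be recognized by finding SQLState XSLA7 anywhere in the exception chain (both the "Caused by" links and the chained next exceptions).

```java
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical harness; names are illustrative and not part of Derby.
final class ReproLoop {
    // One iteration of the reproduction, e.g. booting the database.
    interface Attempt {
        void run() throws SQLException;
    }

    // True if any exception reachable via getCause() or getNextException()
    // is a SQLException carrying the given SQLState.
    static boolean hasSqlState(Throwable root, String sqlState) {
        if (root == null) return false;
        Deque<Throwable> pending = new ArrayDeque<>();
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        pending.push(root);
        while (!pending.isEmpty()) {
            Throwable cur = pending.pop();
            if (!seen.add(cur)) continue; // guard against cyclic chains
            if (cur instanceof SQLException) {
                SQLException se = (SQLException) cur;
                if (sqlState.equals(se.getSQLState())) return true;
                if (se.getNextException() != null) pending.push(se.getNextException());
            }
            if (cur.getCause() != null) pending.push(cur.getCause());
        }
        return false;
    }

    // Returns the 1-based iteration in which XSLA7 was seen, or -1 if every
    // iteration passed. Any other SQLException aborts the loop.
    static int runUntilFailure(Attempt attempt, int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            try {
                attempt.run();
            } catch (SQLException e) {
                if (hasSqlState(e, "XSLA7")) return i;
                throw new IllegalStateException("unrelated failure in iteration " + i, e);
            }
        }
        return -1;
    }
}
```

A caller might pass an attempt that opens and closes an embedded connection, for example with `DriverManager.getConnection("jdbc:derby:wombat")`, and a limit of 50; the database name here is only guessed from the attachment names.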

      I ran it with 10.3 and it completed all 50 iterations, so whether this is a JVM issue or a Derby issue, it seems to be new since 10.3 (I haven't tried 10.4). Oddly, I have run the tests many times before on this machine with the 10.5.1.1 release and the same JVM and have never seen this failure, so I am looking into whether something changed on the machine or in the environment.

      Attachments

        1. DERBY-4239_3.diff
          13 kB
          Mike Matrigali
        2. DERBY-4239_2.diff
          13 kB
          Mike Matrigali
        3. derby-4239_1.diff
          13 kB
          Mike Matrigali
        4. derby_dumponly.zip
          2.14 MB
          Mike Matrigali
        5. reproBackgroundCheckpoint.zip
          15 kB
          Katherine Marsden
        6. identifyBadContainer.ksh
          0.9 kB
          Katherine Marsden
        7. goodlogsizes.txt
          2 kB
          Katherine Marsden
        8. badlogsizes.txt
          3 kB
          Katherine Marsden
        9. wombat_keeplog_notcorrupt.zip
          2.35 MB
          Katherine Marsden
        10. derby.log
          2.19 MB
          Mike Matrigali
        11. reproDerby4239.zip
          14 kB
          Katherine Marsden
        12. derby.log
          194 kB
          Katherine Marsden
        13. wombat_with_keeplog.zip
          2.35 MB
          Katherine Marsden

            People

              Assignee: Mike Matrigali (mikem)
              Reporter: Katherine Marsden (kmarsden)