Derby
  1. Derby
  2. DERBY-3811

ArrayIndexOutOfBoundsException in BasePage.getHeaderAtSlot accessing modified Clob with READ_UNCOMMITTED isolation

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.5.1.1
    • Fix Version/s: None
    • Component/s: Store
    • Urgency:
      Normal
    • Issue & fix info:
      High Value Fix, Repro attached
    • Bug behavior facts:
      Seen in production

      Description

      The attached Java program does the following, using two concurrent threads for the last part.
      a) Inserts a 1 MB Clob.
      b) UpdateThread changes the Clob to a 0.5 MB Clob, then sleeps for 5 seconds before it performs a rollback.
      c) SelectThread selects the Clob and calls Clob.length().

      Both threads/connections run with READ_UNCOMMITTED isolation and with autocommit off.

      Running the attached Java program results in the following exception:
      1217602873292 :: Using isolation level READ_UNCOMMITTED (1)
      1217602873341 :: <<UpdateThread>> Starting update.
      1217602873649 :: <<UpdateThread>> Update completed!
      1217602875305 :: <<SelectThread>> Executing select query...
      1217602875328 :: <<SelectThread>> Getting Clob object...
      java.sql.SQLException: Java exception: '-1: java.lang.ArrayIndexOutOfBoundsException'.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(SQLExceptionFactory40.java:95)
      at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Util.java:87)
      at org.apache.derby.impl.jdbc.Util.javaException(Util.java:244)
      at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:403)
      at org.apache.derby.impl.jdbc.EmbedResultSet.noStateChangeException(EmbedResultSet.java:4386)
      at org.apache.derby.impl.jdbc.StoreStreamClob.noStateChangeLOB(StoreStreamClob.java:292)
      at org.apache.derby.impl.jdbc.StoreStreamClob.getCharLength(StoreStreamClob.java:170)
      at org.apache.derby.impl.jdbc.EmbedClob.length(EmbedClob.java:175)
      at ClobStabilityTest$SelectThread.myRun(ClobStabilityTest.java:102)
      at ClobStabilityTest$SelectThread.run(ClobStabilityTest.java:87)
      at java.lang.Thread.run(Thread.java:619)
      Caused by: java.sql.SQLException: Java exception: '-1: java.lang.ArrayIndexOutOfBoundsException'.
      at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(SQLExceptionFactory40.java:119)
      at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(SQLExceptionFactory40.java:70)
      ... 10 more
      Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
      at org.apache.derby.impl.store.raw.data.BasePage.getHeaderAtSlot(BasePage.java:1794)
      at org.apache.derby.impl.store.raw.data.StoredPage.restorePortionLongColumn(StoredPage.java:5816)
      at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(OverflowInputStream.java:151)
      at org.apache.derby.impl.store.raw.data.BufferedByteHolderInputStream.read(BufferedByteHolderInputStream.java:44)
      at java.io.DataInputStream.read(DataInputStream.java:132)
      at org.apache.derby.impl.jdbc.PositionedStoreStream.read(PositionedStoreStream.java:106)
      at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
      at org.apache.derby.iapi.util.UTF8Util.internalSkip(UTF8Util.java:111)
      at org.apache.derby.iapi.util.UTF8Util.skipUntilEOF(UTF8Util.java:59)
      at org.apache.derby.impl.jdbc.StoreStreamClob.getCharLength(StoreStreamClob.java:167)
      ... 4 more
      1217602878848 :: <<UpdateThread>> Rolling back!

      Is the Java program valid, and should this work?

      1. LOBLocksTest.html
        65 kB
        Kristian Waagan
      2. derby-3811-1a-LOBLocksTest.diff
        25 kB
        Kristian Waagan
      3. ClobStabilityTest.java
        5 kB
        Kristian Waagan

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.

          Show
          Knut Anders Hatlen added a comment - [bulk update] Close all resolved issues that haven't been updated for more than one year.
          Hide
          Kathey Marsden added a comment -

          It looks like this was resolved in 10.6 although we do not know exactly which change resolved the issue.

          Show
          Kathey Marsden added a comment - It looks like this was resolved in 10.6 although we do not know exactly which change resolved the issue.
          Hide
          Jonathan Kusel added a comment - - edited

          Ran the ClobStabilityTest against v10.6.1.0 and it did not give any errors:

          1350427570537 :: Using isolation level READ_UNCOMMITTED (1)
          1350427570568 :: <<UpdateThread>> Starting update.
          1350427570600 :: <<UpdateThread>> Update completed!
          1350427572550 :: <<SelectThread>> Executing select query...
          1350427572582 :: <<SelectThread>> Getting Clob object...
          1350427575624 :: <<SelectThread>> Clob length #1: 524288
          1350427575624 :: <<UpdateThread>> Rolling back!
          1350427585625 :: <<SelectThread>> Clob length #2: 524288

          Can confirm though that it fails for v10.5.3.0

          Show
          Jonathan Kusel added a comment - - edited Ran the ClobStabilityTest against v10.6.1.0 and it did not give any errors: 1350427570537 :: Using isolation level READ_UNCOMMITTED (1) 1350427570568 :: <<UpdateThread>> Starting update. 1350427570600 :: <<UpdateThread>> Update completed! 1350427572550 :: <<SelectThread>> Executing select query... 1350427572582 :: <<SelectThread>> Getting Clob object... 1350427575624 :: <<SelectThread>> Clob length #1: 524288 1350427575624 :: <<UpdateThread>> Rolling back! 1350427585625 :: <<SelectThread>> Clob length #2: 524288 Can confirm though that it fails for v10.5.3.0
          Hide
          Kathey Marsden added a comment -

          Running the repro program against trunk, I get no errors:

          /repro/derby-3811 $java ClobStabilityTest
          348688920719 :: Using isolation level READ_UNCOMMITTED (1)
          348688920774 :: <<UpdateThread>> Starting update.
          348688920833 :: <<UpdateThread>> Update completed!
          348688922779 :: <<SelectThread>> Executing select query...
          348688922809 :: <<SelectThread>> Getting Clob object...
          348688925931 :: <<SelectThread>> Clob length #1: 524288
          348688925946 :: <<UpdateThread>> Rolling back!
          348688935931 :: <<SelectThread>> Clob length #2: 524288

          Is this issue now fixed? Any idea what might have fixed it?

          Show
          Kathey Marsden added a comment - Running the repro program against trunk, I get no errors: /repro/derby-3811 $java ClobStabilityTest 348688920719 :: Using isolation level READ_UNCOMMITTED (1) 348688920774 :: <<UpdateThread>> Starting update. 348688920833 :: <<UpdateThread>> Update completed! 348688922779 :: <<SelectThread>> Executing select query... 348688922809 :: <<SelectThread>> Getting Clob object... 348688925931 :: <<SelectThread>> Clob length #1: 524288 348688925946 :: <<UpdateThread>> Rolling back! 348688935931 :: <<SelectThread>> Clob length #2: 524288 Is this issue now fixed? Any idea what might have fixed it?
          Hide
          Kristian Waagan added a comment -

          Attaching a basic LOB locking test as patch 1a.
          I'm also attaching the JavaDoc as HTML for easier reading (reading the overall test description may help).

          I'm seeing failures with when running this test. It would be nice if someone could verify if the test is doing the right things.
          You can expect to see eight or nine errors when running the tests.
          For the record, I think the test exposes more than one problem.

          Test ready for review.

          Show
          Kristian Waagan added a comment - Attaching a basic LOB locking test as patch 1a. I'm also attaching the JavaDoc as HTML for easier reading (reading the overall test description may help). I'm seeing failures with when running this test. It would be nice if someone could verify if the test is doing the right things. You can expect to see eight or nine errors when running the tests. For the record, I think the test exposes more than one problem. Test ready for review.
          Hide
          Kristian Waagan added a comment -

          Thanks for checking, Kathey.

          I was just thinking about studying the corrupted page itself. Being an "ad-hoc" procedure, I don't have any concrete steps for you.
          I guess waiting for another chance to look at a corrupted page is a better option, but if you have some spare time:

          • Is the page all nulls?
          • Is it an overflow page?
          • Does the slot table look correct?
          • Is the checksum(s) correct?
            (I'm thinking Derby should have detected it if it isn't?)
          • Does the various flags look okay?
            (decoding these are cumbersome unless you have a tool...)
          • Are there any other rows on the page, and are these valid)

          I think the first questions are easier to answer, then the answers get harder to obtain as you go down the list...
          Also, are you able to run the consistency checker? Or does it fail with the same error?

          I haven't checked if the original repro corrupts the database, so that's something that should be checked as well (I won't have time to do this now, but maybe later).

          Show
          Kristian Waagan added a comment - Thanks for checking, Kathey. I was just thinking about studying the corrupted page itself. Being an "ad-hoc" procedure, I don't have any concrete steps for you. I guess waiting for another chance to look at a corrupted page is a better option, but if you have some spare time: Is the page all nulls? Is it an overflow page? Does the slot table look correct? Is the checksum(s) correct? (I'm thinking Derby should have detected it if it isn't?) Does the various flags look okay? (decoding these are cumbersome unless you have a tool...) Are there any other rows on the page, and are these valid) I think the first questions are easier to answer, then the answers get harder to obtain as you go down the list... Also, are you able to run the consistency checker? Or does it fail with the same error? I haven't checked if the original repro corrupts the database, so that's something that should be checked as well (I won't have time to do this now, but maybe later).
          Hide
          Kathey Marsden added a comment -

          I am sorry but I cannot post the corrupt database, but I do have access to it. If there are any debugging steps you can walk me through, I would be more than happy to try.

          Show
          Kathey Marsden added a comment - I am sorry but I cannot post the corrupt database, but I do have access to it. If there are any debugging steps you can walk me through, I would be more than happy to try.
          Hide
          Kathey Marsden added a comment -

          I will ask the user if I can post it here, but it may take a while to get a response. In the meanwhile if there is anything I can try with the database please let me know. The table had only 4 rows and one of the rows and the index was corrupt. The corruption was in a clob(10485760) NOT NULL column. I have not yet been able to get much information on what was going on at the time of the corruption, but my understanding is that this table holds somewhat transient data. It usually has only a few rows that change frequently. I am not sure if the changes come only in the form of inserts and deletes or if there are updates as well.

          Show
          Kathey Marsden added a comment - I will ask the user if I can post it here, but it may take a while to get a response. In the meanwhile if there is anything I can try with the database please let me know. The table had only 4 rows and one of the rows and the index was corrupt. The corruption was in a clob(10485760) NOT NULL column. I have not yet been able to get much information on what was going on at the time of the corruption, but my understanding is that this table holds somewhat transient data. It usually has only a few rows that change frequently. I am not sure if the changes come only in the form of inserts and deletes or if there are updates as well.
          Hide
          Kristian Waagan added a comment -

          Kathey,

          Any chance of getting the db with the corrupted column?

          Show
          Kristian Waagan added a comment - Kathey, Any chance of getting the db with the corrupted column?
          Hide
          Kathey Marsden added a comment -

          A user reported an error similar to this selecting from a CLOB column, but in their case the row is actually corrupt. If I select from their table with ij I get:
          java.lang.ArrayIndexOutOfBoundsException: -1

          at org.apache.derby.impl.store.raw.data.BasePage.getHeaderAtSlot(BasePage.java:2013)

          at org.apache.derby.impl.store.raw.data.StoredPage.restorePortionLongColumn(StoredPage.java:5806)

          at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(OverflowInputStream.java:82)

          at org.apache.derby.impl.store.raw.data.OverflowInputStream.<init>(OverflowInputStream.java:64)

          at org.apache.derby.impl.store.raw.data.StoredPage.readRecordFromArray(StoredPage.java:5555)

          at org.apache.derby.impl.store.raw.data.StoredPage.restoreRecordFromSlot(StoredPage.java:1499)

          at org.apache.derby.impl.store.raw.data.BasePage.fetchFromSlot(BasePage.java:498)

          at org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java:768)

          at org.apache.derby.impl.store.access.heap.HeapScan.fetchNextGroup(HeapScan.java:338)

          at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(BulkTableScanResultSet.java:329)

          at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(BulkTableScanResultSet.java:284)

          at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(BasicNoPutResultSetImpl.java:475)

          at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(EmbedResultSet.java:448)

          at org.apache.derby.impl.jdbc.EmbedResultSet.next(EmbedResultSet.java:392)

          at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:382)

          at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:338)

          at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:241)

          at org.apache.derby.tools.JDBCDisplayUtil.DisplayResults(JDBCDisplayUtil.java:229)

          at org.apache.derby.impl.tools.ij.utilMain.displayResult(utilMain.java:456)

          at org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:530)

          at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:371)

          at org.apache.derby.impl.tools.ij.utilMain.go(utilMain.java:269)

          at org.apache.derby.impl.tools.ij.Main.go(Main.java:204)

          at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:170)

          at org.apache.derby.impl.tools.ij.Main14.main(Main14.java:56)

          at org.apache.derby.tools.ij.main(ij.java:71)

          I do not yet have a description of what they were doing that led up to the corruption, but apparently they have seen this type of corruption before with this application. Any ideas what could cause it? This is actually occurring with 10.2. Does it look related to this issue or look like something else?

          Show
          Kathey Marsden added a comment - A user reported an error similar to this selecting from a CLOB column, but in their case the row is actually corrupt. If I select from their table with ij I get: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.derby.impl.store.raw.data.BasePage.getHeaderAtSlot(BasePage.java:2013) at org.apache.derby.impl.store.raw.data.StoredPage.restorePortionLongColumn(StoredPage.java:5806) at org.apache.derby.impl.store.raw.data.OverflowInputStream.fillByteHolder(OverflowInputStream.java:82) at org.apache.derby.impl.store.raw.data.OverflowInputStream.<init>(OverflowInputStream.java:64) at org.apache.derby.impl.store.raw.data.StoredPage.readRecordFromArray(StoredPage.java:5555) at org.apache.derby.impl.store.raw.data.StoredPage.restoreRecordFromSlot(StoredPage.java:1499) at org.apache.derby.impl.store.raw.data.BasePage.fetchFromSlot(BasePage.java:498) at org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java:768) at org.apache.derby.impl.store.access.heap.HeapScan.fetchNextGroup(HeapScan.java:338) at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(BulkTableScanResultSet.java:329) at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(BulkTableScanResultSet.java:284) at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(BasicNoPutResultSetImpl.java:475) at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(EmbedResultSet.java:448) at org.apache.derby.impl.jdbc.EmbedResultSet.next(EmbedResultSet.java:392) at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:382) at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:338) at org.apache.derby.tools.JDBCDisplayUtil.indent_DisplayResults(JDBCDisplayUtil.java:241) at org.apache.derby.tools.JDBCDisplayUtil.DisplayResults(JDBCDisplayUtil.java:229) at org.apache.derby.impl.tools.ij.utilMain.displayResult(utilMain.java:456) at org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:530) at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:371) at org.apache.derby.impl.tools.ij.utilMain.go(utilMain.java:269) at org.apache.derby.impl.tools.ij.Main.go(Main.java:204) at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:170) at org.apache.derby.impl.tools.ij.Main14.main(Main14.java:56) at org.apache.derby.tools.ij.main(ij.java:71) I do not yet have a description of what they were doing that led up to the corruption, but apparently they have seen this type of corruption before with this application. Any ideas what could cause it? This is actually occurring with 10.2. Does it look related to this issue or look like something else?
          Hide
          Knut Anders Hatlen added a comment -

          Triaged for 10.5.2.

          Show
          Knut Anders Hatlen added a comment - Triaged for 10.5.2.
          Show
          Knut Anders Hatlen added a comment - A similar problem was reported on derby-user: http://mail-archives.apache.org/mod_mbox/db-derby-user/200905.mbox/%3C23587108.post@talk.nabble.com%3E
          Hide
          Kristian Waagan added a comment -

          Attaching repro. The system property "isolationLevel" can be used to specify the isolation level to be used (1 - 4).

          Show
          Kristian Waagan added a comment - Attaching repro. The system property "isolationLevel" can be used to specify the isolation level to be used (1 - 4).

            People

            • Assignee:
              Unassigned
              Reporter:
              Kristian Waagan
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development