Derby / DERBY-5650

ERROR XSDBB: Unknown page format at page Page(1936,Container(0, 1009)), page dump follows: Hex dump: 00000000:

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 10.6.1.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Urgency:
      Urgent
    • Bug behavior facts:
      Data corruption, Seen in production

      Description

      I am trying to insert values into the database table EVENTS_TABLE, which has 18 columns; EVENT_ID is unique but can be null. The index for this table is EVENT_INDEX, which points to EVENT_ID internally. At some point during insertion into this table, I get the error "The statement was aborted because it would have caused a duplicate key value in a unique or primary key constraint or unique index identified by 'EVENT_INDEX' defined on 'EVENTS_TABLE'." in my logs, and in the Derby logs I get the error "ERROR XSDBB: Unknown page format at page Page(1936,Container(0, 1009)), page dump follows: Hex dump: 00000000:".

      Please let me know why this is happening and what is the meaning of this error.

      1. derby.log
        379 kB
        Deepika Kochhar

        Activity

        Rick Hillegas added a comment -

        Hi Deepika,

        The XSDBB indicates a database corruption. Can you attach the derby.log? Thanks.

        Deepika Kochhar added a comment -

        Requested Derby log

        Deepika Kochhar added a comment -

        Hi Rick,

        Yes, this indicates database corruption, but can you tell me under what circumstances this might have occurred?

        Rick Hillegas added a comment -

        Thanks for attaching derby.log, Deepika. The corruption first appeared with an INSERT INTO EVENTS_TABLE statement, as you note above. Before that statement, all I see in derby.log are boot and shutdown diagnostics. It appears that the installation was using Derby 10.2.2.0 at first on 2011-07-21. About 8 hours later the installation was brought down and rebooted with Derby 10.6.1.0. The installation was rebooted a couple times and ran fine with 10.6.1.0 until the first error occurred on 2011-11-03.

        After that, the corruption choked a database backup on 2011-11-16. The INSERT statement failed again on 2011-11-17, 2011-11-22, 2011-11-23, 2012-02-06, and 2012-02-21. The following corrupted pages (all in the same conglomerate) appear in derby.log: 2953, 1936, 2914, 2125. I do not see any evidence that any other conglomerates are corrupted. It is possible that you may be able to repair the database (at least to the point that you can dump and reload it) by dropping and recreating the indexes on EVENTS_TABLE.

        The initial corruption occurred several months ago. This long afterward, we generally don't have much success in figuring out what caused the original problem. Did anything special happen on 2011-11-03 that you can remember?

        Thanks,
        -Rick
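
The list of corrupted pages above can be pulled out of derby.log mechanically. The following is a minimal sketch, not part of Derby: the class name `CorruptPageScan` and the "container:page" output format are hypothetical, and the pattern assumes the page reference is printed exactly as in the XSDBB message quoted in this issue.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CorruptPageScan {
    // Matches a page reference as printed in the XSDBB message,
    // e.g. "Page(1936,Container(0, 1009))".
    private static final Pattern PAGE_REF =
        Pattern.compile("Page\\((\\d+),\\s*Container\\((\\d+),\\s*(\\d+)\\)\\)");

    /** Returns distinct "container:page" keys from XSDBB lines, in first-seen order. */
    public static List<String> corruptPages(List<String> logLines) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : logLines) {
            if (!line.contains("XSDBB")) {
                continue; // only look at "Unknown page format" errors
            }
            Matcher m = PAGE_REF.matcher(line);
            while (m.find()) {
                // group(3) is the container number, group(1) the page number
                seen.add(m.group(3) + ":" + m.group(1));
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to derby.log (assumed readable as UTF-8)
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        corruptPages(lines).forEach(System.out::println);
    }
}
```

If every key shares the same container number, that is consistent with the corruption being confined to a single conglomerate.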

        Mike Matrigali added a comment -

        I agree with Rick's assessment. Some of the stacks seem to have btree in them, so you may be lucky and dropping and recreating the indexes may work. To be safe, I would make a copy of the db to try recovering on. Running offline compress on EVENTS_TABLE would probably be the easiest first step. Then I would suggest running a consistency check across the whole database.

        As Rick says, figuring out what happened 7 months ago is hard.

        In my quick look at the log, it looks like all the corrupted pages (at least those found so far) are all zeros, which is definitely not a legal page in Derby. Any chance the file system you were running this database on at that time was unusual in some way? Maybe with a hardware crash or bad disk? Derby depends on being able to sync to disk at certain times, and will not run properly if this sync-to-disk feature is disabled at the hardware or filesystem level. The all-zeros pattern has the feel of an improper file allocation where the actual data never made it to disk.
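
The all-zeros observation can be checked directly against a copy of the container file while the database is shut down. The sketch below is illustrative only: the class `ZeroPageScan` is hypothetical, and it assumes an unencrypted container, a 4096-byte page size unless another is given, and that page N starts at byte offset N times the page size. A zero page is only suspicious if Derby treats that page as allocated.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZeroPageScan {
    /** True if every byte of the page is zero. */
    public static boolean isAllZero(byte[] page) {
        for (byte b : page) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /** Returns the page numbers whose bytes are all zero; a trailing partial page is ignored. */
    public static List<Long> zeroPages(byte[] data, int pageSize) {
        List<Long> result = new ArrayList<>();
        int pages = data.length / pageSize;
        for (int p = 0; p < pages; p++) {
            byte[] page = Arrays.copyOfRange(data, p * pageSize, (p + 1) * pageSize);
            if (isAllZero(page)) {
                result.add((long) p);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a COPY of the container file; args[1]: optional page size.
        // Reading the whole file into memory only suits containers well under 2 GB.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int pageSize = args.length > 1 ? Integer.parseInt(args[1]) : 4096; // assumed default
        System.out.println("All-zero pages: " + zeroPages(data, pageSize));
    }
}
```

Comparing the reported page numbers with those in derby.log (2953, 1936, 2914, 2125) would show whether other zeroed pages exist that no statement has touched yet.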

        There is at least one outstanding report of a corruption due to running compress table concurrently. Do you run either of the compress table system procedures?

        I would also recommend running on a later release to make sure you have all available fixes. The latest release is 10.8.2.2. If for some reason you can't move off of 10.6, at least move to the latest 10.6 available, which is 10.6.2.1. Derby puts a lot of testing into backward compatibility, and most users are able to upgrade to the latest software with no problems.
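
The two recovery steps suggested above (offline compress of EVENTS_TABLE, then a consistency check of the whole database) might look roughly like this over JDBC. This is a sketch under stated assumptions: the class `RecoverEventsTable` is hypothetical, the schema name `APP` is assumed because the issue does not give one, and the JDBC URL passed on the command line should point at a copy of the database. The offline compress should also rebuild the table's indexes.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RecoverEventsTable {
    // Offline compress; the third argument is the SMALLINT "sequential" flag.
    public static final String COMPRESS_SQL =
        "CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)";

    // Runs the consistency checker over every user and system table.
    public static final String CHECK_SQL =
        "SELECT s.SCHEMANAME, t.TABLENAME, "
        + "SYSCS_UTIL.SYSCS_CHECK_TABLE(s.SCHEMANAME, t.TABLENAME) "
        + "FROM SYS.SYSSCHEMAS s, SYS.SYSTABLES t "
        + "WHERE s.SCHEMAID = t.SCHEMAID AND t.TABLETYPE IN ('T', 'S')";

    public static void main(String[] args) throws SQLException {
        // args[0]: JDBC URL of a COPY of the database (the Derby driver must be on the classpath).
        try (Connection conn = DriverManager.getConnection(args[0])) {
            try (CallableStatement cs = conn.prepareCall(COMPRESS_SQL)) {
                cs.setString(1, "APP");          // assumed schema name
                cs.setString(2, "EVENTS_TABLE"); // table named in this issue
                cs.setShort(3, (short) 1);       // non-zero: compress sequentially
                cs.execute();
            }
            // SYSCS_CHECK_TABLE returns 1 for a consistent table and raises an
            // SQLException when it finds an inconsistency.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(CHECK_SQL)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "." + rs.getString(2)
                        + " -> " + rs.getInt(3));
                }
            }
        }
    }
}
```

If the compress itself fails on a corrupted page, dropping and recreating the index by hand is the remaining option mentioned in this thread.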

        Kathey Marsden added a comment -

        I noticed that for the 10.6 version the build is "10.6.1.0 - (exported)" rather than a specific build number. Did you build 10.6 yourself? Are there any changes to the Derby code in this build?

        Rick Hillegas added a comment -

        Hi Deepika,

        Were you able to repair your indexes? Has the problem happened again? Are you, as Kathey wondered, using a custom-built version of Derby? Without more information it will be hard to make progress on this issue. Thanks.

        Rick Hillegas added a comment -

        Hi Deepika,

        Gentle nudge: Can you give us any more feedback on the questions which Kathey and I posed in March and April? Thanks.

        Rick Hillegas added a comment -

        Without more information, we can't make progress on this issue. This issue can be re-opened if more information becomes available.


          People

          • Assignee:
            Unassigned
            Reporter:
            Deepika Kochhar
