Derby
  1. Derby
  2. DERBY-5045

Assert failures in UpdateStatisticsTest

    Details

    • Urgency:
      Normal
    • Bug behavior facts:
      Regression Test Failure

      Description

      I saw these failures when running suites.All against revision 1070949 + the 1b patch attached to DERBY-642:

      junit.framework.AssertionFailedError: Index statistics for <ALL TABLES>
      1:

      {tableId=94a458d3-012e-2a1f-d23d-ffff93077fa3, tableName=<n/a>, indexName=<n/a>, lcols=1, rows=160, unique/card=160, created=2011-02-15 17:31:00.951}

      expected:<0> but was:<1>
      at org.apache.derbyTesting.junit.IndexStatsUtil.assertStats(IndexStatsUtil.java:132)
      at org.apache.derbyTesting.junit.IndexStatsUtil.assertNoStats(IndexStatsUtil.java:109)
      at org.apache.derbyTesting.functionTests.tests.lang.UpdateStatisticsTest.testUpdateStatistics(UpdateStatisticsTest.java:91)
      at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:112)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)

      junit.framework.AssertionFailedError: Index statistics for <ALL TABLES>
      1:

      {tableId=94a458d3-012e-2a1f-d23d-ffff93077fa3, tableName=<n/a>, indexName=<n/a>, lcols=1, rows=160, unique/card=160, created=2011-02-15 17:31:00.951}

      expected:<0> but was:<1>
      at org.apache.derbyTesting.junit.IndexStatsUtil.assertStats(IndexStatsUtil.java:132)
      at org.apache.derbyTesting.junit.IndexStatsUtil.assertNoStats(IndexStatsUtil.java:109)
      at org.apache.derbyTesting.functionTests.tests.lang.UpdateStatisticsTest.testUpdateStatistics(UpdateStatisticsTest.java:91)
      at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:112)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
      at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
      at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
      at junit.extensions.TestSetup.run(TestSetup.java:27)
      at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)

      1. derby-5045-1a.diff
        1 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Rick Hillegas added a comment -

          I have not been able to reproduce this error on my Mac. I have set up a codeline which is revision 1070949 + the 1b patch attached to DERBY-642 and built both sane and insane jars. I have run UpdateStatisticsTest standalone 20 times against each set of jars. I have also run the lang _Suite against both sets of jars. Finally, I have run the full JUnit suite against both sets of jars. But I have been able to reproduce this problem.

          Show
          Rick Hillegas added a comment - I have not been able to reproduce this error on my Mac. I have set up a codeline which is revision 1070949 + the 1b patch attached to DERBY-642 and built both sane and insane jars. I have run UpdateStatisticsTest standalone 20 times against each set of jars. I have also run the lang _Suite against both sets of jars. Finally, I have run the full JUnit suite against both sets of jars. But I have been able to reproduce this problem.
          Hide
          Kristian Waagan added a comment -

          Happened again in the nightly run for 2011-02-18 18:00:25 MET:
          http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/vista/1072063-suitesAll_diff.txt
          (overview of all runs this day at http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/Limited/testSummary-1072063.html)
          Note that the patch for DERBY-642 has been committed now.
          Note also that the automatic statistics update feature is supposed to be disabled for this test.

          I don't know why there is no table/index name in the stats debug printout. But maybe someone knows which table has 160 rows [at some point in time]

          The test is checking for statistics for all tables, one step forward would be to figure out if the stats are actually for a table in the failing test, or if it is a left-over from another test.
          If the stats belong to an unrelated table, we could use a clean database for the UpdateStatisticsTest, or query for the relevant tables only. The failure seems to be timing sensitive.

          Show
          Kristian Waagan added a comment - Happened again in the nightly run for 2011-02-18 18:00:25 MET: http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/vista/1072063-suitesAll_diff.txt (overview of all runs this day at http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/Limited/testSummary-1072063.html ) Note that the patch for DERBY-642 has been committed now. Note also that the automatic statistics update feature is supposed to be disabled for this test. I don't know why there is no table/index name in the stats debug printout. But maybe someone knows which table has 160 rows [at some point in time] The test is checking for statistics for all tables, one step forward would be to figure out if the stats are actually for a table in the failing test, or if it is a left-over from another test. If the stats belong to an unrelated table, we could use a clean database for the UpdateStatisticsTest, or query for the relevant tables only. The failure seems to be timing sensitive.
          Hide
          Rick Hillegas added a comment -

          In a side conversation with Knut (which I am now bringing back to the community) it seemed that this test failure is evidence for a pre-existing bug in the cleanup after DROP TABLE. When a table is dropped, its dependent objects (like rows in SYSSTATISTICS) should be dropped too. The relationships between catalogs are not enforced by foreign keys. That is because the DataDictionary lives beneath the SQL layer. It is expected that the dependencies between catalogs will be handled by the DataDictionary itself. It seems that there may be a bug in the logic which keeps SYSTABLES and SYSSTATISTICS in sync. The UpdateStatisticsTest appears to trigger this bug occasionally.

          Show
          Rick Hillegas added a comment - In a side conversation with Knut (which I am now bringing back to the community) it seemed that this test failure is evidence for a pre-existing bug in the cleanup after DROP TABLE. When a table is dropped, its dependent objects (like rows in SYSSTATISTICS) should be dropped too. The relationships between catalogs are not enforced by foreign keys. That is because the DataDictionary lives beneath the SQL layer. It is expected that the dependencies between catalogs will be handled by the DataDictionary itself. It seems that there may be a bug in the logic which keeps SYSTABLES and SYSSTATISTICS in sync. The UpdateStatisticsTest appears to trigger this bug occasionally.
          Hide
          Knut Anders Hatlen added a comment -

          I performed an experiment to see if we could get these orphaned entries in SYSSTATISTICS if a table was dropped just as istat was about to write the statistics. The experiment is described in DERBY-5124, and resulted in a NullPointerException in the istat thread. However, if I comment out the log statement that causes the NPE and repeat the experiment, I do end up with a row in SYSSTATISTICTS describing an index that no longer exists. I don't see how this could happen without triggering the NPE, though. And the NPE causes the update in SYSSTATISTICS to be rolled back, so then we shouldn't see this problem.

          Show
          Knut Anders Hatlen added a comment - I performed an experiment to see if we could get these orphaned entries in SYSSTATISTICS if a table was dropped just as istat was about to write the statistics. The experiment is described in DERBY-5124 , and resulted in a NullPointerException in the istat thread. However, if I comment out the log statement that causes the NPE and repeat the experiment, I do end up with a row in SYSSTATISTICTS describing an index that no longer exists. I don't see how this could happen without triggering the NPE, though. And the NPE causes the update in SYSSTATISTICS to be rolled back, so then we shouldn't see this problem.
          Hide
          Knut Anders Hatlen added a comment -

          The log statement that caused the NPE was added in DERBY-5089, and I cannot find any occurrence of this test failure after DERBY-5089 was resolved. So it sounds likely that this issue is indeed caused by istat writing to SYSSTATISTICS after the index has been dropped, and thereby leaving an orphaned row. But right now the issue is hidden by the NPE in DERBY-5124 (which is only printed, so it won't cause any test failures).

          Show
          Knut Anders Hatlen added a comment - The log statement that caused the NPE was added in DERBY-5089 , and I cannot find any occurrence of this test failure after DERBY-5089 was resolved. So it sounds likely that this issue is indeed caused by istat writing to SYSSTATISTICS after the index has been dropped, and thereby leaving an orphaned row. But right now the issue is hidden by the NPE in DERBY-5124 (which is only printed, so it won't cause any test failures).
          Hide
          Knut Anders Hatlen added a comment -

          I believe that the attached patch fixes the problem. Since the istat thread doesn't take any read locks, not even intention locks on the table, the index may be dropped by someone else until we start writing the stats to SYSSTATISTICS. Once we start writing to SYSSTATISTICS, the istat thread will have exclusive locks that prevent the table and its indexes from being dropped.

          The patch checks if the index still exists after it has written the new statistics to SYSSTATISTICS, but before it has committed the transaction. If the index doesn't exist, it aborts the transaction instead of committing it. When I run the DERBY-5124 experiment with this patch, the orphaned rows don't turn up in SYSSTATISTICS.

          Show
          Knut Anders Hatlen added a comment - I believe that the attached patch fixes the problem. Since the istat thread doesn't take any read locks, not even intention locks on the table, the index may be dropped by someone else until we start writing the stats to SYSSTATISTICS. Once we start writing to SYSSTATISTICS, the istat thread will have exclusive locks that prevent the table and its indexes from being dropped. The patch checks if the index still exists after it has written the new statistics to SYSSTATISTICS, but before it has committed the transaction. If the index doesn't exist, it aborts the transaction instead of committing it. When I run the DERBY-5124 experiment with this patch, the orphaned rows don't turn up in SYSSTATISTICS.
          Hide
          Knut Anders Hatlen added a comment -

          Committed revision 1080947.

          This should hopefully make the test failure go away. Resolving the issue on that assumption. Please reopen if the failure is still seen.

          Show
          Knut Anders Hatlen added a comment - Committed revision 1080947. This should hopefully make the test failure go away. Resolving the issue on that assumption. Please reopen if the failure is still seen.

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development