|
I am slightly confused by the 1st line in the description of this Jira entry. It says "There must be an entry in the SYSSTATISTICS table in order for the cardinality statistics in SYSSTATISTICS to be created with SYSCS_UTIL.SYSCS_COMPRESS_TABLE"
Shouldn't it read "There must be an entry in the SYSSTATISTICS table in order for the cardinality statistics in SYSSTATISTICS to be *updated* with SYSCS_UTIL.SYSCS_COMPRESS_TABLE"? The new sentence I am suggesting describes the current behavior of Derby, which is that SYSCS_UTIL.SYSCS_COMPRESS_TABLE can update statistics for indexes that already have rows in the SYSSTATISTICS table. If there is no row for an index in the SYSSTATISTICS table, then no statistics will be generated for that index. With this Jira, we want to change Derby behavior such that if there is no row for an index in the SYSSTATISTICS table, then at the time of SYSCS_UTIL.SYSCS_COMPRESS_TABLE, we should create a row for that index in the SYSSTATISTICS table (provided that there is data in the table on which SYSCS_UTIL.SYSCS_COMPRESS_TABLE is getting run, no?). If a row already exists for an index, then SYSCS_UTIL.SYSCS_COMPRESS_TABLE already does the job of updating the statistics for that index. I am interested in working on this issue and want to be clear that I understand the current behavior and the expected new behavior.
I believe your interpretation is correct. The request is to always "update" statistics when running the compress table command. Internally this may mean updating a row or creating a new row - the difference need not be documented to the user. Since the entire index is getting rebuilt, I can think of no reason not to gather the statistics and record them at this time.
I would like to submit a patch for this Jira entry. It is attached as DERBY737_v1_diff_SYSCS_COMPRESS_TABLE.txt. The changes are localized in AlterTableConstantAction.java!updateIndex(). Currently, this method checks if statistics already exist for an index. If yes, then it sets a flag updateStatistics to true. Later, the code checks for this flag and drops the existing statistics and creates new statistics for that index, provided the user table at this point is not empty. So if an index has no pre-existing statistics, the flag updateStatistics stays false and no statistics-related code is executed. As a result, even though the user table is not empty at the time of compress, no statistics get generated for such an index.
I am proposing to fix the problem by still using the flag to see if an index has pre-existing statistics. If yes, then we should drop those statistics. Next, whether the index has pre-existing statistics or not, go ahead and create new statistics for the index, provided the user table is not currently empty. I ran the derbyall suite on Windows XP with Sun JDK 1.4 and there were no new failures. In addition, I have added a few tests to lang/compressTable.sql. Can someone please review this patch for me?
Hi Mamta, I had a look at your patch.
Your changes seem good to me. Your new tests failed as expected without the code changes, and passed as expected with the code changes. I also had a clean derbyall run with your changes. Do you feel that this change is ready for commit? Is anyone else reviewing this change?
Thanks for taking the time out to review the patch, Bryan. Yes, the patch is ready for commit unless someone else is reviewing it too. Thanks again.
Committed the patch to subversion as revision 464551.
Mamta, do you think this patch needs to be ported to any prior releases?
Bryan, thanks for committing the patch to the trunk. I think it will be useful to port this patch to prior releases because the optimizer relies on the statistics information for indexes on a table. As for documentation, I will open another Jira entry for the doc changes.
The merge to the 10.2 branch was straightforward, and my 10.2 derbyall test run was clean. I propose to commit this merged change to the 10.2 branch.
Merged the trunk fix to the 10.2 branch and committed to subversion as revision 464683.
Merged changes into 10.1 codeline using revision 632065.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
First try at this might be just removing the if in opensource/java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java!updateIndex():
    if (td.statisticsExist(cd))
    {
        cCount = new CardinalityCounter(tc.openSortRowSource(sortIds[index]));
        updateStatistics = true;
    }
    else
        cCount = tc.openSortRowSource(sortIds[index]);
But life is probably not that easy. Likely there is slightly more work to create the statistics row vs. updating it. The work to insert the row can be found in:
opensource/java/engine/org/apache/derby/impl/sql/execute/CreateIndexConstantAction.java
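The difference between the current flow and the one proposed in the patch comment can be sketched in isolation. This is only a toy model under stated assumptions, not Derby code: the class CompressStatsSketch, the methods rebuildIndexOld and rebuildIndexNew, and the map standing in for SYSSTATISTICS are all hypothetical names made up for illustration. It also assumes, as one reading of the comment, that an existing row is dropped whenever the flag is set and that a new row is written only when the table is not empty.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Derby code. The map stands in for SYSSTATISTICS:
// key = index name, value = a cardinality figure for that index.
final class CompressStatsSketch {

    // Flow described as the current behavior: statistics are only
    // refreshed for an index that already has a row.
    static void rebuildIndexOld(Map<String, Long> sysStatistics,
                                String index, long distinctKeys, long rowCount) {
        boolean updateStatistics = sysStatistics.containsKey(index);
        if (updateStatistics) {
            sysStatistics.remove(index);              // drop the existing row
            if (rowCount > 0) {
                sysStatistics.put(index, distinctKeys); // write fresh statistics
            }
        }
        // No prior row: nothing statistics-related happens at all.
    }

    // Flow described as the proposed fix: the flag only decides whether
    // there is something to drop; creation no longer depends on it.
    static void rebuildIndexNew(Map<String, Long> sysStatistics,
                                String index, long distinctKeys, long rowCount) {
        boolean statisticsExist = sysStatistics.containsKey(index);
        if (statisticsExist) {
            sysStatistics.remove(index);              // drop the existing row
        }
        if (rowCount > 0) {
            sysStatistics.put(index, distinctKeys);   // always create when non-empty
        }
    }

    public static void main(String[] args) {
        Map<String, Long> oldFlow = new HashMap<>();
        rebuildIndexOld(oldFlow, "IDX_A", 3L, 10L);
        System.out.println("old flow, no prior row: " + oldFlow);

        Map<String, Long> newFlow = new HashMap<>();
        rebuildIndexNew(newFlow, "IDX_A", 3L, 10L);
        System.out.println("new flow, no prior row: " + newFlow);
    }
}
```

In this model the two flows differ only for an index with no prior row on a non-empty table, which is the case the issue is about. The real change would still need the row-insertion work referenced in CreateIndexConstantAction.java.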