Issue Details (XML | Word | Printable)

Key: DERBY-737
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Minor Minor
Assignee: Mamta A. Satoor
Reporter: Kathey Marsden
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

SYSCS_UTIL.SYSCS_COMPRESS_TABLE should create statistics if they do not exist

Created: 01/Dec/05 09:52 AM   Updated: 28/Feb/08 05:21 PM
Return to search
Component/s: SQL
Affects Version/s: 10.0.2.0, 10.0.2.1, 10.0.2.2, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.2.1.6
Fix Version/s: 10.2.2.0, 10.3.1.4

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works DERBY737_v1_diff_SYSCS_COMPRESS_TABLE.txt 2006-10-12 03:11 AM Mamta A. Satoor 15 kB
Text File Licensed for inclusion in ASF works DERBY737_v1_stat_SYSCS_COMPRESS_TABLE.txt 2006-10-12 03:11 AM Mamta A. Satoor 0.4 kB
Issue Links:
Reference
 

Resolution Date: 16/Oct/06 09:44 PM


 Description  « Hide
There must be an entry in the SYSSTATISTICS table in order for the cardinality statistics in SYSSTATISTICS to be created with SYSCS_UTIL.SYSCS_COMPRESS_TABLE

SYSCS_UTIL.SYSCS_COMPRESS_TABLE should create statistics if they don't exist. The only way to create them if the index was created on an empty table, seems to be to drop and recreate the index after the table has been loaded.


The documentation will also need updating if this change is made.
http://db.apache.org/derby/docs/10.1/tuning/ctunstats57373.html

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Mike Matrigali added a comment - 08/Dec/05 08:51 AM
I don't see any downside to creating the statistics whether they exist or not. Not sure why the distinction was made.
First try at this might be just removing the if in opensource/java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java!updateIndex():

            if (td.statisticsExist(cd))
            {
                cCount = new CardinalityCounter(tc.openSortRowSource(sortIds[ind
ex]));
                updateStatistics = true;
            }
            else
                cCount = tc.openSortRowSource(sortIds[index]);


But life is probably not that easy. Likely there is slightly more work to create the statistics row vs. updating it. The work to insert the row can be
found in:
opensource/java/engine/org/apache/derby/impl/sql/execute/CreateIndexConstantAction.java

Mamta A. Satoor added a comment - 05/Oct/06 06:23 AM
I am slightly confused by the 1st line in the description of this Jira entry. It says "There must be an entry in the SYSSTATISTICS table in order for the cardinality statistics in SYSSTATISTICS to be created with SYSCS_UTIL.SYSCS_COMPRESS_TABLE "
Shouldn't it read "There must be an entry in the SYSSTATISTICS table in order for the cardinality statistics in SYSSTATISTICS to be *updated* with SYSCS_UTIL.SYSCS_COMPRESS_TABLE "? The new sentence I am suggesting describes the current behavior of Derby which is that currently, SYSCS_UTIL.SYSCS_COMPRESS_TABLE can update statistics for indexes that already have rows in SYSSTATISTICS table. If there is no row for an index in SYSSTATISTICS table, then no statistics will be generated for that index.

With this Jira, we want to change Derby behavior such that if there is no row for an index in SYSSTATISTICS table, then at the time of SYSCS_UTIL.SYSCS_COMPRESS_TABLE, we should create a row for that index into SYSSTATISTICS table(provided that there is data in the table on which SYSCS_UTIL.SYSCS_COMPRESS_TABLE is getting run, no?) . If a row already exists for an index, then SYSCS_UTIL.SYSCS_COMPRESS_TABLE already does the job of updating the statistics for that index.

I am interested in working on this issue and want to be clear that I understand the current behavior and the expected new behavior.

Mike Matrigali added a comment - 05/Oct/06 04:58 PM
I believe your interpretation is correct. The request is to always "update" statistics when running the compress table command. Internally this may mean updating a row or creating a new row - the difference need not be documented to the user. Since the entire index is getting rebuilt, I can think of no reason not to gather the statistics and record them at this time.

Mamta A. Satoor added a comment - 12/Oct/06 03:11 AM
I would like to submit a patch for this Jira entry. It is attached as DERBY737_v1_diff_SYSCS_COMPRESS_TABLE.txt The changes have been very localized in AlterTableConstantAction.java!updateIndex() Currently, this method checks if statistics already exist for an index. If yes, then it sets a flag updateStatistics to true. Later, the code checks for this flag and drops the existing statistics and creates new statistics for that index provided the user table at this point is not empty. So, as we can see, if there is an index with no preexisting statistics, the flag updateStatistics will be set to false and hence no statistics related code is executed and hence even though the user table is not empty at the time of compress, no statistics get generated for such an index.

I am proposing to fix the problem by still using the flag to see if an index has pre-existing statistics. If yes,
then we should drop those statistics. Next, whether the index has pre-existing statistics or not, go ahead and
create new statistics for the index provided the user table is not currently empty.

I ran the derbyall suite on Windows XP with Sun JDK 1.4 and there were no new failures. In addition, I have added few tests to lang/compressTable.sql

Can someone please review this patch for me?

Bryan Pendleton added a comment - 15/Oct/06 07:04 PM
Hi Mamta, I had a look at your patch.

Your changes seem good to me. Your new tests failed as expected without the code changes,
and passed as expected with the code changes. I also had a clean derbyall run with your changes.

Do you feel that this change is ready for commit? Is anyone else reviewing this change?

Mamta A. Satoor added a comment - 16/Oct/06 02:24 PM
Thanks for taking the time out to review the patch, Bryan. Yes, the patch is ready for commit unless someone else is reviewing it too. Thanks again.

Bryan Pendleton added a comment - 16/Oct/06 04:33 PM
Committed the patch to subversion as revision 464551.

Mamta, do you think this patch needs to be ported to any prior releases?

Mamta A. Satoor added a comment - 16/Oct/06 04:45 PM
Bryan, thanks for committing the patch to the trunk. I think it will be useful to port this patch to prior releases because optimizer relies on the statistics information for indexes on a table. As for documentation, I will open another Jira entry for the doc changes.

Bryan Pendleton added a comment - 16/Oct/06 07:45 PM
The merge to the 10.2 branch was straightforward, and my 10.2 derbyall test run was clean. I propose to commit this merged change to the 10.2 branch.

Bryan Pendleton added a comment - 16/Oct/06 09:44 PM
Merged the trunk fix to the 10.2 branch and committed to subversion
as revision 464683.

Mamta A. Satoor added a comment - 28/Feb/08 05:21 PM
Merged changes into 10.1 codeline using revision 632065