Issue Details (XML | Word | Printable)

Key: DERBY-4116
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Kathey Marsden
Reporter: Kathey Marsden
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

SYSCS_UTIL.SYSCS_UPDATE_STATISTICS should update the store estimated row count for the table

Created: 24/Mar/09 09:43 PM   Updated: 30/Jun/09 04:02 PM
Component/s: SQL
Affects Version/s: 10.6.0.0
Fix Version/s: 10.5.1.1, 10.6.0.0

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works derby-4116_diff.txt 2009-03-25 05:25 PM Kathey Marsden 10 kB
Text File Licensed for inclusion in ASF works derby-4116_diff2.txt 2009-03-26 01:12 PM Kathey Marsden 11 kB
Text File Licensed for inclusion in ASF works derby-4116_diff3.txt 2009-03-30 05:39 PM Kathey Marsden 12 kB
Issue Links:
Blocker
 
Reference
 

Issue & fix info: High Value Fix
Resolution Date: 02/Apr/09 12:33 PM
Labels:



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Kathey Marsden made changes - 24/Mar/09 09:43 PM
Field Original Value New Value
Link This issue is related to DERBY-269 [ DERBY-269 ]
Kathey Marsden made changes - 24/Mar/09 09:45 PM
Link This issue blocks DERBY-3955 [ DERBY-3955 ]
Kathey Marsden added a comment - 24/Mar/09 09:45 PM
This issue blocks DERBY-3955. The test won't be able to be converted until this issue is resolved.

Kathey Marsden added a comment - 24/Mar/09 11:45 PM
Well just to see if this was a really easy fix, so I could continue my testing, I tried the trivial patch below:

--- java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java (revision 757498)
+++ java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java (working copy)
@@ -744,6 +744,7 @@
                                        rowBufferArray[GROUP_FETCH_SIZE - 1] = lastUniqueKey;
                                        lastUniqueKey = tmp;
                                } // while
+ gsc.setEstimatedRowCount(numRows);
                        } // try
                        finally
                        {


but it doesn't affect my test case. I verified that with this change the code goes through
OpenBTree.setEstimatedRowCount and sets the count to 4000, but there is this comment in OpenBTree.setEstimatedRowCount():

     * This call is currently only supported on Heap conglomerates, it
     * will throw an exception if called on btree conglomerates.
     *
It doesn't throw an exception, but it doesn't seem to do too much either. Any ideas what's wrong?

Knut Anders Hatlen added a comment - 25/Mar/09 10:22 AM
I guess it's only implemented for heap conglomerates because the row count should really be the same in the base table and all its indexes, and there's no need to store the estimated row count for a table more than one place. The optimizer will probably always look for the estimate in the heap conglomerate, so it doesn't notice the estimate in the index.

Does it work as expected if you do something like this near the end of the method?

        ScanController heapSC = tc.openScan(td.getHeapConglomerateId(), ...);
        heapSC.setEstimatedRowCount(rowCount);
        heapSC.close();

Kathey Marsden added a comment - 25/Mar/09 12:45 PM
Thanks Knut, that seemed to do the trick. I will make a proper patch for the issue and do testing. I think I'll keep the index estimatedRowCount update since it does seem to be being set, just in case someone is using it for something.



Kathey Marsden made changes - 25/Mar/09 12:45 PM
Assignee Kathey Marsden [ kmarsden ]
Kathey Marsden added a comment - 25/Mar/09 05:25 PM
Attached is a patch for this issue. derby-4116_diff.txt. It also included the initial partial conversion of selectivity.sql DERBY-3955 which tests this issue. suites.All and derbyall passed except for DERBY-3993, a known issue.

Please review. Especially, please check my arguments to openScan to make sure I got that right.



Kathey Marsden made changes - 25/Mar/09 05:25 PM
Attachment derby-4116_diff.txt [ 12403623 ]
Kathey Marsden made changes - 25/Mar/09 05:26 PM
Derby Info [Patch Available]
Knut Anders Hatlen added a comment - 26/Mar/09 11:35 AM
The arguments to openScan() look correct to me. Since we don't actually scan the conglomerate, many of them wouldn't make any difference anyway, I think.

Now we update the heap conglomerate inside the for loop, which means that we'll update it once per index. Would it be enough to update it once, after we have gone through the last index?

Nit: openScan() should be called outside the try block (if it fails, there's no open scan to close). If it's moved out of the try block, the check for heapSC!=null in the finally block can be removed too.

Kathey Marsden added a comment - 26/Mar/09 12:52 PM
Thanks Knut for reviewing the changes, especially catching that I was doing the update inside the loop. I will make the changes and post a new patch soon.



Kathey Marsden added a comment - 26/Mar/09 01:12 PM
Here is an updated patch for this issue. I am rerunning tests and will check in later today if all goes well.

Kathey Marsden made changes - 26/Mar/09 01:12 PM
Attachment derby-4116_diff2.txt [ 12403710 ]
Knut Anders Hatlen added a comment - 26/Mar/09 01:25 PM
With the latest patch, I think the row count estimate will always be set to 0 if you call SYSCS_UPDATE_STATISTICS on a table with no indexes, even if the table is not empty. Would it be possible to initialize rowCount to -1, so that we can detect that the row count was not calculated, and then add "if (rowCount >= 0)" around the call to setEstimatedRowCount()?

Kathey Marsden added a comment - 26/Mar/09 09:09 PM
Unchecking patch available. I will make a new patch with Knut's suggestion., but it is looking like that won't happen today. Thanks again Knut for your valuable input.

Kathey Marsden made changes - 26/Mar/09 09:09 PM
Derby Info [Patch Available]
Kathey Marsden added a comment - 27/Mar/09 05:03 PM
Attached is an updated patch for this issue. This patch will not update the store estimated row count if no index scan was performed.

I tried to add a test to UpdateStatisticsTest to verify this, but ran into some trouble because the estimated row counts we get back from runtime statistics do not return this information directly. I did manually verify that we do not go thre setEstimatedRowCount for update statistics on a table with no index.

Regarding the use of internal interfaces in checkEstimatedRowCount, Myrna suggested that I clobber and build and make sure there were no build order issues and also that I make sure derbyTesting.jar didn't pick up the product classes. I checked both and the change seems ok from a build perspective.

I ran SelectivityTest and UpdateStatisticsTest and am running tests now but don't anticipate any problems since they don't exercise this code. Please review.

Kathey Marsden made changes - 27/Mar/09 05:03 PM
Derby Info [Patch Available]
Kathey Marsden added a comment - 30/Mar/09 05:39 PM
The patch derby-4116_diff3.txt is the current patch as of this comment. Somehow it did not get attached with my last attempt.

Kathey Marsden made changes - 30/Mar/09 05:39 PM
Attachment derby-4116_diff3.txt [ 12404165 ]
Mike Matrigali added a comment - 30/Mar/09 06:00 PM
patch looks good to me. The only thing I noticed is that the current index scans use an open for read (argument 0), but your new heap open does an open for update. It seems like both
opens should use the same - either read or write, and since the existing scans and update of the estimate seems to work it seems better to open the heap for read also.

Knut Anders Hatlen added a comment - 31/Mar/09 08:13 AM
Patch 3 looks good to me too. Thanks for all the work.

Repository Revision Date User Message
ASF #760497 Tue Mar 31 15:55:26 UTC 2009 kmarsden DERBY-4116 SYSCS_UTIL.SYSCS_UPDATE_STATISTICS should update the store estimated row count for the table
Files Changed
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/_Suite.java
MODIFY /db/derby/code/trunk/java/testing/org/apache/derbyTesting/junit/BaseJDBCTestCase.java
MODIFY /db/derby/code/trunk/java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java
ADD /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/SelectivityTest.java

Kathey Marsden added a comment - 31/Mar/09 03:58 PM
I made the change recommended by Mike and committed revision 760497 to trunk. I will wait for clean nightly results and backport to 10.5. Meanwhile I'll leave this issue open. Thanks for the reviews!


Repository Revision Date User Message
ASF #761265 Thu Apr 02 12:26:38 UTC 2009 kmarsden DERBY-4116 SYSCS_UTIL.SYSCS_UPDATE_STATISTICS should update the store estimated row count for the table
Files Changed
MODIFY /db/derby/code/branches/10.5/java/testing/org/apache/derbyTesting/functionTests/tests/lang/_Suite.java
MODIFY /db/derby/code/branches/10.5/java/testing/org/apache/derbyTesting/junit/BaseJDBCTestCase.java
ADD /db/derby/code/branches/10.5/java/testing/org/apache/derbyTesting/functionTests/tests/lang/SelectivityTest.java (from /db/derby/code/trunk/java/testing/org/apache/derbyTesting/functionTests/tests/lang/SelectivityTest.java)
MODIFY /db/derby/code/branches/10.5/java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction.java

Kathey Marsden made changes - 02/Apr/09 12:33 PM
Status Open [ 1 ] Closed [ 6 ]
Fix Version/s 10.5.1.1 [ 12313771 ]
Fix Version/s 10.6.0.0 [ 12313727 ]
Resolution Fixed [ 1 ]
Myrna van Lunteren made changes - 04/May/09 06:24 PM
Affects Version/s 10.5.1.0 [ 12313770 ]
Dag H. Wanvik made changes - 30/Jun/09 04:02 PM
Issue & fix info [Patch Available] [High Value Fix]