Derby
  1. Derby
  2. DERBY-3498

Performance sensitive to FROM order list in three way join with GROUP BY

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.2.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Urgency:
      Normal
    • Bug behavior facts:
      Performance

      Description

      The Order Entry system test has this query in Checks.testCondition2():

      SELECT D.D_ID, D.D_W_ID,D.D_NEXT_O_ID,
      MAX(O.O_ID),MAX(N.NO_O_ID) FROM NEWORDERS N,
      DISTRICT D, ORDERS O WHERE D.D_W_ID=O.O_W_ID
      AND D.D_W_ID = N.NO_W_ID AND D.D_ID = O.O_D_ID
      AND D.D_ID = N.NO_D_ID GROUP BY
      D.D_ID,D.D_W_ID,D.D_NEXT_O_ID

      Changing the FROM list to put DISTRICT first reduced the running time from ~138seconds to ~105 and a similar reduction on another machine.
      (with a 1 warehouse database)

      Beyond that, the query seems to take too long. 10 rows will be returned and for each district row (returned row) the MAX() values in the select list should just be a max lookup in the primary key for the ORDERS and NEWORDERS table. The primary key for these tables is W_ID, D_ID, O_ID.
      Thus it should be a scan of the 10 row DISTRICT table with 20 index lookups, hard to see how that should table 100 seconds.

      One can see this by using the java-system-mini-codeline-jars target.

        Activity

        Hide
        Tiago R. Espinha added a comment -

        Triaged for 10.5.2.

        Assigned Normal urgency.

        Show
        Tiago R. Espinha added a comment - Triaged for 10.5.2. Assigned Normal urgency.
        Hide
        Daniel John Debrunner added a comment -

        Creating the indexes after the load did not affect the running time of the condition2 test.

        Adding more rows to the ORDERS and NEWORDERS table has a major effect on the running time, up to around 265seconds when adding 13,500 rows to ORDERS (originally 30,000) and NEWORDERS (9,000).

        Show
        Daniel John Debrunner added a comment - Creating the indexes after the load did not affect the running time of the condition2 test. Adding more rows to the ORDERS and NEWORDERS table has a major effect on the running time, up to around 265seconds when adding 13,500 rows to ORDERS (originally 30,000) and NEWORDERS (9,000).
        Hide
        Daniel John Debrunner added a comment -

        Indexes and constraints are created before the load in this case.

        Show
        Daniel John Debrunner added a comment - Indexes and constraints are created before the load in this case.
        Hide
        A B added a comment - - edited

        I notice that run/Populate.java has a parameter "createConstraintsBeforeLoad" which defaults to "true", meaning that all indexes are created BEFORE data is loaded. Is that parameter overridden by the "junit-system-mini-codeline-jars" target? If not, then it's possible that the statistics for the tables are completely wrong (known issue in Derby when indexes are created before data is loaded), which could cause the optimizer to pick a bad plan.

        If keeping "createConstraintsBeforeLoad()" is important to the test, then it might be good to add some calls to compress the tables after the load completes, so that the stats are updated accordingly. That might (hopefully) help things a bit...

        Show
        A B added a comment - - edited I notice that run/Populate.java has a parameter "createConstraintsBeforeLoad" which defaults to "true", meaning that all indexes are created BEFORE data is loaded. Is that parameter overridden by the "junit-system-mini-codeline-jars" target? If not, then it's possible that the statistics for the tables are completely wrong (known issue in Derby when indexes are created before data is loaded), which could cause the optimizer to pick a bad plan. If keeping "createConstraintsBeforeLoad()" is important to the test, then it might be good to add some calls to compress the tables after the load completes, so that the stats are updated accordingly. That might (hopefully) help things a bit...

          People

          • Assignee:
            Unassigned
            Reporter:
            Daniel John Debrunner
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Development