Derby
  1. Derby
  2. DERBY-2130

Optimizer performance slowdown from 10.1 to 10.2

    Details

    • Urgency:
      Normal
    • Issue & fix info:
      High Value Fix, Newcomer, Repro attached
    • Bug behavior facts:
      Performance, Regression

      Description

      Attached is 'repro.sql', an IJ script which demonstrates what I
      believe to be a serious performance issue in the Optimizer.

      I have run this script in a number of configurations:

      • 10.1.2.1: the script runs successfully. The 'prepare' statement
        takes about 90 seconds, on a fairly powerful Windows machine
      • 10.1.3.1: the script produces a NPE. I believe this is DERBY-1777
      • 10.2.1.8/trunk: the script runs successfully. The 'prepare' statement
        often takes about 220 seconds, on the same Windows machine

      Intermittently, on 10.2 and on the trunk, the prepare statement takes
      15+ minutes. I cannot reliably reproduce this; I run the same script
      several times in a row and I cannot predict whether it will take 220
      seconds or whether it will take 15+ minutes.

      I am quite motivated to work on this problem, as this is blocking me from
      using Derby for a project that I'm quite keen on, but I need some
      suggestions and ideas about how to attack it. From my perspective
      there are 3 primary topics:

      1) Why did optimizer performance for this query degrade so significantly
      from 10.1.2.1 to 10.2? The optimizer seems to be at least 2.5 times slower,
      for this particular query at least, in 10.2. Sometimes it is 10x slower.

      2) What is the source of the non-determinism? Why does the optimizer
      often take 4 minutes to optimize this query on the trunk, but sometimes
      take 15+ minutes? I don't believe that I'm changing anything from
      run to run.

      3) Can we improve the optimizer performance even beyond what it was
      for 10.1.2? I realize that this is an ugly query, but I was hoping to
      see an optimization time of 5-10 seconds, not 90 seconds (and certainly
      not 220 seconds).

      I have attempted to start answering some of these questions, with
      limited success. Here is some of what I think I've discovered so far:

      • the optimizer changes in 10.2 seem to have given the optimizer many
        more choices of possible query plans to consider. I think this means
        that, if the optimizer does not time out, it will spend substantially
        more time optimizing because there are more choices to evaluate. Does
        this by itself mean that the optimizer will take 2.5 times longer in
        10.2 than it did in 10.1?
      • something about this query seems to make the costing mechanism go
        haywire, and produce extreme costs. While stepping through the
        optimization of this query in the debugger I have seen it compute
        costs like 1e63 and 1e200. This might be very closely related to
        DERBY-1905, although I don't think I'm doing any subqueries here.
        But maybe I'm misunderstanding the term "subquery" in DERBY-1905.
        At any rate, due to the enormous estimated costs, timeout does not
        occur.
      • the WHERE clause in this query is converted during compilation to
        an equivalent IN clause, I believe, which then causes me to run into
        a number of the problems described in DERBY-47 and DERBY-713.
        Specifically, rather than constructing a plan which involves 4
        index probes for the 4 WHERE clause values, the optimizer decides
        that an index scan must be performed and that it will have to process
        the entire index (because the query uses parameter markers, not
        literal values). So perhaps solving DERBY-47 would help me
      • the optimizer in fact comes up with a "decent" query plan quite quickly.
        I have experimented with placing a hard limit into the optimizer
        timeout code, so that I can force optimization to stop after an
        arbitrary fixed period of time. Then I have been able to set that
        value to as low as 1 second, and the optimizer has produced plans
        that then execute in a few milliseconds. Of course, I have only tried
        this with a trivial amount of data in my database, so it's possible
        that the plan produced by the optimizer after just a second of
        optimizing is in fact poor, and I'm just not noticing it because my
        data sizes are so small.

      At this point, what would be really helpful to me would be some suggestions
      about some general approaches or techniques to try to start breaking down
      and analyzing this problem.

      1. repro.sql
        9 kB
        Bryan Pendleton
      2. plans.diff
        13 kB
        Bryan Pendleton
      3. plan10_2.txt
        78 kB
        Bryan Pendleton
      4. plan10_1_2_1.txt
        79 kB
        Bryan Pendleton
      5. jumpReset.patch
        2 kB
        A B
      6. jumpReset_10_11_trunk_diff.txt
        2 kB
        Kathey Marsden

        Issue Links

          Activity

          Bryan Pendleton created issue -
          Bryan Pendleton made changes -
          Field Original Value New Value
          Attachment repro.sql [ 12346070 ]
          Bryan Pendleton made changes -
          Attachment plan10_2.txt [ 12346469 ]
          Attachment plan10_1_2_1.txt [ 12346468 ]
          Attachment plans.diff [ 12346470 ]
          A B made changes -
          Attachment jumpReset.patch [ 12347032 ]
          Kathey Marsden made changes -
          Derby Info [Regression]
          Derby Categories [High Value Fix]
          Dag H. Wanvik made changes -
          Component/s Performance [ 11709 ]
          Dag H. Wanvik made changes -
          Derby Categories [High Value Fix] [High Value Fix, Performance]
          Dag H. Wanvik made changes -
          Bug behavior facts [Performance, High Value Fix] [Regression]
          Dag H. Wanvik made changes -
          Bug behavior facts [Regression] [Performance, Regression]
          Issue & fix info [High Value Fix]
          Knut Anders Hatlen made changes -
          Urgency Normal
          Issue & fix info [High Value Fix] [High Value Fix, Repro attached]
          Bryan Pendleton made changes -
          Link This issue relates to DERBY-4818 [ DERBY-4818 ]
          Kathey Marsden made changes -
          Labels derby_triage10_5_2
          Kathey Marsden made changes -
          Labels derby_triage10_5_2 derby_triage10_11
          Gavin made changes -
          Workflow jira [ 12391074 ] Default workflow, editable Closed status [ 12802014 ]
          Kathey Marsden made changes -
          Issue & fix info High Value Fix,Repro attached [ 10422, 10424 ] High Value Fix,Newcomer,Repro attached [ 10422, 10423, 10424 ]
          Kathey Marsden made changes -
          Attachment jumpReset_10_11_trunk_diff.txt [ 12609310 ]
          Kathey Marsden made changes -
          Link This issue relates to DERBY-6392 [ DERBY-6392 ]
          Kathey Marsden made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 10.11.0.0 [ 12324243 ]
          Resolution Fixed [ 1 ]
          Mike Matrigali made changes -
          Link This issue is related to DERBY-6481 [ DERBY-6481 ]
          Mike Matrigali made changes -
          Assignee Mike Matrigali [ mikem ]
          Mike Matrigali made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Mike Matrigali made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Fix Version/s 10.10.1.4 [ 12324247 ]
          Resolution Fixed [ 1 ]
          Myrna van Lunteren made changes -
          Fix Version/s 10.10.2.0 [ 12326659 ]
          Fix Version/s 10.10.1.4 [ 12324247 ]
          Myrna van Lunteren made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Mike Matrigali
              Reporter:
              Bryan Pendleton
            • Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development