Derby / DERBY-4007

Optimization of IN with nested SELECT

Details

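    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.4.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Environment: Linux
    • Urgency: Normal
    • Issue & fix info: Repro attached
    • Bug behavior facts: Performance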

    Description

      The problem is with the following query:

      UPDATE summa_records SET base='foobar' WHERE id IN ( SELECT parentId FROM summa_relations WHERE childId='horizon_2615441');

      It takes on the order of 30 s to run, when we expect something on the order of 1-2 ms.

      We have a setup with two tables

      summa_records: 1.5M rows
      summa_relations: ~350,000 rows

      summa_records has an 'id' column that is indexed and is the primary key. The summa_relations table holds mappings between different ids.

      In our case the nested SELECT produces 2 hits, say, 'foo' and 'bar', so the UPDATE on these two hits should be quite snappy. If we run the SELECT alone, it returns instantly. The UPDATE is also instant if we hardcode the ids in the IN clause:

      UPDATE summa_records SET base='foobar' WHERE id IN ('foo', 'bar');

      I'll attach a query plan in a sec.
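      Since both the standalone SELECT and the hardcoded IN list are reported as fast, one possible interim workaround is to split the statement into those two steps on the client side. The following is only a sketch over plain JDBC: the class and method names are made up for illustration, the id columns are assumed to be string-typed (as the literals in the queries suggest), and whether the parameterized IN list gets the same fast plan as the hardcoded one is an assumption that should be checked against the query plan.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Derby or of the attached repro.
public class TwoStepUpdate {

    // Builds "UPDATE summa_records SET base=? WHERE id IN (?, ?, ...)"
    // with one parameter marker per id.
    static String buildUpdateSql(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        String markers = String.join(", ", Collections.nCopies(idCount, "?"));
        return "UPDATE summa_records SET base=? WHERE id IN (" + markers + ")";
    }

    // Step 1: run the nested SELECT on its own and collect the parent ids.
    static List<String> findParentIds(Connection conn, String childId)
            throws SQLException {
        List<String> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT parentId FROM summa_relations WHERE childId=?")) {
            ps.setString(1, childId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getString(1));
                }
            }
        }
        return ids;
    }

    // Step 2: update summa_records using an explicit IN list of those ids.
    // Returns the number of rows updated (0 if the SELECT found nothing).
    static int updateBase(Connection conn, String childId, String base)
            throws SQLException {
        List<String> ids = findParentIds(conn, childId);
        if (ids.isEmpty()) {
            return 0;
        }
        try (PreparedStatement ps =
                conn.prepareStatement(buildUpdateSql(ids.size()))) {
            ps.setString(1, base);
            for (int i = 0; i < ids.size(); i++) {
                ps.setString(i + 2, ids.get(i)); // markers 2..n+1 are the ids
            }
            return ps.executeUpdate();
        }
    }
}
```

      A call such as TwoStepUpdate.updateBase(conn, "horizon_2615441", "foobar") would stand in for the original statement. Unlike the single UPDATE, the two steps are only atomic if they run inside one transaction (auto-commit off), and this does not address the optimizer behaviour the issue is about.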

      Attachments

        1. CreateDatabase4007.java (1 kB, Knut Anders Hatlen)
        2. dblook_p_index.log (1.0 kB, Mikkel Kamstrup Erlandsen)
        3. dblook.log (1.0 kB, Mikkel Kamstrup Erlandsen)
        4. derby_p_index.log (5 kB, Mikkel Kamstrup Erlandsen)
        5. derby.log (4 kB, Mikkel Kamstrup Erlandsen)

          People

            Assignee: Unassigned
            Reporter: Mikkel Kamstrup Erlandsen (kamstrup)
            Votes: 1
            Watchers: 1
