Derby
  1. Derby
  2. DERBY-5911

WHERE condition getting pushed into sub-query with FETCH

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.9.1.0
    • Fix Version/s: 10.8.3.0, 10.9.2.2, 10.10.1.1
    • Component/s: SQL
    • Labels:
      None
    • Environment:
      Tested with Derby 10.9.1.0 on Windows 7 x64, Java 1.6.0_27-b07 server
    • Bug behavior facts:
      Wrong query result

      Description

      Derby pushes query conditions down into subqueries with FETCH limits, thus creating wrong results. Take the following snippet:

      CREATE TABLE COFFEES (COF_NAME VARCHAR(254),PRICE INTEGER);

      INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian', 5);
      INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('French_Roast', 5);
      INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian_Decaf', 20);

      select COF_NAME, PRICE from COFFEES order by COF_NAME fetch next 2 rows only;

      select * from (
      select COF_NAME, PRICE from COFFEES order by COF_NAME fetch next 2 rows only
      ) t where t.PRICE < 10;

      The first query correctly returns the rows (Colombian,5), (Colombian_Decaf,20).

      The second query (which filters the result of the first one) returns (Colombian,5), (French_Roast,5). The row (French_Roast,5) should not be there since it is not a result of the first query. It shows up because (supposedly) the filter condition has been evaluated before the fetch limit.

      1. derby5911b.diff
        4 kB
        Dag H. Wanvik
      2. derby5911b.stat
        0.2 kB
        Dag H. Wanvik
      3. derby5911a.stat
        0.3 kB
        Dag H. Wanvik
      4. derby5911a.diff
        5 kB
        Dag H. Wanvik

        Activity

        Hide
        Stefan Zeiger added a comment -

        The fix works for me (tested with current source from branches/10.9 vs 10.9.1.0 binaries).

        Show
        Stefan Zeiger added a comment - The fix works for me (tested with current source from branches/10.9 vs 10.9.1.0 binaries).
        Hide
        Dag H. Wanvik added a comment -

        Thanks for confirming that, Mamta!

        Show
        Dag H. Wanvik added a comment - Thanks for confirming that, Mamta!
        Hide
        Mamta A. Satoor added a comment -

        I forgot to add a comment that I ran ij script with OFFSET(from jira comment 10/Sep/12 21:14) after applying Dag's patch and it gave the correct results.

        Show
        Mamta A. Satoor added a comment - I forgot to add a comment that I ran ij script with OFFSET(from jira comment 10/Sep/12 21:14) after applying Dag's patch and it gave the correct results.
        Hide
        Dag H. Wanvik added a comment -

        Backported to 10.8 as svn 1384155, resolving. Stefan, if you are OK with the fix, please close this issue.

        Show
        Dag H. Wanvik added a comment - Backported to 10.8 as svn 1384155, resolving. Stefan, if you are OK with the fix, please close this issue.
        Hide
        Dag H. Wanvik added a comment -

        Backported to 10.9 as svn 1384041.

        Show
        Dag H. Wanvik added a comment - Backported to 10.9 as svn 1384041.
        Hide
        Dag H. Wanvik added a comment -

        Regressions ran ok, committed as svn 1384035.

        Show
        Dag H. Wanvik added a comment - Regressions ran ok, committed as svn 1384035.
        Hide
        Knut Anders Hatlen added a comment -

        > Knut: I refrained from pushing inside, cf item #3 above.

        +1. It makes sense to keep the fix simple for now. If someone comes up with a real-world example that's meaningful without ORDER BY, and whose performance would benefit from predicate push-down, we can always add more sophisticated logic later.

        Show
        Knut Anders Hatlen added a comment - > Knut: I refrained from pushing inside, cf item #3 above. +1. It makes sense to keep the fix simple for now. If someone comes up with a real-world example that's meaningful without ORDER BY, and whose performance would benefit from predicate push-down, we can always add more sophisticated logic later.
        Hide
        Dag H. Wanvik added a comment -

        Knut: I refrained from pushing inside, cf item #3 above.

        Show
        Dag H. Wanvik added a comment - Knut: I refrained from pushing inside, cf item #3 above.
        Hide
        Dag H. Wanvik added a comment -

        Thanks, Mamta. Uploaded version b of this patch:

        • added comment as suggested in PRN also for FETCH/OFFSET
        • added test case for OFFSET alone
        • removed check for nested ORDER BY: there should always be one if windowing or FETCH/OFFSET is used, and if the user has omitted it, I think it's better to avoid pushing in any case (least surprise principle)

        Running regressions.

        Show
        Dag H. Wanvik added a comment - Thanks, Mamta. Uploaded version b of this patch: added comment as suggested in PRN also for FETCH/OFFSET added test case for OFFSET alone removed check for nested ORDER BY: there should always be one if windowing or FETCH/OFFSET is used, and if the user has omitted it, I think it's better to avoid pushing in any case (least surprise principle) Running regressions.
        Hide
        Mamta A. Satoor added a comment -

        Dag, I will try the patch for the OFFSET clause query and see how that goes. We probably should add a test case for OFFSET in the new test fixture. Also, I was wondering if the comments in ProjectRestrictNode.java should mention OFFSET and FETCH along with windows function for ORDER BY. Thanks

        Show
        Mamta A. Satoor added a comment - Dag, I will try the patch for the OFFSET clause query and see how that goes. We probably should add a test case for OFFSET in the new test fixture. Also, I was wondering if the comments in ProjectRestrictNode.java should mention OFFSET and FETCH along with windows function for ORDER BY. Thanks
        Hide
        Dag H. Wanvik added a comment - - edited

        Yes, the OFFSET clause alone is also susceptible to the problem. I thought my patch addressed that also? Did you run this query against the patch, Mamta?

        Show
        Dag H. Wanvik added a comment - - edited Yes, the OFFSET clause alone is also susceptible to the problem. I thought my patch addressed that also? Did you run this query against the patch, Mamta?
        Hide
        Mamta A. Satoor added a comment -

        I tried following script in ij for OFFSET(notice the data here is different than the earlier script in this jira)

        CREATE TABLE COFFEES (COF_NAME VARCHAR(254),PRICE INTEGER);
        INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian', 5);
        INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('French_Roast', 15);
        INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian_Decaf', 15);

        ij> select COF_NAME, PRICE from COFFEES order by COF_NAME offset 1 rows;
        COF_NAME |PRICE
        --------------------------------------------------------------------------------------------------------------------------------------------
        Colombian_Decaf |15
        French_Roast |15

        ij> select * from(
        select COF_NAME, PRICE from COFFEES order by COF_NAME offset 1 rows
        )t where t.price>10;
        COF_NAME |PRICE
        --------------------------------------------------------------------------------------------------------------------------------------------
        French_Roast |15

        If I understand this correctly, the last query should have returned two rows.

        Show
        Mamta A. Satoor added a comment - I tried following script in ij for OFFSET(notice the data here is different than the earlier script in this jira) CREATE TABLE COFFEES (COF_NAME VARCHAR(254),PRICE INTEGER); INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian', 5); INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('French_Roast', 15); INSERT INTO COFFEES (COF_NAME,PRICE) VALUES ('Colombian_Decaf', 15); ij> select COF_NAME, PRICE from COFFEES order by COF_NAME offset 1 rows; COF_NAME |PRICE -------------------------------------------------------------------------------------------------------------------------------------------- Colombian_Decaf |15 French_Roast |15 ij> select * from( select COF_NAME, PRICE from COFFEES order by COF_NAME offset 1 rows )t where t.price>10; COF_NAME |PRICE -------------------------------------------------------------------------------------------------------------------------------------------- French_Roast |15 If I understand this correctly, the last query should have returned two rows.
        Hide
        Mamta A. Satoor added a comment -

        Can the OFFSET clause in combination with order by cause any problems?

        Show
        Mamta A. Satoor added a comment - Can the OFFSET clause in combination with order by cause any problems?
        Hide
        Dag H. Wanvik added a comment -

        If there is no ORDER BY inside the FETCH/OFFSET or windowing, one could push the predicate all the way without breaking SQL semantics, but it might still confuse users (give unexpected results), since the row order is usually deterministic in Derby, so it might be a good idea to never push past FETCH/OFFSET or windowing, what do you think?

        Show
        Dag H. Wanvik added a comment - If there is no ORDER BY inside the FETCH/OFFSET or windowing, one could push the predicate all the way without breaking SQL semantics, but it might still confuse users (give unexpected results), since the row order is usually deterministic in Derby, so it might be a good idea to never push past FETCH/OFFSET or windowing, what do you think?
        Hide
        Dag H. Wanvik added a comment - - edited

        Yes that is true, and we'd want to stop pushing one level above the FETCH/OFFSET or windowing, since the WHERE clause is evaluated before the ORDER BY. I'll see if I can make it better without complicating the code too much.

        Show
        Dag H. Wanvik added a comment - - edited Yes that is true, and we'd want to stop pushing one level above the FETCH/OFFSET or windowing, since the WHERE clause is evaluated before the ORDER BY. I'll see if I can make it better without complicating the code too much.
        Hide
        Knut Anders Hatlen added a comment -

        The fix looks right to me. The patch seems to prevent the predicate from being pushed if any of the sub-queries under the query with the fetch clause is ordered. Is that broader than it needs to be? Could it be pushed further down until it reaches the node on top of the ordered one?

        Show
        Knut Anders Hatlen added a comment - The fix looks right to me. The patch seems to prevent the predicate from being pushed if any of the sub-queries under the query with the fetch clause is ordered. Is that broader than it needs to be? Could it be pushed further down until it reaches the node on top of the ordered one?
        Hide
        Dag H. Wanvik added a comment -

        Uploading a patch for this issue. The problem is, as suggested, that the predicate is pushed down during optimization. In presence of ROW_NUMBER and/or FETCH/OFFSET, this cannot be presumed not be correct.

        The patch adds logic to check for this, and adds new test cases to OrderByAndOffsetFetchInSubqueries.

        Running regressions.

        Show
        Dag H. Wanvik added a comment - Uploading a patch for this issue. The problem is, as suggested, that the predicate is pushed down during optimization. In presence of ROW_NUMBER and/or FETCH/OFFSET, this cannot be presumed not be correct. The patch adds logic to check for this, and adds new test cases to OrderByAndOffsetFetchInSubqueries. Running regressions.
        Hide
        Stefan Zeiger added a comment -

        Note that this attempt at a work-around suffers from the same issue:

        select cof_name, price from (
        select row_number() over() as rownum, COF_NAME, PRICE from (select * from COFFEES order by COF_NAME) t
        ) as t where t.rownum <= 2 and t.PRICE < 10

        Show
        Stefan Zeiger added a comment - Note that this attempt at a work-around suffers from the same issue: select cof_name, price from ( select row_number() over() as rownum, COF_NAME, PRICE from (select * from COFFEES order by COF_NAME) t ) as t where t.rownum <= 2 and t.PRICE < 10

          People

          • Assignee:
            Dag H. Wanvik
            Reporter:
            Stefan Zeiger
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development