Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.12.0
    • Component/s: core
    • Labels:
      None

      Description

      I'm not sure why, but DateRangeRules can lose some filters. For example in this query:

      SELECT COUNT(*) FROM druid.foo WHERE EXTRACT(YEAR FROM __time) = 2000 AND EXTRACT(MONTH FROM __time) IN (2, 3, 5)
      

      The filter gets resolved to __time >= 2000-02-01 and __time < 2000-03-01, losing the months "3" and "5". Removing DateRangeRules.FILTER_INSTANCE from the planner's rule set stops this from happening.

        Activity

        Hide
        julianhyde Julian Hyde added a comment -

        Resolved in release 1.12.0 (2017-03-24).

        Show
        julianhyde Julian Hyde added a comment - Resolved in release 1.12.0 (2017-03-24).
        Hide
        julianhyde Julian Hyde added a comment -

        I had missed one plan improved by this fix. Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/4a3df4ef.

        Show
        julianhyde Julian Hyde added a comment - I had missed one plan improved by this fix. Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/4a3df4ef .
        Hide
        julianhyde Julian Hyde added a comment -

        Thanks!

        Show
        julianhyde Julian Hyde added a comment - Thanks!
        Hide
        gian Gian Merlino added a comment -

        I collected the failing tests in CALCITE-1658.

        Show
        gian Gian Merlino added a comment - I collected the failing tests in CALCITE-1658 .
        Hide
        gian Gian Merlino added a comment -

        Sure thing Julian. I'll poke at it a bit more and write up another jira.

        Show
        gian Gian Merlino added a comment - Sure thing Julian. I'll poke at it a bit more and write up another jira.
        Hide
        julianhyde Julian Hyde added a comment -

        Gian Merlino, OK, it seems that we have several problems. I confess the logic was a bit ad hoc (i.e. tests are required to find all the bugs) but I thought I made it a bit more robust. Maybe it's the '<>' on year; we the algorithm relies on having a finite set of years to work on before it starts chopping the range into months and days. Can you please log another bug, and include any and all Druid test cases that fail when this rule is enabled. Then we can cover them all in one fell swoop.

        Show
        julianhyde Julian Hyde added a comment - Gian Merlino , OK, it seems that we have several problems. I confess the logic was a bit ad hoc (i.e. tests are required to find all the bugs) but I thought I made it a bit more robust. Maybe it's the '<>' on year; we the algorithm relies on having a finite set of years to work on before it starts chopping the range into months and days. Can you please log another bug, and include any and all Druid test cases that fail when this rule is enabled. Then we can cover them all in one fell swoop.
        Hide
        gian Gian Merlino added a comment -

        Btw, the way to read that test case is

        1. SQL
        2. Expected Druid queries the SQL is translated to
        3. Expected results

        Show
        gian Gian Merlino added a comment - Btw, the way to read that test case is 1. SQL 2. Expected Druid queries the SQL is translated to 3. Expected results
        Hide
        gian Gian Merlino added a comment - - edited

        Hey Julian,

        I haven't reviewed the patch (I still haven't wrapped my head around how DateRangeRules is supposed to work) but I tried it in Druid and this test case failed:

            testQuery(
                "SELECT COUNT(*) FROM druid.foo\n"
                + "WHERE\n"
                + "  EXTRACT(YEAR FROM __time) <> 2000 AND ("
                + "       (EXTRACT(YEAR FROM __time) = 2000 AND EXTRACT(MONTH FROM __time) IN (2, 3, 5))\n"
                + "    OR (EXTRACT(YEAR FROM __time) = 2001 AND EXTRACT(MONTH FROM __time) = 1)\n"
                + "  )",
                ImmutableList.<Query>of(
                    Druids.newTimeseriesQueryBuilder()
                          .dataSource(CalciteTests.DATASOURCE1)
                          .intervals(QSS(new Interval("2001-01-01/P1M")))
                          .granularity(QueryGranularities.ALL)
                          .aggregators(AGGS(new CountAggregatorFactory("a0")))
                          .context(TIMESERIES_CONTEXT_DEFAULT)
                          .build()
                ),
                ImmutableList.of(
                    new Object[]{3L}
                )
            );
        

        The query got planned into

        BindableValues(tuples=[[{ 0 }]]
        

        so I think that means the predicate was incorrectly simplified to "false". Without DateRangeRules.FILTER_INSTANCE, the result is correct. (Although of course in that case the query doesn't match the expected query, as the filter ends up a regular filter rather than "intervals" filter)

        Show
        gian Gian Merlino added a comment - - edited Hey Julian, I haven't reviewed the patch (I still haven't wrapped my head around how DateRangeRules is supposed to work) but I tried it in Druid and this test case failed: testQuery( "SELECT COUNT(*) FROM druid.foo\n" + "WHERE\n" + " EXTRACT(YEAR FROM __time) <> 2000 AND (" + " (EXTRACT(YEAR FROM __time) = 2000 AND EXTRACT(MONTH FROM __time) IN (2, 3, 5))\n" + " OR (EXTRACT(YEAR FROM __time) = 2001 AND EXTRACT(MONTH FROM __time) = 1)\n" + " )" , ImmutableList.<Query>of( Druids.newTimeseriesQueryBuilder() .dataSource(CalciteTests.DATASOURCE1) .intervals(QSS( new Interval( "2001-01-01/P1M" ))) .granularity(QueryGranularities.ALL) .aggregators(AGGS( new CountAggregatorFactory( "a0" ))) .context(TIMESERIES_CONTEXT_DEFAULT) .build() ), ImmutableList.of( new Object []{3L} ) ); The query got planned into BindableValues(tuples=[[{ 0 }]] so I think that means the predicate was incorrectly simplified to "false". Without DateRangeRules.FILTER_INSTANCE, the result is correct. (Although of course in that case the query doesn't match the expected query, as the filter ends up a regular filter rather than "intervals" filter)
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/095acb10 .
        Hide
        julianhyde Julian Hyde added a comment -
        Show
        julianhyde Julian Hyde added a comment - Gian Merlino , I have a tentative fix: can you please review https://github.com/julianhyde/calcite/commit/c46b3ffb74ec588c72a1a782253e24b5564fd0ee .

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            gian Gian Merlino
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development