Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4301

IGNORE NULLS clause in analytic function fails with WHERE-clause subqueries.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend
    • Labels:
      None

      Description

      The IGNORE NULLS clause for the FIRST_VALUE() and LAST_VALUE() analytic functions does not work in combination with WHERE-clause subqueries that need to be rewritten into joins. Such queries fail with an analysis exception.

      Reproduction:

      select first_value(tinyint_col ignore nulls) over (order by id)
      from functional.alltypesagg t1
      where exists (select 1 from functional.alltypes t2 where t1.id = t2.id)
      ERROR: AnalysisException: Function FIRST_VALUE_IGNORE_NULLS does not accept the keyword IGNORE NULLS.
      

      Workaround
      Manually rewrite the WHERE-clause subquery into an appropriate join.

        Activity

        Hide
        alex.behm Alexander Behm added a comment -

        commit c6fc89913453befbf03d206d20c35323288702f4
        Author: Alex Behm <alex.behm@cloudera.com>
        Date: Sat Oct 15 21:35:56 2016 -0700

        IMPALA-4301: Fix IGNORE NULLS with subquery rewriting.

        AnayticExpr.analyze() replaces the original FIRST/LAST_VALUE
        function with a FIRST/LAST_VALUE_IGNORE_NULLS function if
        the IGNORE NULLS clause is specified.

        The bug was that several places in AnalyticExpr.analyze() assumed
        and asserted that only the original FIRST/LAST_VALUE function
        could be encountered during analysis. However, with subquery
        rewriting the IGNORE NULLS version of the function may also be
        seen because the whole statement is re-analyzed after rewriting.

        The fix is to unset the IGNORE NULLS flag of the function params
        after changing the analytic function name.

        Change-Id: I708de7925fe6aeef582fd7510da93d24c71229d9
        Reviewed-on: http://gerrit.cloudera.org:8080/4732
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins

        Show
        alex.behm Alexander Behm added a comment - commit c6fc89913453befbf03d206d20c35323288702f4 Author: Alex Behm <alex.behm@cloudera.com> Date: Sat Oct 15 21:35:56 2016 -0700 IMPALA-4301 : Fix IGNORE NULLS with subquery rewriting. AnayticExpr.analyze() replaces the original FIRST/LAST_VALUE function with a FIRST/LAST_VALUE_IGNORE_NULLS function if the IGNORE NULLS clause is specified. The bug was that several places in AnalyticExpr.analyze() assumed and asserted that only the original FIRST/LAST_VALUE function could be encountered during analysis. However, with subquery rewriting the IGNORE NULLS version of the function may also be seen because the whole statement is re-analyzed after rewriting. The fix is to unset the IGNORE NULLS flag of the function params after changing the analytic function name. Change-Id: I708de7925fe6aeef582fd7510da93d24c71229d9 Reviewed-on: http://gerrit.cloudera.org:8080/4732 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins

          People

          • Assignee:
            alex.behm Alexander Behm
            Reporter:
            alex.behm Alexander Behm
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development