IMPALA-4592: IllegalStateException when using nondeterministic functions in partition filter

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend

      Description

      Using non-deterministic functions like rand() in certain expressions that don't reference columns can trigger an IllegalStateException. I hit this on commit b9034ea0d54ad40e11b482b577362ceee3768f1e:

      [tarmstrong-box.ca.cloudera.com:21000] > explain select * from functional.alltypes where rand() > 100 limit 5;
      Query: explain select * from functional.alltypes where rand() > 100 limit 5
      ERROR: IllegalStateException: null
      


          Activity

          alex.behm Alexander Behm added a comment -

          Pretty sure this is fixed by IMPALA-4574. I hit this problem there as well.

          tarmstrong Tim Armstrong added a comment -

          Great! I'll try rebasing.

          tarmstrong Tim Armstrong added a comment -

          This still fails for me:

          explain select * from functional.alltypes where rand() > 1000 limit 5;
          
          tarmstrong Tim Armstrong added a comment -

          It looks like the problem is that partition pruning assumes implicitly that if it substitutes all the partition values into an expression bound only by partition columns, then the resulting expression will be constant.
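To illustrate the assumption described above, here is a minimal sketch of the idea, not Impala's actual frontend code. The class and function names (Literal, SlotRef, FunctionCall, GreaterThan, is_bound_by, substitute, is_constant) are hypothetical, and the partition columns year and month are assumed for the example. A predicate with no column references is vacuously "bound" by the partition columns, yet substituting partition values into it does not make it constant when it calls a non-deterministic function.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Union


@dataclass(frozen=True)
class Literal:
    value: float


@dataclass(frozen=True)
class SlotRef:
    column: str


@dataclass(frozen=True)
class FunctionCall:
    name: str
    deterministic: bool  # hypothetical flag; rand() would be False


@dataclass(frozen=True)
class GreaterThan:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, SlotRef, FunctionCall, GreaterThan]


def referenced_columns(e: Expr) -> Set[str]:
    """Collect every column the expression refers to."""
    if isinstance(e, SlotRef):
        return {e.column}
    if isinstance(e, GreaterThan):
        return referenced_columns(e.left) | referenced_columns(e.right)
    return set()


def is_bound_by(e: Expr, columns: Iterable[str]) -> bool:
    """True if e references only the given columns (vacuously true for none)."""
    return referenced_columns(e) <= set(columns)


def substitute(e: Expr, values: Dict[str, float]) -> Expr:
    """Replace each column reference with that partition's literal value."""
    if isinstance(e, SlotRef):
        return Literal(values[e.column])
    if isinstance(e, GreaterThan):
        return GreaterThan(substitute(e.left, values), substitute(e.right, values))
    return e


def is_constant(e: Expr) -> bool:
    """A non-deterministic call is never constant, even with no column refs."""
    if isinstance(e, Literal):
        return True
    if isinstance(e, SlotRef):
        return False
    if isinstance(e, FunctionCall):
        return e.deterministic
    return is_constant(e.left) and is_constant(e.right)


partition_cols = ["year", "month"]           # assumed partition columns
partition_values = {"year": 2009, "month": 1}

# rand() > 100: passes the "bound by partition columns" check, but is not
# constant after substitution, so a pruner that assumes constancy breaks.
rand_pred = GreaterThan(FunctionCall("rand", deterministic=False), Literal(100))
year_pred = GreaterThan(SlotRef("year"), Literal(2008))
```

In this sketch, year_pred becomes constant after substitution while rand_pred does not, which is the case the implicit assumption misses.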

          alex.behm Alexander Behm added a comment -

          Thanks for checking. Looks like a different bug then. Will look into it.

          tarmstrong Tim Armstrong added a comment -

          I think there's some connection between the partition pruning issue and the fact that "where rand() > 0.9" doesn't work.

          tarmstrong Tim Armstrong added a comment -

          I'm going to upgrade the priority since this is a regression.

          alex.behm Alexander Behm added a comment -

          In Impala 2.7 the query would run but return wrong results, because the non-deterministic function was evaluated at the partition level when it should be evaluated at the row level. Returning an error is better than returning wrong results.

          We should clarify the error message, but that's not really a blocker issue.
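The difference between partition-level and row-level evaluation can be sketched as follows. This is a toy model under stated assumptions, not Impala code: the function names filter_per_partition and filter_per_row, the dict-of-lists table layout, and the use of Python's random module in place of rand() are all hypothetical.

```python
import random
from typing import Callable, Dict, List


def filter_per_partition(
    partitions: Dict[str, List[int]], predicate: Callable[[], bool]
) -> List[int]:
    """Evaluate the predicate once per partition: each partition is kept
    or dropped as a whole, which is wrong for a non-deterministic predicate."""
    kept: List[int] = []
    for rows in partitions.values():
        if predicate():
            kept.extend(rows)
    return kept


def filter_per_row(
    partitions: Dict[str, List[int]], predicate: Callable[[], bool]
) -> List[int]:
    """Evaluate the predicate once per row, as the query semantics require."""
    return [row for rows in partitions.values() for row in rows if predicate()]


def make_rand_predicate(seed: int, threshold: float) -> Callable[[], bool]:
    """Stand-in for 'rand() > threshold' using a seeded generator."""
    rng = random.Random(seed)
    return lambda: rng.random() > threshold


# Toy table: 24 partitions of 100 rows each (sizes chosen for illustration).
table = {f"p{p}": list(range(p * 100, (p + 1) * 100)) for p in range(24)}

partition_level = filter_per_partition(table, make_rand_predicate(0, 0.9))
row_level = filter_per_row(table, make_rand_predicate(0, 0.9))
```

With partition-level evaluation the result is always made up of whole partitions, so its size is a multiple of the partition size; row-level evaluation samples individual rows independently.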

          alex.behm Alexander Behm added a comment -

          This fix only improves the error message. A proper fix is tracked by IMPALA-4605.

          commit 6098ac7162742c11350de708188ce6c3f7ce11a7
          Author: Alex Behm <alex.behm@cloudera.com>
          Date: Tue Dec 6 14:30:43 2016 -0800

          IMPALA-4592: Improve error msg for non-deterministic predicates.

          Impala cannot correctly evaluate or assign some non-deterministic
          predicates. This patch improves the error message shown when
          trying to evaluate such unsupported predicates for the purpose
          of partition pruning.

          Change-Id: I94765f62bde94f4faa7fc5c26d928099ca1496d1
          Reviewed-on: http://gerrit.cloudera.org:8080/5386
          Reviewed-by: Alex Behm <alex.behm@cloudera.com>
          Tested-by: Internal Jenkins


            People

            • Assignee:
              alex.behm Alexander Behm
              Reporter:
              tarmstrong Tim Armstrong