Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-9694

IllegalStateException when inlineView has AggregationNode and different alias on the same column

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • None
    • Impala 4.0.0
    • Frontend
    • None
    • ghx-label-4

    Description

      The following query fails with IllegalStateException:

      > create table t1(ts TIMESTAMP, id BIGINT);
      > select count(*) from (
          select lead(ts) over (partition by id order by ts),
                 id foo,
                 id bar
          from t1
        ) v;
      ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=0 sid=1
      

      The query works without the PartitionBy clause:

      select count(*) from (
        select lead(ts) over (order by ts),
               id foo,
               id bar
        from t1
      ) v;
      

      Or avoid different alias on the same column:

      select count(*) from (
        select lead(ts) over (partition by id order by ts),
               id foo
        from t1
      ) v;
      

      Clues
      Enabling TRACE level logging of the coordinator and rerun the failed query, the illegal predicate "v.foo = v.bar" shows up:

      I0426 20:10:10.194453  1068 Analyzer.java:2143] 4241809b91bf54de:d5365a1700000000] Created inferred predicate: BinaryPredicate{op==, SlotRef{label=v.foo, type=BIGINT, id=4} SlotRef{label=v.bar, type=BIGINT, id=5}, isInferred=true}
      I0426 20:10:10.194509  1068 Analyzer.java:2574] 4241809b91bf54de:d5365a1700000000] Assigned BinaryPredicate{op==, SlotRef{label=v.foo, type=BIGINT, id=4} SlotRef{label=v.bar, type=BIGINT, id=5}, isInferred=true}
      

      It's finally migrated into the inlineView and references to "t1.id" which is non-materialized. The planner generates this predicate since there are two auxiliary predicates "v.foo = t1.id", "v.bar = t1.id". These two auxiliary predicates only occur when the AggregationNode exists. We need to look deeper into this bug.

      Workaround
      Replace the second occurrence of the reference column with an identity expression, e.g. change the above-failed query to:

      select count(*) from (
        select lead(ts) over (partition by id order by ts),
               id foo,
               id - 1 + 1 as bar
        from t1
      ) v;
      

      By replacing the second occurrence of "id" to "id- 1 + 1", the planner can't detect that the equivalence between "v.foo" and "v.bar". So the AggregationNode won't have the redundant predicate "v.foo = v.bar" and won't have illegal reference to "t1.id".

      Attachments

        Issue Links

          Activity

            People

              amansinha Aman Sinha
              stigahuang Quanlong Huang
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: