Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3065

Incorrect placement of a !empty() predicate at an outer join.

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      The following query incorrectly returns 0.

      SELECT COUNT(*)
      FROM table_3.field_87 a1
      INNER JOIN table_1.field_18 a2 ON (a2.key) = (a1.value)
      INNER JOIN a2.value a3
      LEFT JOIN (
      SELECT
      (MAX(a7.pos)) + (711) AS int_col
      FROM a2.value a4
      INNER JOIN a4.item a5
      INNER JOIN a4.item a6
      INNER JOIN a2.value.item a7 ON ((a7.pos) = (a4.pos)) AND ((a7.pos) = (a4.pos))
      WHERE
      (822.8632536034) < (a6.pos)
      ) a8
      RIGHT JOIN table_3.field_88.value a9 ON (a3.pos) >= (a3.pos)
      WHERE
      (a9.pos) = (a3.pos)
      

      It's strange that when the Where clause (WHERE (a9.pos) = (a3.pos)) is removed, the following error is returned.

      ERROR: NotImplementedException: Error generating a valid execution plan for this query. A RIGHT OUTER JOIN type with no equi-join predicates can only be executed with a single node plan.
      

      The equivalent query in postgres returns 14299657:

      SELECT COUNT(*)
      FROM table_3_field_87 a1
      INNER JOIN  table_1_field_18 a2 ON (a2.key) = (a1.value)
      INNER JOIN LATERAL (
      SELECT a3.*
      FROM table_1_field_18__values a3
      WHERE
      (a2.id) = (a3.table_1_field_18_id)
      ) a3 ON True
      LEFT JOIN LATERAL (
      SELECT
      (MAX(a7.idx)) + (711) AS int_col
      FROM table_1_field_18__values a4
      INNER JOIN LATERAL (
      SELECT a5.*
      FROM table_1_field_18__values__values a5
      WHERE
      (a4.id) = (a5.table_1_field_18__values_id)
      ) a5 ON True
      INNER JOIN LATERAL (
      SELECT a6.*
      FROM table_1_field_18__values__values a6
      WHERE
      (a4.id) = (a6.table_1_field_18__values_id)
      ) a6 ON True
      INNER JOIN  (
      SELECT a7.*
      FROM table_1_field_18__values tmp_alias_1
      INNER JOIN  table_1_field_18__values__values a7 ON (tmp_alias_1.id) = (a7.table_1_field_18__values_id)
      WHERE
      (a2.id) = (tmp_alias_1.table_1_field_18_id)
      ) a7 ON ((a7.idx) = (a4.idx)) AND ((a7.idx) = (a4.idx))
      WHERE
      ((a2.id) = (a4.table_1_field_18_id)) AND ((822.8632536034) < (a6.idx))
      ) a8 ON True
      RIGHT JOIN  table_3_field_88__values a9 ON (a3.idx) >= (a3.idx)
      WHERE
      (a9.idx) = (a3.idx)
      

      To reproduce:

      ssh dev@vd0206.halxg.cloudera.com -p 22222 (pw: cloudera)
      Impala db name: random_nested_db_0
      Postgres db name: random_nested_db_flat_0
      
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            alex.behm Alexander Behm
            tarasbob Taras Bobrovytsky
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment