Apache Drill / DRILL-6949

Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled


    Details

      Description

      The following query fails on TPC-H SF100 data with the error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates).

      
      set `exec.hashjoin.enable.runtime_filter` = true;
      set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
      set `planner.enable_broadcast_join` = false;
      
      
      select
        count(*)
      from
        lineitem l1
      where
        l1.l_discount IN (
          select
            distinct(cast(l2.l_discount as double))
          from
            lineitem l2);
      
      reset `exec.hashjoin.enable.runtime_filter`;
      reset `exec.hashjoin.runtime_filter.max.waiting.time`;
      reset `planner.enable_broadcast_join`;
      
      

      The subquery contains the DISTINCT keyword, so its result should not contain duplicate join-key values.
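
      One way to check this expectation is to count the keys the subquery produces on its own. This is a minimal sketch that assumes the same `lineitem` table as the failing query; the aliases `t`, `d`, and `distinct_keys` are introduced here for illustration only.

      ```sql
      -- Count the rows produced by the DISTINCT subquery used in the IN clause.
      -- Each row is one unique join key, so this is the number of keys
      -- the join should see from the subquery side.
      select
        count(*) as distinct_keys
      from (
        select
          distinct(cast(l2.l_discount as double)) as d
        from
          lineitem l2) t;
      ```

      If the count is small, the subquery side of the join holds only a few unique keys, which would be at odds with the "too many join-key duplicates" hint in the error message.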

      I suspect that the failure is caused by semijoin because the query succeeds when semijoin is disabled explicitly.
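
      The comparison run with semijoin disabled might look like the sketch below. The option name `planner.enable_semijoin` is an assumption (it is not stated in this report), so confirm it against `sys.options` on your build before relying on it. The other settings are the ones from the failing run above.

      ```sql
      -- Same settings as the failing run.
      set `exec.hashjoin.enable.runtime_filter` = true;
      set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
      set `planner.enable_broadcast_join` = false;
      -- Assumed option name: turn off semi-join planning for this session.
      set `planner.enable_semijoin` = false;

      select
        count(*)
      from
        lineitem l1
      where
        l1.l_discount IN (
          select
            distinct(cast(l2.l_discount as double))
          from
            lineitem l2);

      -- Restore the defaults.
      reset `planner.enable_semijoin`;
      reset `exec.hashjoin.enable.runtime_filter`;
      reset `exec.hashjoin.runtime_filter.max.waiting.time`;
      reset `planner.enable_broadcast_join`;
      ```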
       

       


            People

            • Assignee: Boaz Ben-Zvi (ben-zvi)
            • Reporter: Abhishek Ravi (aravi5)