  1. IMPALA
  2. IMPALA-4531

In-predicate not pushed to Kudu scan node

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Kudu_Impala
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:
      None

      Description

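      IN predicates are not pushed to the Kudu scan node, although the Kudu client appears to support them. In the plan below, only the equality predicate is listed under "kudu predicates"; the IN predicate is left under "predicates".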

      Query

      select count(*) from lineitem where l_orderkey in (1,2) and l_orderkey = 10
      

      Plan

      +----------------------------------------------------------+
      | Explain String                                           |
      +----------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=10.00MB VCores=1 |
      |                                                          |
      | PLAN-ROOT SINK                                           |
      | |                                                        |
      | 03:AGGREGATE [FINALIZE]                                  |
      | |  output: count:merge(*)                                |
      | |                                                        |
      | 02:EXCHANGE [UNPARTITIONED]                              |
      | |                                                        |
      | 01:AGGREGATE                                             |
      | |  output: count(*)                                      |
      | |                                                        |
      | 00:SCAN KUDU [tpch_100_kudu.lineitem]                    |
      |    predicates: l_orderkey IN (1, 2)                      |
      |    kudu predicates: l_orderkey = 10                      |
      +----------------------------------------------------------+
      

        Activity

        mjacobs Matthew Jacobs added a comment -

        commit 9f387c858354a5c7df5fd922e731f887aa7e51f7
        Author: Matthew Jacobs <mj@cloudera.com>
        Date: Thu Dec 1 18:16:33 2016 -0800

        IMPALA-4571: Push IN predicates to Kudu

        Fixes the KuduScanNode to convert InPredicates to
        KuduPredicates and push them to the Kudu scan if possible.

        An InPredicate can be pushed to the scan if expression is of
        the exact form:
        <SlotRef> IN (<LiteralExpr>, <LiteralExpr>, ...)

        That means the InPredicate has the following properties:
        1) It has a list of literal values (i.e. not a subquery);
        All values are LiteralExprs (not SlotRefs).
        2) Not negative, i.e. only 'IN' supported, not 'NOT IN'
        3) The SlotRef is not wrapped in any casts
        4) The types of all values match the type of the SlotRef
        exactly.

        A planner test was added exercising all supported types as
        well as exprs where the values would not be supported.

        TODO: perf testing
        TODO: consider a limit on the number of list values before
        keeping the predicate on the Impala scan node
        (determine from testing)

        Change-Id: I8988d4819d20d467b48e286917e347ca00f60cf0
        Reviewed-on: http://gerrit.cloudera.org:8080/5316
        Reviewed-by: Matthew Jacobs <mj@cloudera.com>
        Tested-by: Internal Jenkins
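
The four pushdown conditions listed in the commit message can be sketched as a small eligibility check. This is an illustrative Python model only, not Impala's planner code (which is not shown on this page): the classes `SlotRef`, `Literal`, `Cast`, `Subquery`, and `InPredicate`, their fields, and the function `can_push_to_kudu` are all hypothetical names chosen for the example, and types are represented as plain strings.

```python
from dataclasses import dataclass


# All classes below are hypothetical stand-ins for planner expression nodes.
@dataclass(frozen=True)
class SlotRef:
    column: str
    type_name: str


@dataclass(frozen=True)
class Literal:
    value: object
    type_name: str


@dataclass(frozen=True)
class Cast:
    child: object
    type_name: str


@dataclass(frozen=True)
class Subquery:
    sql: str


@dataclass(frozen=True)
class InPredicate:
    lhs: object            # expression on the left of IN
    values: object         # a list of expressions, or a Subquery
    is_not_in: bool = False


def can_push_to_kudu(pred: InPredicate) -> bool:
    """Return True only for <SlotRef> IN (<Literal>, <Literal>, ...)."""
    # 2) Only 'IN' is supported, not 'NOT IN'.
    if pred.is_not_in:
        return False
    # 3) The left side must be a bare SlotRef, not wrapped in a cast.
    if not isinstance(pred.lhs, SlotRef):
        return False
    # 1) The right side must be a value list, not a subquery.
    if not isinstance(pred.values, list):
        return False
    # 1) and 4) Every value is a literal whose type matches the slot exactly.
    return all(
        isinstance(v, Literal) and v.type_name == pred.lhs.type_name
        for v in pred.values
    )


if __name__ == "__main__":
    col = SlotRef("l_orderkey", "BIGINT")
    ok = InPredicate(col, [Literal(1, "BIGINT"), Literal(2, "BIGINT")])
    negated = InPredicate(col, [Literal(1, "BIGINT")], is_not_in=True)
    print(can_push_to_kudu(ok), can_push_to_kudu(negated))
```

Under this model, a predicate that fails any check would stay on the Impala side of the scan node (the "predicates" line in the plan) rather than being listed under "kudu predicates".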


          People

          • Assignee: mjacobs Matthew Jacobs
          • Reporter: mmokhtar Mostafa Mokhtar
          • Votes: 0
          • Watchers: 2
