IMPALA-2805

Order filters based on selectivity and cost


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Impala 2.2, Impala 2.3.0
    • None
    • Frontend

    Description

      The planner should use statistics and simple heuristics to order filters based on their selectivity and the cost of applying each filter per tuple. Highly selective filters should be applied before less selective ones.

      A possible ordering, from first applied to last:

      • C1 = 10
      • C1 in (1,2,3)
      • C1 < 10
      • C1 between 10 and 20
      • C1 like '%A'
      • C1 like '%A%'
      • Other....
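
The list above can be read as a fixed ranking by predicate shape. The following is a minimal sketch of that idea in Python, not Impala's planner code: the names `PREDICATE_RANKS`, `heuristic_rank`, and `order_filters` are hypothetical, and matching predicate text with regular expressions is a simplification chosen only to keep the example self-contained (a real planner would inspect expression trees and column statistics).

```python
import re

# Hypothetical ranks mirroring the list in the description.
# Lower rank means the filter is applied earlier.
PREDICATE_RANKS = [
    (re.compile(r"^\w+\s*=\s*\S+$"), 0),                          # C1 = 10
    (re.compile(r"^\w+\s+in\s*\(.*\)$", re.I), 1),                # C1 in (1,2,3)
    (re.compile(r"^\w+\s*(<=|>=|<|>)\s*\S+$"), 2),                # C1 < 10
    (re.compile(r"^\w+\s+between\s+\S+\s+and\s+\S+$", re.I), 3),  # C1 between 10 and 20
    (re.compile(r"^\w+\s+like\s+'%[^%]+'$", re.I), 4),            # C1 like '%A'
    (re.compile(r"^\w+\s+like\s+'%[^%]+%'$", re.I), 5),           # C1 like '%A%'
]
OTHER_RANK = 6  # anything that matches none of the shapes above


def heuristic_rank(predicate: str) -> int:
    """Return the rank of the first matching predicate shape."""
    text = predicate.strip()
    for pattern, rank in PREDICATE_RANKS:
        if pattern.match(text):
            return rank
    return OTHER_RANK


def order_filters(predicates):
    """Sort filters so that lower-ranked (cheaper, more selective) ones run first.

    sorted() is stable, so filters with the same rank keep their original order.
    """
    return sorted(predicates, key=heuristic_rank)


if __name__ == "__main__":
    filters = ["l_comment like '%a%'", "l_orderkey = 1024"]
    for f in order_filters(filters):
        print(f)
```

Applied to the two filters from the query in this issue, the sketch places the equality on `l_orderkey` ahead of the `LIKE` on `l_comment`.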

      For the query below, the correct ordering (the equality on l_orderkey evaluated before the LIKE on l_comment) gives a 30% speedup.

      Query

      select
          count(*)
      from
          lineitem
      where
          l_comment like '%a%'
          and l_orderkey = 1024;
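
To see why filter order matters for a conjunction like this one, here is a small sketch of the expected per-tuple evaluation cost under short-circuit AND. The costs and selectivities are illustrative assumptions, not measurements from this issue, and the sketch does not attempt to reproduce the 30% figure. The `rank` function uses the classic rule from the query optimization literature for independent predicates (sort ascending by (selectivity - 1) / cost); whether Impala uses this exact formula is not stated here.

```python
def expected_cost_per_tuple(filters):
    """Expected cost of evaluating filters in order with short-circuit AND.

    filters: list of (cost, selectivity) pairs, where selectivity is the
    fraction of tuples that pass the filter. Filters are assumed independent.
    """
    total = 0.0
    pass_fraction = 1.0  # fraction of tuples that reach the current filter
    for cost, selectivity in filters:
        total += pass_fraction * cost
        pass_fraction *= selectivity
    return total


def rank(f):
    """Classic ordering key for independent predicates: lower rank runs first."""
    cost, selectivity = f
    return (selectivity - 1.0) / cost


# Assumed numbers for illustration only.
like_filter = (10.0, 0.5)   # l_comment like '%a%': expensive, passes many rows
eq_filter = (1.0, 0.001)    # l_orderkey = 1024: cheap, passes very few rows

as_written = expected_cost_per_tuple([like_filter, eq_filter])
reordered = expected_cost_per_tuple(sorted([like_filter, eq_filter], key=rank))

if __name__ == "__main__":
    print("as written:", as_written)
    print("reordered: ", reordered)
```

With these assumed inputs, running the cheap, highly selective equality first means the expensive `LIKE` is evaluated on only a small fraction of tuples.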
      

      A follow-up experiment is needed to calculate the cost of filters for different data types.


            People

              twmarshall Thomas Tauber-Marshall
              mmokhtar Mostafa Mokhtar