Uploaded image for project: 'IMPALA'
  2. IMPALA-5003

Add 'constant propagation' for Views with a partition filter



    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • Impala 2.9.0
    • Impala 2.9.0
    • Frontend


      There are scenarios when you (1) want to reduce the number of partitions by hash&mod or part of a date, etc and (2) do not want to expose this complexity to end users who do not know how the table might be partitioned.

      As an example: Say I have a column that has 1000 values and I want to partition on customer_id, but the data is too small for 1000 partitions so I want to mod the customer_id to put it into 100 'buckets', to give me a 100x faster scan when selecting a single customer_id.

      When using 3rd party tools or exposing this table to end users who may not be educated on the partitioning scheme, those tools will not properly filter based on simply selecting a specific customer_id.

      If this is my partitioned table:
      CREATE TABLE default.lesspartitions ( customer_id INT, customer_name STRING, some_data STRING ) PARTITIONED BY ( partition_id INT )
      select customer_id, customer_name, some_data, customer_id % 100 as partition_id from source

      It would be nice to create a view where the partition is hidden from users and filtered on automatically:
      CREATE VIEW default.hiddenpartitions AS SELECT customer_id, customer_name, some_data FROM default.lesspartitions WHERE partition_id = customer_id % 100

      So then a user/3rd party tool can write the following:
      select * from hiddenpartitions where customer_id = 4

      This jira would adding the 'constant propagation' of the filter into the view so that only 1 partition is read.


        Issue Links



              zamsden Zach Amsden
              PeterEbert Peter Ebert
              0 Vote for this issue
              4 Start watching this issue