  1. Pig
  2. PIG-4767

Partition filter not pushed down when filter clause references variable from another load path

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      To reproduce:

      test.pig
      a = load 'a.txt';
      a_group = group a all;
      a_count = foreach a_group generate COUNT(a) as count;
      
      b = load 'mytable' using org.apache.hcatalog.pig.HCatLoader();
      b = filter b by datepartition == '2015-09-01-00' and foo == a_count.count;
      
      dump b;
      

      The query above reads every partition of the table. If you remove the foo == a_count.count clause, or replace a_count.count with a constant, the partition filter on datepartition is pushed down as expected.
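
      For comparison, here is a minimal sketch of the constant variant mentioned above. It assumes foo is a numeric column and supplies the value through Pig parameter substitution, so the filter only ever sees a literal. The file name test_constant.pig, the parameter name count, and the value 42 are hypothetical and not part of the original report.

      test_constant.pig
      -- $count is replaced textually before the script is parsed,
      -- so the filter compares foo against a constant rather than a scalar
      -- taken from another load path.
      b = load 'mytable' using org.apache.hcatalog.pig.HCatLoader();
      b = filter b by datepartition == '2015-09-01-00' and foo == $count;

      dump b;

      This could be run with something like: pig -useHCatalog -param count=42 test_constant.pig. According to the report, a filter of this shape should read only the 2015-09-01-00 partition, which makes it a convenient baseline when checking whether the scalar form in test.pig is still scanning the whole table.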

      Attachments

        1. pig-4767-v01.patch
          5 kB
          Koji Noguchi

            People

              Assignee: Koji Noguchi (knoguchi)
              Reporter: Anthony Hsu (erwaman)
              Votes: 1
              Watchers: 5