  1. Pig
  2. PIG-1568

Optimization rule FilterAboveForeach is too restrictive and doesn't handle project * correctly

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The FilterAboveForeach rule optimizes the plan by pushing a filter above the preceding foreach operator. However, code review found two major problems:

      1. The current implementation assumes that if no projection is found in the filter condition, then all columns from the foreach are projected. This assumption prevents the following script from being optimized, even though its filter condition references no columns at all:
      A = LOAD 'file.txt' AS (a(u,v), b, c);
      B = FOREACH A GENERATE $0, b;
      C = FILTER B BY 8 > 5;
      STORE C INTO 'empty';

      2. The current implementation doesn't handle the * projection, which means "project all columns". As a result, it is unable to optimize the following:
      A = LOAD 'file.txt' AS (a(u,v), b, c);
      B = FOREACH A GENERATE $0, b;
      C = FILTER B BY Identity.class.getName > 5;
      STORE C INTO 'empty';
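
      The two cases above can be modelled with a small pushability check. This is only an illustrative sketch, not the code from the attached patch: the class name, the STAR marker, and the "pass-through" flag per foreach output column are all hypothetical. It assumes a filter may move above a foreach when every foreach output column the condition references is a plain copy of an input column, that a condition referencing no columns (such as 8 > 5) is trivially movable, and that * stands for every output column of the foreach.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the pushability check; not Pig's actual optimizer code. */
final class FilterPushUpSketch {

    /** Hypothetical marker meaning "project *" (all foreach output columns). */
    static final int STAR = -1;

    /**
     * @param filterColumns foreach output columns referenced by the filter
     *                      condition; may contain STAR, may be empty
     * @param passThrough   passThrough[i] is true when foreach output column i
     *                      is a plain copy of an input column
     * @return true if, under the assumptions stated above, the filter can be
     *         evaluated before the foreach
     */
    static boolean canPushAboveForeach(Set<Integer> filterColumns, boolean[] passThrough) {
        // Expand STAR into the explicit list of foreach output columns.
        Set<Integer> needed = new TreeSet<>();
        for (int col : filterColumns) {
            if (col == STAR) {
                for (int i = 0; i < passThrough.length; i++) {
                    needed.add(i);
                }
            } else {
                needed.add(col);
            }
        }
        // An empty set means a constant condition: the loop body never runs
        // and the filter is reported as movable.
        for (int col : needed) {
            if (col < 0 || col >= passThrough.length || !passThrough[col]) {
                return false;
            }
        }
        return true;
    }
}
```

      For FOREACH A GENERATE $0, b both output columns are plain copies, so passThrough would be {true, true}. In this model an empty filterColumns set (case 1) and a set containing only STAR (case 2) both yield true, while a STAR filter over a foreach with a computed column yields false. In the * case the moved condition would also have to be rewritten against the input columns, which this sketch does not attempt.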

      Attachments

        1. jira-1568-1.patch
          57 kB
          Xuefu Zhang
        2. jira-1568-1.patch
          58 kB
          Xuefu Zhang

          People

            Assignee: Xuefu Zhang (xuefuz)
            Reporter: Xuefu Zhang (xuefuz)