Pig
  1. Pig
  2. PIG-3395

Large filter expression makes Pig hang

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: impl
    • Labels:
      None

      Description

      Currently, partition filter push down is quite costly. For example, if you have many nested or/and expressions, Pig hangs:

      base = load '<partitioned table>' using MyStorage();
      filt = filter base by
      (dateint == 20130719 and batchid == 'merged_1' and hour IN (19,20,21,22,23))
      or
      (dateint == 20130720 and batchid == 'merged_1' and hour IN (0,1,2,3,4,5,6,7,8))
      or
      (dateint == 20130720 and batchid == 'merged_2' and hour == 7)
      or
      (dateint == 20130720 and batchid == 'merged_1' and hour IN (9,10,11,12,13,14,15,16,17,18,19,20,21,22,23))
      or
      (dateint == 20130721 and batchid == 'merged_1' and hour IN (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23))
      or
      (dateint == 20130722 and batchid == 'merged_1' and hour IN (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16));
      dump filt;
      

      Note that IN operator is converted to nested OR's by Pig parser.

      Looking at the thread dump, I found it creates almost 60 stack frames and makes JVM suffer. (I will attach full stack trace.)

      <repeated ...>
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:504)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:237)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:504)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:214)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:504)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:211)
      at org.apache.pig.newplan.PColFilterExtractor.visit(PColFilterExtractor.java:108)
      

      Although the filter expression can be simplified, it seems possible to make PColFilterExtractor more efficient.

      1. thread_dump.txt
        10 kB
        Cheolsoo Park
      2. PIG-3395.patch
        8 kB
        Cheolsoo Park
      3. PIG-3395-2.patch
        12 kB
        Cheolsoo Park

        Issue Links

          Activity

          Hide
          Cheolsoo Park added a comment -

          Thank you Rohini for the review! Committed it to trunk.

          I fixed all the white space problems in PColFilterExtractor.java (trailing white spaces and tabs), so it should be easy on the eyes now.

          Show
          Cheolsoo Park added a comment - Thank you Rohini for the review! Committed it to trunk. I fixed all the white space problems in PColFilterExtractor.java (trailing white spaces and tabs), so it should be easy on the eyes now.
          Hide
          Rohini Palaniswamy added a comment -

          +1. There are few white spaces. Can you just remove them before the commit?

          Show
          Rohini Palaniswamy added a comment - +1. There are few white spaces. Can you just remove them before the commit?
          Hide
          Cheolsoo Park added a comment -

          Added new test cases to confirm that the filter doesn't get pushed down if udf/cast/null expressions are mixed with or/and expressions.

          ReviewBoard:
          https://reviews.apache.org/r/13186/

          Show
          Cheolsoo Park added a comment - Added new test cases to confirm that the filter doesn't get pushed down if udf/cast/null expressions are mixed with or/and expressions. ReviewBoard: https://reviews.apache.org/r/13186/
          Hide
          Cheolsoo Park added a comment -

          I have been testing my patch as per Rohini's suggestion, and it works correctly. Here are what I tested:

          • (cast) or (pcond and pcond) or (pcond and pcond)
          • (pcond and pcond) or (cast) or (pcond and pcond)
          • (pcond and pcond) or (pcond and pcond) or (cast)

          In all these cases, the entire filter is rejected due to the cast expression, which is the same as before.

          Adding test cases is a bit more involving because the test helper function isn't written for such conditions. But I will add a few test cases.

          Show
          Cheolsoo Park added a comment - I have been testing my patch as per Rohini's suggestion, and it works correctly. Here are what I tested: (cast) or (pcond and pcond) or (pcond and pcond) (pcond and pcond) or (cast) or (pcond and pcond) (pcond and pcond) or (pcond and pcond) or (cast) In all these cases, the entire filter is rejected due to the cast expression, which is the same as before. Adding test cases is a bit more involving because the test helper function isn't written for such conditions. But I will add a few test cases.
          Hide
          Cheolsoo Park added a comment -

          Rohini Palaniswamy, I can certainly add more test cases to verify that. Let me do it.

          Show
          Cheolsoo Park added a comment - Rohini Palaniswamy , I can certainly add more test cases to verify that. Let me do it.
          Hide
          Rohini Palaniswamy added a comment -

          Before the whole and/or tree will not be pushed down. The visit of lhs and rhs is still there, but I am not sure how the replace will behave because it does not have full context and something partial might get pushed. Can you just modify your testcase to include one of those conditions to test the behaviour if we have cast or null check?

          Show
          Rohini Palaniswamy added a comment - Before the whole and/or tree will not be pushed down. The visit of lhs and rhs is still there, but I am not sure how the replace will behave because it does not have full context and something partial might get pushed. Can you just modify your testcase to include one of those conditions to test the behaviour if we have cast or null check?
          Hide
          Cheolsoo Park added a comment -

          Rohini Palaniswamy, thank you for the comment. Please correct me if I am wrong, but I am not removing all the recursive calls from visit() but taking out one particular check (i.e. (pcond and cond) or (pcond and cond)). So I think it still checks other cases that you mentioned, no?

          Show
          Cheolsoo Park added a comment - Rohini Palaniswamy , thank you for the comment. Please correct me if I am wrong, but I am not removing all the recursive calls from visit() but taking out one particular check (i.e. (pcond and cond) or (pcond and cond)). So I think it still checks other cases that you mentioned, no?
          Hide
          Rohini Palaniswamy added a comment -

          Additional cases that visit(ProjectExpression project) checks like userfunc, cast, null, bincond will not be done right? It might lead to script failing if they are present.

          Show
          Rohini Palaniswamy added a comment - Additional cases that visit(ProjectExpression project) checks like userfunc, cast, null, bincond will not be done right? It might lead to script failing if they are present.
          Hide
          Cheolsoo Park added a comment -

          All unit tests pass.

          Show
          Cheolsoo Park added a comment - All unit tests pass.
          Hide
          Cheolsoo Park added a comment -

          I am attaching a patch that refactors the logic of detecting the pattern of "(pcond and cond) or (pcond and cond)" that was added by PIG-3173 into a separate function.

          I also added a unit test case that demonstrates the issue. If you run it without the fix, it infinitely hangs. With the fix, it runs within seconds.

          In terms of testing,

          • ant test-commit passes.
          • ant test -Dtestcase=TestPartitionFilterPushDown passes.
          Show
          Cheolsoo Park added a comment - I am attaching a patch that refactors the logic of detecting the pattern of "(pcond and cond) or (pcond and cond)" that was added by PIG-3173 into a separate function. I also added a unit test case that demonstrates the issue. If you run it without the fix, it infinitely hangs. With the fix, it runs within seconds. In terms of testing, ant test-commit passes. ant test -Dtestcase=TestPartitionFilterPushDown passes.
          Hide
          Cheolsoo Park added a comment -

          Attaching the tread dump. I will shortly upload a patch that refactors PColFilterExtractor that makes it work.

          Show
          Cheolsoo Park added a comment - Attaching the tread dump. I will shortly upload a patch that refactors PColFilterExtractor that makes it work.

            People

            • Assignee:
              Cheolsoo Park
              Reporter:
              Cheolsoo Park
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development