Pig / PIG-369

Filter does not allow udf as the filter operator and only allows ComparisonOperators

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels: None

    Description

      The following Pig script, which uses a UDF as the filter condition, fails:

      register util.jar;
      define MyFilterSet util.FilterUdf('filter.txt');
      A = load 'simpletest' using PigStorage() as ( x, y );
      B = filter A by MyFilterSet(x);
      dump B;
      

      The following error is seen:

      java -cp pig.jar:$localc org.apache.pig.Main filter.pig 
      2008-08-07 17:59:37,663 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: localhost:9000
      2008-08-07 17:59:37,748 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
      2008-08-07 17:59:38,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
      2008-08-07 17:59:38,166 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
      java.io.IOException: Unable to open iterator for alias: B [org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc]
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.setPlan(POFilter.java:179)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:592)
              at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:102)
              at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:31)
              at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
              at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
              at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:245)
              at org.apache.pig.PigServer.compilePp(PigServer.java:590)
              at org.apache.pig.PigServer.execute(PigServer.java:516)
              at org.apache.pig.PigServer.openIterator(PigServer.java:307)
              at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
              at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
              at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
              at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
              at org.apache.pig.Main.main(Main.java:302)
      Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
              ... 15 more
      
      

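      Looking further, the issue seems to be in POFilter.setPlan, which unconditionally casts the leaf of the filter plan to ComparisonOperator. When the filter condition is a UDF, the leaf is a POUserFunc, so the cast fails with the ClassCastException shown above: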

      public void setPlan(PhysicalPlan plan) {
          this.plan = plan;
          comOp = (ComparisonOperator) (plan.getLeaves()).get(0);
          compOperandType = comOp.getOperandType();
      }
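
      To make the failure mode concrete, here is a minimal, self-contained sketch of the same cast pattern. It does not use Pig's classes and is not the content of 369.patch: FilterLeafSketch, ExpressionOperator, UserFunc, unguarded, and guarded are hypothetical stand-ins, and the operand type value is arbitrary. It only shows why an unconditional cast of the plan leaf breaks for a UDF, and one possible way to guard it with an instanceof check.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the physical operators; NOT Pig's real classes.
class FilterLeafSketch {
    interface ExpressionOperator {}

    // Stand-in for a comparison operator that knows its operand type.
    static class ComparisonOperator implements ExpressionOperator {
        byte getOperandType() { return 5; } // arbitrary value for illustration
    }

    // Stand-in for a UDF leaf (plays the role of POUserFunc in the report).
    static class UserFunc implements ExpressionOperator {}

    // Mirrors the reported pattern: the leaf is cast unconditionally.
    static byte unguarded(List<ExpressionOperator> leaves) {
        ComparisonOperator comOp = (ComparisonOperator) leaves.get(0);
        return comOp.getOperandType();
    }

    // One possible guard (an assumption, not the actual fix): only read the
    // operand type when the leaf really is a comparison operator.
    static Byte guarded(List<ExpressionOperator> leaves) {
        ExpressionOperator leaf = leaves.get(0);
        if (leaf instanceof ComparisonOperator) {
            return ((ComparisonOperator) leaf).getOperandType();
        }
        return null; // non-comparison leaf such as a UDF: no operand type
    }

    public static void main(String[] args) {
        List<ExpressionOperator> udfPlan =
                Collections.<ExpressionOperator>singletonList(new UserFunc());
        try {
            unguarded(udfPlan);
        } catch (ClassCastException e) {
            System.out.println("unguarded cast failed with ClassCastException");
        }
        System.out.println("guarded operand type: " + guarded(udfPlan));
    }
}
```

      In this sketch the guarded variate simply reports that there is no operand type for a UDF leaf; how the real operator should evaluate a boolean-returning UDF is a separate design decision that the sketch does not attempt to answer.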
      

      Attachments

        1. 369.patch
          7 kB
          Shravan Matthur Narayanamurthy

          People

            Assignee: shravanmn Shravan Matthur Narayanamurthy
            Reporter: pkamath Pradeep Kamath
