Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels: None

      Description

      To reproduce, please run:

      ant clean test -Dtestcase=TestLoadStoreFuncLifeCycle -Dhadoopversion=23
      

      This fails with the following error:

      Error during parsing. Job in state DEFINE instead of RUNNING
      org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Job in state DEFINE instead of RUNNING
          at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1607)
          at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1546)
          at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
          at org.apache.pig.PigServer.registerQuery(PigServer.java:529)
          at org.apache.pig.TestLoadStoreFuncLifeCycle.testLoadStoreFunc(TestLoadStoreFuncLifeCycle.java:332)
      Caused by: Failed to parse: Job in state DEFINE instead of RUNNING
          at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:193)
          at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1599)
      Caused by: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
          at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:292)
          at org.apache.hadoop.mapreduce.Job.toString(Job.java:456)
          at java.lang.String.valueOf(String.java:2826)
          at org.apache.pig.TestLoadStoreFuncLifeCycle.logCaller(TestLoadStoreFuncLifeCycle.java:270)
          at org.apache.pig.TestLoadStoreFuncLifeCycle.access$000(TestLoadStoreFuncLifeCycle.java:41)
          at org.apache.pig.TestLoadStoreFuncLifeCycle$InstrumentedStorage.logCaller(TestLoadStoreFuncLifeCycle.java:54)
          at org.apache.pig.TestLoadStoreFuncLifeCycle$InstrumentedStorage.getSchema(TestLoadStoreFuncLifeCycle.java:115)
          at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:174)
          at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:88)
          at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:839)
          at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3236)
          at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1315)
          at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
          at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
          at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
          at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
      
      1. PIG-2978.patch
        2 kB
        Cheolsoo Park
      2. PIG-2978-2.patch
        2 kB
        Cheolsoo Park

        Activity

        Cheolsoo Park added a comment -

        The error is thrown because the Hadoop 2.0.x Job class overrides toString() and checks that the job is in the RUNNING state:

          @Override
          public String toString() {
            ensureState(JobState.RUNNING);
            ...
        

        In hadoop-1.0.x, String.valueOf(<Job object>) prints the default object representation, something like this:

        org.apache.hadoop.mapreduce.Job@7ee49dcd
        

        So I replaced String.valueOf(<Job object>) with Job.class.getName().
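The failure mode can be sketched in isolation. The following is an illustrative stand-in for Hadoop's Job class, not the actual Pig or Hadoop code: String.valueOf delegates to toString(), so logging the object itself throws, while logging only the class name never touches job state.

```java
// Illustrative sketch: a stand-in Job whose toString() enforces a
// RUNNING state, as the Hadoop 2.0.x Job class does.
public class JobToStringSketch {
    static class Job {
        private boolean running = false; // new jobs start in DEFINE state

        @Override
        public String toString() {
            if (!running) {
                throw new IllegalStateException(
                        "Job in state DEFINE instead of RUNNING");
            }
            return "Job[RUNNING]";
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        try {
            String.valueOf(job); // delegates to job.toString() and throws
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        // Logging the class name is safe in any job state:
        System.out.println(Job.class.getName());
    }
}
```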

        With this change, the test now runs with hadoop-2.0.x but fails because the number of StoreFunc instances is 5, whereas it is 4 with hadoop-1.0.x. So I replaced the assertion as follows:

        Storer.count <= (Util.isHadoop2_0() ? 5 : 4)
        

        It would be good if we could identify why it is different between hadoop-1.0.x and hadoop-2.0.x.
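The relaxed check can be sketched as follows. The counter and version flag are stand-ins for the test's Storer.count and Util.isHadoop2_0(); this is a hedged reconstruction, not the committed patch:

```java
// Hedged sketch of the relaxed bound: hadoop-2.0.x is allowed one
// extra StoreFunc instantiation compared to hadoop-1.0.x.
public class StoreFuncCountSketch {
    static int maxStoreFuncInstances(boolean isHadoop20) {
        return isHadoop20 ? 5 : 4;
    }

    static void checkCount(int storerCount, boolean isHadoop20) {
        if (storerCount > maxStoreFuncInstances(isHadoop20)) {
            throw new AssertionError(
                    "too many StoreFunc instances: " + storerCount);
        }
    }

    public static void main(String[] args) {
        checkCount(5, true);  // passes on hadoop-2.0.x
        checkCount(4, false); // passes on hadoop-1.0.x
        System.out.println("assertions hold");
    }
}
```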

        Cheolsoo Park added a comment -

        Here is the difference between hadoop-1.0.x and 2.0.x:

        hadoop-1.0.x
        Storer[3].<init>()
        Storer[3].setStoreFuncUDFContextSignature(A_1-1)
        Storer[3].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        Storer[3].getOutputFormat()
        Storer[3].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        
        hadoop-2.0.x
        Storer[3].<init>()
        Storer[3].setStoreFuncUDFContextSignature(A_1-1)
        Storer[3].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        Storer[3].getOutputFormat()
        Storer[3].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        Storer[4].<init>()
        Storer[4].setStoreFuncUDFContextSignature(A_1-1)
        Storer[4].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        Storer[4].getOutputFormat()
        Storer[4].setStoreLocation(bar, org.apache.hadoop.mapreduce.Job)
        

        For whatever reason, getStoreFunc is repeated with hadoop-2.0.x. The call stack of the extra 4th instantiation is below:

        Storer[4].<init> called by 
        org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:577)
        org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getStoreFunc(POStore.java:232)
        org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:85)
        org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:67)
        org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:279)
        
        Julien Le Dem added a comment -

        Thanks Cheolsoo for investigating this.
        Yes, the goal of this test is to prevent this kind of extra instance from creeping in unnoticed.
        It would be great to know why Pig gets a new POStore at that point.

        Cheolsoo Park added a comment -

        Thanks Julien for your comment. I will take a further look at this.

        Rohini Palaniswamy added a comment -

        +1. https://reviews.apache.org/r/8121/

        Checked that the additional call is fine. LocalJobRunner gets the OutputCommitter to call setupJob (MAPREDUCE-3563), which was not done in hadoop-1.0's LocalJobRunner. In a real mapreduce environment, the StoreFunc would be instantiated even more times than this, since commitJob happens in a separate task rather than in LocalJobRunner. We should change the test to run in MiniCluster mode and count the instantiations in that case by creating side files, to catch mapreduce framework changes. Still, this test is useful in local mode to catch any unnecessary new instantiations in the Pig code itself.

        Cheolsoo Park added a comment -

        Incorporated Rohini's comments in the RB.

        • Changed Job.class.getName() to getJobName()
        • Added comments regarding the difference between hadoop 1.0.x and 2.0.x in terms of the number of StoreFunc instances.
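The logging change can be sketched like this. The Job stand-in and the describe helper are hypothetical; the point is that reading a configured job name does not require the job to be in a RUNNING state, unlike toString() on hadoop-2.0.x:

```java
// Hedged sketch of the review fix: log a job's configured name
// instead of stringifying the Job object (whose toString() would
// enforce a RUNNING state on hadoop-2.0.x).
public class JobNameLoggingSketch {
    static class Job {
        private final String jobName;

        Job(String jobName) { this.jobName = jobName; }

        // Reads configuration only; safe to call in any job state.
        String getJobName() { return jobName; }

        @Override
        public String toString() {
            throw new IllegalStateException(
                    "Job in state DEFINE instead of RUNNING");
        }
    }

    // Hypothetical stand-in for the test's logging helper.
    static String describe(Job job) {
        return "setStoreLocation(bar, " + job.getJobName() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Job("PigLatin:test")));
    }
}
```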
        Cheolsoo Park added a comment -

        Committed to 0.11/trunk.

        Thanks Rohini for clarifying the difference in Hadoop 2.0.x.


          People

          • Assignee:
            Cheolsoo Park
            Reporter:
            Cheolsoo Park
          • Votes:
            0
            Watchers:
            3
