Pig / PIG-3985

Multiquery execution of RANK with RANK BY causes NPE JobCreationException "ERROR 2017: Internal error creating job configuration"



    • Reviewed
    • Multiquery execution of RANK with RANK BY causes NPE


      A script that uses both RANK and RANK BY crashes with a NullPointerException in JobControlCompiler.java when multiquery execution is enabled.

      The following script works with any combination of RANK BY operations, or with exactly one plain RANK operation and nothing else (i.e. no other RANK or RANK BY operation). A plain (non-BY) RANK succeeds alone but perishes alongside any other RANK or RANK BY.

      Disabling multiquery execution makes everything work again.
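
      The combinations described above can be summarized as a small truth table. The Java sketch below is only a model of the behavior reported here, not Pig code: the class name RankComboModel and the method crashes are hypothetical, and it assumes that only the RANK aliases that are actually STOREd count toward a combination.

```java
// Hypothetical model of the behavior reported in this issue; not part of Pig.
final class RankComboModel {

    /**
     * Returns true when, according to the report, the script hits the NPE:
     * multiquery is enabled, at least one plain (non-BY) RANK is present,
     * and there is at least one other RANK or RANK BY next to it.
     */
    static boolean crashes(boolean multiquery, int plainRanks, int rankBys) {
        if (!multiquery) {
            return false; // disabling multiquery makes everything work
        }
        return plainRanks >= 1 && plainRanks + rankBys >= 2;
    }

    public static void main(String[] args) {
        // One plain RANK plus one RANK BY, multiquery on
        System.out.println(crashes(true, 1, 1));  // true
        // A single plain RANK on its own
        System.out.println(crashes(true, 1, 0));  // false
        // Any number of RANK BY operations without a plain RANK
        System.out.println(crashes(true, 0, 3));  // false
        // Two plain RANKs together
        System.out.println(crashes(true, 2, 0));  // true
        // Same mix as the first case, multiquery off
        System.out.println(crashes(false, 1, 1)); // false
    }
}
```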

      I am using Hadoop 2.4.0 with Pig trunk (d24d06a48, after PIG-3739). The error occurs in both local and mapreduce mode.

      -- disable multiquery and you can rank all day long
      -- SET opt.multiquery false
      citypops = LOAD 'us_city_pops.tsv' AS (city:chararray, state:chararray, pop_2011:int);
      citypops_o = ORDER citypops BY city;
      -- if you have one non-by RANK you may not have any other RANKs
      citypops_nosort_inplace    = RANK citypops;
      citypops_presorted_inplace = RANK citypops_o;
      citypops_ties_cause_skips  = RANK citypops   BY city;
      citypops_ties_no_skips     = RANK citypops   BY city  DENSE;
      citypops_presorted_ranked  = RANK citypops_o BY city;
      STORE citypops_nosort_inplace    INTO '/tmp/citypops_nosort_inplace'    USING PigStorage('\t', '--overwrite true');
      -- STORE citypops_presorted_inplace INTO '/tmp/citypops_presorted_inplace' USING PigStorage('\t', '--overwrite true');
      STORE citypops_ties_cause_skips  INTO '/tmp/citypops_ties_cause_skips'  USING PigStorage('\t', '--overwrite true');
      -- STORE citypops_ties_no_skips     INTO '/tmp/citypops_ties_no_skips'     USING PigStorage('\t', '--overwrite true');
      -- STORE citypops_presorted_ranked  INTO '/tmp/citypops_presorted_ranked'  USING PigStorage('\t', '--overwrite true');

      Pig Stack Trace
      ERROR 2017: Internal error creating job configuration.
      org.apache.pig.backend.hadoop.executionengine.JobCreationException: ERROR 2017: Internal error creating job configuration.
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:946)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:322)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:200)
           --- SNIP ----
      Caused by: java.lang.NullPointerException
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:886)
              ... 19 more

      The proximate offense seems to be that globalCounters.get(operationID) returns null:

                  if(mro.isRankOperation()) {
                      Iterator<String> operationIDs = mro.getRankOperationId().iterator();
                      while(operationIDs.hasNext()) {
                          String operationID = operationIDs.next();
                          // NPE: globalCounters.get(operationID) returns null here
                          Iterator<Pair<String, Long>> itPairs = globalCounters.get(operationID).iterator();
                          Pair<String,Long> pair = null;
                          while(itPairs.hasNext()) {
                              pair = itPairs.next();
                              conf.setLong(pair.first, pair.second);
                          }
                      }
                  }

      PORank.java (line 184) seems to need this counter value, so this block cannot simply be skipped when the lookup comes back empty.
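
      To make the failure mode concrete, here is a minimal, self-contained Java sketch of the same lookup with an explicit null check. It is not the fix from the attached patches and it does not use Pig's classes: RankCounterGuard, copyCounters, and the nested Pair type are hypothetical stand-ins, and a plain Map takes the place of the job configuration. All it shows is that a missing entry for an operation ID can be reported with a descriptive error instead of a bare NullPointerException.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the counter-copying loop; not Pig code.
final class RankCounterGuard {

    // Minimal stand-in for a (counter name, counter value) pair.
    static final class Pair {
        final String first;
        final long second;

        Pair(String first, long second) {
            this.first = first;
            this.second = second;
        }
    }

    /**
     * Copies the counters of every rank operation into a map that plays the
     * role of the job configuration. Fails with a descriptive message when
     * no counters were recorded for an operation ID.
     */
    static Map<String, Long> copyCounters(List<String> operationIDs,
                                          Map<String, List<Pair>> globalCounters) {
        Map<String, Long> conf = new LinkedHashMap<>();
        for (String operationID : operationIDs) {
            List<Pair> pairs = globalCounters.get(operationID);
            if (pairs == null) {
                // This is the situation that surfaces as the NPE in the report.
                throw new IllegalStateException(
                        "No global counters recorded for rank operation " + operationID);
            }
            for (Pair pair : pairs) {
                conf.put(pair.first, pair.second);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, List<Pair>> globalCounters = new HashMap<>();
        globalCounters.put("op1", Arrays.asList(new Pair("counter_a", 10L)));

        // Known operation ID: its counters are copied.
        System.out.println(copyCounters(Arrays.asList("op1"), globalCounters));

        // Unknown operation ID: explicit error instead of a NullPointerException.
        try {
            copyCounters(Arrays.asList("op1", "op2"), globalCounters);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      A guard like this only improves the error message. Since the rank operator needs the counter value, the underlying question of why no counters exist for the operation ID under multiquery still has to be answered.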


        Attachments

        1. many_ranks_much_sadness.pig (1 kB, Flip Kromer)
        2. us_city_pops.tsv (2 kB, Flip Kromer)
        3. pig-3985-v01.txt (2 kB, Koji Noguchi)
        4. PIG-3985-2.patch (9 kB, Rohini Palaniswamy)

        Assignee: Rohini Palaniswamy (rohini)
        Reporter: Flip Kromer (mrflip)