Pig / PIG-3813

Rank column is assigned a different uid every time the schema is reset

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.12.1, 0.13.0
    • Component/s: impl
    • Labels:
      None

      Description

      When the following script is run, Pig goes into an infinite loop. This was reproduced on Pig trunk as of March 12, 2014, on Apache Hadoop 1.2. The input file, test_data.txt, is attached.

      test.pig
      tWeek = LOAD '/tmp/test_data.txt' USING PigStorage ('|') AS (WEEK:int, DESCRIPTION:chararray, END_DATE:chararray, PERIOD:int);

      gTWeek = FOREACH tWeek GENERATE WEEK AS WEEK, PERIOD AS PERIOD;

      pWeek = FILTER gTWeek BY PERIOD == 201312;

      pWeekRanked = RANK pWeek BY WEEK ASC DENSE;

      gpWeekRanked = FOREACH pWeekRanked GENERATE $0;
      store gpWeekRanked into 'gpWeekRanked';
      describe gpWeekRanked;
      ---------------------------------------------------

      The res object, of class Result, gets its value from leaf.getNextTuple().
      That call returns an empty tuple
      ()
      with STATUS_OK.

      So the while(true) loop never receives an End of Processing (EOP) status and never exits.
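
      The loop behavior described above can be illustrated with a minimal Java sketch. This is not Pig's actual code: Status, Result, Leaf, DrainLoopSketch, and the maxPulls guard are hypothetical stand-ins, introduced only to show why a leaf that keeps answering OK with an empty tuple never lets a while(true) pull loop reach its EOP exit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Pig's classes.
enum Status { OK, EOP }

final class Result {
    final Status status;
    final List<Object> tuple;

    Result(Status status, List<Object> tuple) {
        this.status = status;
        this.tuple = tuple;
    }
}

interface Leaf {
    Result getNextTuple();
}

final class DrainLoopSketch {
    // Pulls from the leaf until it reports EOP. The maxPulls guard is an
    // assumption added so this sketch terminates; the loop described in the
    // report has no such guard. Returns the number of pulls made when EOP is
    // seen, or -1 if EOP never arrives within maxPulls pulls.
    static int drain(Leaf leaf, int maxPulls) {
        int pulls = 0;
        while (true) {
            if (pulls == maxPulls) {
                return -1; // would spin forever without the guard
            }
            Result res = leaf.getNextTuple();
            pulls++;
            if (res.status == Status.EOP) {
                return pulls; // the only normal way out of the loop
            }
            // Status.OK: a real consumer would emit res.tuple and keep pulling.
        }
    }

    // A well-behaved leaf: hands out each row with OK, then reports EOP.
    static Leaf finiteLeaf(List<List<Object>> rows) {
        Iterator<List<Object>> it = rows.iterator();
        return () -> it.hasNext()
                ? new Result(Status.OK, it.next())
                : new Result(Status.EOP, null);
    }

    // A leaf mimicking the reported symptom: always an empty tuple with OK.
    static Leaf stuckLeaf() {
        return () -> new Result(Status.OK, new ArrayList<>());
    }

    public static void main(String[] args) {
        List<List<Object>> rows =
                Arrays.asList(Arrays.<Object>asList(1), Arrays.<Object>asList(2));
        System.out.println(drain(finiteLeaf(rows), 1000)); // prints 3
        System.out.println(drain(stuckLeaf(), 1000));      // prints -1
    }
}
```

      With two rows, the well-behaved leaf is drained in three pulls (two OK results followed by EOP). The stuck leaf never produces EOP, so only the artificial guard stops the loop.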

        Attachments

        1. test_data.txt
          11 kB
          Suhas Satish
        2. PIG-3813-2.patch
          2 kB
          Cheolsoo Park
        3. PIG-3813-1.patch
          1 kB
          Cheolsoo Park

            People

            • Assignee:
              cheolsoo Cheolsoo Park
              Reporter:
              ssatish Suhas Satish
            • Votes:
              0
              Watchers:
              7