Pig / PIG-3813

Rank column is assigned a different uid every time the schema is reset

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.12.1, 0.13.0
    • Component/s: impl
    • Labels: None

    Description

      Running the following script sends Pig into an infinite loop. This was reproduced on Pig trunk as of March 12, 2014, on Apache Hadoop 1.2. The input file, test_data.txt, is attached.

      test.pig
      tWeek = LOAD '/tmp/test_data.txt' USING PigStorage ('|') AS (WEEK:int, DESCRIPTION:chararray, END_DATE:chararray, PERIOD:int);

      gTWeek = FOREACH tWeek GENERATE WEEK AS WEEK, PERIOD AS PERIOD;

      pWeek = FILTER gTWeek BY PERIOD == 201312;

      pWeekRanked = RANK pWeek BY WEEK ASC DENSE;

      gpWeekRanked = FOREACH pWeekRanked GENERATE $0;
      STORE gpWeekRanked INTO 'gpWeekRanked';
      DESCRIBE gpWeekRanked;
      ---------------------------------------------------

      The res object, of class Result, gets its value from leaf.getNextTuple(). That call returns an empty tuple, (), with STATUS_OK.

      As a result, the while(true) loop never receives an End of Processing (EOP) status and so never exits.
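
      The failure mode described above can be illustrated with a small, self-contained sketch. This is not Pig's actual code: the Status enum, the Result and Leaf types, and the drain() helper are hypothetical stand-ins, and the maxPulls cap exists only so that the sketch terminates. It shows that a loop which exits only on EOP runs forever when the leaf keeps returning an empty tuple with an OK status.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DrainLoopSketch {
    // Hypothetical status codes, loosely modeled on those named in the report.
    enum Status { OK, EOP }

    // Hypothetical stand-in for the Result class mentioned in the report.
    static final class Result {
        final Status status;
        final List<Object> tuple;

        Result(Status status, List<Object> tuple) {
            this.status = status;
            this.tuple = tuple;
        }
    }

    // Hypothetical stand-in for the leaf operator of the plan.
    interface Leaf {
        Result getNextTuple();
    }

    // A well-behaved leaf: emits n tuples with OK, then signals EOP.
    static Leaf finite(final int n) {
        return new Leaf() {
            private int remaining = n;

            @Override
            public Result getNextTuple() {
                if (remaining-- > 0) {
                    return new Result(Status.OK, Arrays.<Object>asList(42));
                }
                return new Result(Status.EOP, new ArrayList<Object>());
            }
        };
    }

    // A leaf showing the reported symptom: always an empty tuple with OK.
    static Leaf stuck() {
        return () -> new Result(Status.OK, new ArrayList<Object>());
    }

    // Pulls from the leaf until EOP and returns the number of pulls.
    // maxPulls is a safety cap added for this sketch only; -1 means the cap
    // was reached without ever seeing EOP.
    static int drain(Leaf leaf, int maxPulls) {
        int pulls = 0;
        while (true) {
            if (pulls == maxPulls) {
                return -1;
            }
            Result res = leaf.getNextTuple();
            pulls++;
            if (res.status == Status.EOP) {
                return pulls;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(finite(2), 100));  // prints 3
        System.out.println(drain(stuck(), 1000));   // prints -1
    }
}
```

      Without the cap, the second call would never return, which matches the behavior reported here.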

      Attachments

        1. PIG-3813-2.patch (2 kB, Cheolsoo Park)
        2. PIG-3813-1.patch (1 kB, Cheolsoo Park)
        3. test_data.txt (11 kB, Suhas Satish)

          People

            Assignee: Cheolsoo Park (cheolsoo)
            Reporter: Suhas Satish (ssatish)
            Votes: 0
            Watchers: 7
