Pig / PIG-3813

Rank column is assigned different uids every time the schema is reset

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.12.1, 0.13.0
    • Component/s: impl
    • Labels:
      None

      Description

      When the following script is run, Pig goes into an infinite loop. This was reproduced on Pig trunk as of March 12, 2014 on Apache Hadoop 1.2. test_data.txt has been attached.

      test.pig
      tWeek = LOAD '/tmp/test_data.txt' USING PigStorage ('|') AS (WEEK:int, DESCRIPTION:chararray, END_DATE:chararray, PERIOD:int);

      gTWeek = FOREACH tWeek GENERATE WEEK AS WEEK, PERIOD AS PERIOD;

      pWeek = FILTER gTWeek BY PERIOD == 201312;

      pWeekRanked = RANK pWeek BY WEEK ASC DENSE;

      gpWeekRanked = FOREACH pWeekRanked GENERATE $0;
      store gpWeekRanked into 'gpWeekRanked';
      describe gpWeekRanked;
      ---------------------------------------------------

      The res object of class Result gets its value from leaf.getNextTuple().
      This call returns an empty tuple
      ()
      with STATUS_OK.

      So the while(true) loop never receives an End of Processing (EOP) status and does not exit.
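
The failure mode described above can be modelled in a few lines. This is a minimal Python sketch, not Pig's actual Java code: the names Status, Result, run_pipeline, well_behaved_leaf and stuck_leaf are hypothetical, and the max_iterations guard exists only so that the demonstration terminates. It assumes, as the report states, that the loop exits only when the leaf operator reports EOP.

```python
from enum import Enum
from typing import Callable, List, NamedTuple, Tuple


class Status(Enum):
    OK = "STATUS_OK"    # a tuple was produced
    EOP = "STATUS_EOP"  # End of Processing: no more input


class Result(NamedTuple):
    status: Status
    value: Tuple = ()


def run_pipeline(get_next_tuple: Callable[[], Result],
                 max_iterations: int = 1000) -> Tuple[List[Tuple], bool]:
    """Drain a leaf operator; return (collected tuples, whether EOP was seen).

    max_iterations is a guard added for this sketch only. The loop described
    in the report is an unbounded while(true).
    """
    collected = []
    for _ in range(max_iterations):
        res = get_next_tuple()
        if res.status is Status.EOP:
            return collected, True   # the only normal exit
        collected.append(res.value)  # STATUS_OK: collect and keep going
    return collected, False          # guard tripped: EOP never arrived


def well_behaved_leaf(rows):
    """Hypothetical leaf that yields each row, then reports EOP."""
    it = iter(rows)

    def get_next_tuple() -> Result:
        try:
            return Result(Status.OK, next(it))
        except StopIteration:
            return Result(Status.EOP)

    return get_next_tuple


def stuck_leaf() -> Result:
    """Hypothetical leaf that mimics the bug: always () with STATUS_OK."""
    return Result(Status.OK, ())


print(run_pipeline(well_behaved_leaf([(1,), (2,)])))  # ([(1,), (2,)], True)
print(run_pipeline(stuck_leaf, max_iterations=3))     # ([(), (), ()], False)
```

Without the guard, the second call would never return, which matches the hang seen with test.pig.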

      Attachments

      1. PIG-3813-1.patch (1 kB, Cheolsoo Park)
      2. PIG-3813-2.patch (2 kB, Cheolsoo Park)
      3. test_data.txt (11 kB, Suhas Satish)

        Activity

        Suhas Satish created issue -
        Suhas Satish made changes -
        Field Original Value New Value
        Description: earlier revision, containing only the reproduction note and the test.pig script shown under Description above.
        Suhas Satish made changes -
        Attachment test_data.txt [ 12634638 ]
        Suhas Satish made changes -
        Description: appended the sentence "test_data.txt has been attached." to the reproduction note; the test.pig script is unchanged.
        Suhas Satish made changes -
        Description: appended the analysis that leaf.getNextTuple() returns an empty tuple () with STATUS_OK, so the while(true) loop never receives End of Processing (EOP); the test.pig script is unchanged.
        Suhas Satish made changes -
        Summary: changed from "runPipeline() method returns empty tuples and goes into infinite loop under certain conditions" to "PigGenericMapBase runPipeline() method returns empty tuples and goes into infinite loop under certain conditions"
        Cheolsoo Park made changes -
        Assignee Cheolsoo Park [ cheolsoo ]
        Cheolsoo Park made changes -
        Attachment PIG-3813-1.patch [ 12635181 ]
        Cheolsoo Park made changes -
        Attachment PIG-3813-2.patch [ 12635395 ]
        Cheolsoo Park made changes -
        Summary: changed from "PigGenericMapBase runPipeline() method returns empty tuples and goes into infinite loop under certain conditions" to "Rank column is assigned different uids every time the schema is reset"
        Cheolsoo Park made changes -
        Status: changed from Open [ 1 ] to Patch Available [ 10002 ]
        Cheolsoo Park made changes -
        Status: changed from Patch Available [ 10002 ] to Resolved [ 5 ]
        Fix Version/s: set to 0.13.0 [ 12324971 ]
        Resolution: set to Fixed [ 1 ]
        Cheolsoo Park made changes -
        Fix Version/s 0.12.1 [ 12324970 ]
        Prashant Kommireddi made changes -
        Status: changed from Resolved [ 5 ] to Closed [ 6 ]

          People

          • Assignee:
            Cheolsoo Park
            Reporter:
            Suhas Satish
          • Votes:
            0
            Watchers:
            7
