Pig / PIG-3378

Pig streaming with multiquery is buggy in local mode.

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.11.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Today I noticed strange behavior in Pig's local mode when streaming is combined with multiquery.
      Below is a minimal script that reproduces the problem.

      Suppose an input file with multiple lines, for example:

      myInput
        1
        2
        3
        1
        2
        3

      The Pig script is:

      bug.pig
        MYINPUT = LOAD 'myinput';

        A = GROUP MYINPUT BY $0;
        B = FOREACH A GENERATE FLATTEN(MYINPUT);
        C = STREAM B THROUGH `cat`;

        D = GROUP MYINPUT BY $0;
        E = FOREACH D GENERATE FLATTEN(MYINPUT);
        F = STREAM E THROUGH `cat`;

        STORE C into 'output1';
        STORE F into 'output2';

      I run the script using the following command:
      pig -x local bug.pig

      output1 and output2 should each contain every line of my input file, but
      this is not the case. Each contains only one line (the first line of the file):
      cat output1/part*
      cat output2/part*
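
      The check above can be scripted. The following is a minimal sketch and not part of the original report: the `count_lines` helper is a hypothetical name, and the script assumes that bug.pig is in the current directory and that Pig writes its results as part* files, as the cat commands above suggest. Pig is only invoked if it is found on the PATH.

```shell
#!/bin/sh
# Recreate the six-line input file from the description.
printf '%s\n' 1 2 3 1 2 3 > myinput

# Hypothetical helper: count the lines across all part files of an output
# directory. Prints 0 if the directory or its part files do not exist.
count_lines() {
    cat "$1"/part* 2>/dev/null | wc -l | tr -d ' '
}

# Run the reproduction only when pig is available.
if command -v pig >/dev/null 2>&1; then
    rm -rf output1 output2
    pig -x local bug.pig
    expected=$(wc -l < myinput | tr -d ' ')
    for dir in output1 output2; do
        echo "$dir: $(count_lines "$dir") lines (expected $expected)"
    done
fi
```

      Comparing line counts rather than file contents avoids depending on the order in which GROUP emits the rows.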

      Additional information:

      • The same Pig script works properly in Hadoop mode.
      • If I comment out one of the two STORE operations, it works as expected (which is why I think multiquery execution is the cause).
      • If I put an EXEC statement between the two STORE operations, it also works.
      • I can confirm that the streamed executable reads every line from stdin. For example, replacing `cat` with `wc -l` yields the number of rows in the input file.
        So the problem seems to lie in the parsing of the executable's stdout.
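
      The EXEC workaround mentioned in the list above could look like the following sketch. The file name bug_exec.pig is an assumption made for illustration; the script body is bug.pig with a single EXEC statement placed between the two STORE operations, as described. Pig is only invoked if it is found on the PATH.

```shell
#!/bin/sh
# Write a variant of bug.pig with EXEC between the two STORE statements.
# The quoted 'EOF' delimiter stops the shell from expanding $0 or the backticks.
cat > bug_exec.pig <<'EOF'
MYINPUT = LOAD 'myinput';

A = GROUP MYINPUT BY $0;
B = FOREACH A GENERATE FLATTEN(MYINPUT);
C = STREAM B THROUGH `cat`;

D = GROUP MYINPUT BY $0;
E = FOREACH D GENERATE FLATTEN(MYINPUT);
F = STREAM E THROUGH `cat`;

STORE C into 'output1';
EXEC;
STORE F into 'output2';
EOF

# Run the workaround script only when pig is available.
if command -v pig >/dev/null 2>&1; then
    rm -rf output1 output2
    pig -x local bug_exec.pig
fi
```

      Separately from the report, Pig's -M (-no_multiquery) command-line option turns multiquery execution off, so running `pig -x local -M bug.pig` may be another way to test the multiquery hypothesis. That variant has not been verified against this issue.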

      People

        Assignee: Unassigned
        Reporter: Thomas Porez (thomasporez)