Pig / PIG-188

There seem to be some mismatches between the actual stderr log and what I expected

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None

      Description

      With the following Pig script, I got the streaming logs shown below. The job that ran this script is job_200804041056_0182. In this case PigLoggingTest simply reads tab-delimited lines from STDIN and writes them to STDOUT as tab-delimited lines (so each line passes through PigLoggingTest unchanged) after emitting 10 STDERR messages. Also, as shown in the UI for job_200804041056_0181, there were a total of 21 tasks (1 map and 20 reduces).

      From all this, I would expect the number of input records and output records to match in the log. Also, I would expect there to be 26 logs. In addition, since there was no error when running the script, all exit codes should be 0.

      However, there are actually only 6 logs. The numbers of input records and output records do not match. The logs show that some of the tasks exited with -127.

      In addition, the Input-split *** values in the logs do not make much sense to me:

      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1

      Here is the Pig script:

      define X `PigLoggingTest 10 t` ship('./cplusplus/PigLoggingTest') stderr('logging_test_1');
      A = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
      B = stream A through X;
      store B into 'logging_test_1';
      C = load 'logging_test_1/_logs/logging_test_1';
      store C into 'results_26';
      

      Here are the logs:

      ===== Task Information Header =====
      Command: PigLoggingTest 10 t 
      Start time: Fri Apr 04 19:18:44 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      This is stderr message number 1
      This is stderr message number 2
      This is stderr message number 3
      This is stderr message number 4
      This is stderr message number 5
      This is stderr message number 6
      This is stderr message number 7
      This is stderr message number 8
      This is stderr message number 9
      This is stderr message number 10
      ===== Task Information Footer =====
      End time: Fri Apr 04 19:18:45 PDT 2008
      Exit code: 0
      Input records: 10000
      Input bytes: 1898380 bytes 
      Output records: 4
      Output bytes: 219446 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
      ===== Task Information Header =====
      Command: PigLoggingTest 10 t 
      Start time: Fri Apr 04 19:31:34 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      This is stderr message number 1
      This is stderr message number 2
      This is stderr message number 3
      This is stderr message number 4
      This is stderr message number 5
      This is stderr message number 6
      This is stderr message number 7
      This is stderr message number 8
      This is stderr message number 9
      This is stderr message number 10
      ===== Task Information Footer =====
      End time: Fri Apr 04 19:31:36 PDT 2008
      Exit code: 0
      Input records: 10000
      Input bytes: 1898380 bytes 
      Output records: 4
      Output bytes: 219446 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
      ===== Task Information Header =====
      Command: ./cplusplus/PigLoggingTest 10 t 
      Start time: Fri Apr 04 10:11:22 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      ===== Task Information Footer =====
      End time: Fri Apr 04 10:11:22 PDT 2008
      Exit code: -127
      Input records: 747
      Input bytes: 141796 bytes 
      Output records: 0
      Output bytes: 0 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
      ===== Task Information Header =====
      Command: ./cplusplus/PigLoggingTest 10 t 
      Start time: Fri Apr 04 10:11:28 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      ===== Task Information Footer =====
      End time: Fri Apr 04 10:11:28 PDT 2008
      Exit code: -127
      Input records: 747
      Input bytes: 141796 bytes 
      Output records: 0
      Output bytes: 0 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
      ===== Task Information Header =====
      Command: ./cplusplus/PigLoggingTest 10 t 
      Start time: Fri Apr 04 10:11:32 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      ===== Task Information Footer =====
      End time: Fri Apr 04 10:11:33 PDT 2008
      Exit code: -127
      Input records: 747
      Input bytes: 141796 bytes 
      Output records: 0
      Output bytes: 0 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
      ===== Task Information Header =====
      Command: ./cplusplus/PigLoggingTest 10 t 
      Start time: Fri Apr 04 10:11:37 PDT 2008
      Input-split file: null
      Input-split start-offset: -1
      Input-split length: -1
      =====          * * *          =====
      ===== Task Information Footer =====
      End time: Fri Apr 04 10:11:37 PDT 2008
      Exit code: -127
      Input records: 747
      Input bytes: 141796 bytes 
      Output records: 0
      Output bytes: 0 bytes (stdout using org.apache.pig.builtin.BinaryStorage)
      =====          * * *          =====
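
      The checks described in this report (number of log sections, exit codes, input versus output record counts) could be automated with a small parser over the log text. The sketch below assumes the footer field names exactly as they appear in the logs above; `TaskSummary`, `parseTaskLog` and `looksHealthy` are hypothetical names, not part of Pig.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Footer fields of one task section in the streaming stderr log.
struct TaskSummary {
    long exitCode = 0;
    long inputRecords = 0;
    long outputRecords = 0;
};

// If `line` starts with `prefix`, parse the number that follows into `out`.
// std::stol throws if no number follows the prefix.
static bool readField(const std::string& line, const std::string& prefix,
                      long& out) {
    if (line.compare(0, prefix.size(), prefix) != 0) return false;
    out = std::stol(line.substr(prefix.size()));
    return true;
}

// Split the log into one TaskSummary per "Task Information Header".
std::vector<TaskSummary> parseTaskLog(const std::string& log) {
    std::vector<TaskSummary> tasks;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos) continue;  // blank line
        line.erase(0, start);                      // drop indentation

        if (line.find("Task Information Header") != std::string::npos) {
            tasks.emplace_back();                  // a new task section begins
            continue;
        }
        if (tasks.empty()) continue;               // text before first header

        TaskSummary& current = tasks.back();
        readField(line, "Exit code: ", current.exitCode) ||
            readField(line, "Input records: ", current.inputRecords) ||
            readField(line, "Output records: ", current.outputRecords);
    }
    return tasks;
}

// True when the task exited with 0 and wrote as many records as it read.
bool looksHealthy(const TaskSummary& t) {
    return t.exitCode == 0 && t.inputRecords == t.outputRecords;
}
```

      Applied to the six sections above, such a check would flag all of them: two for mismatched record counts and four for the -127 exit code.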
      
      1. PIG-188_0_20080407.patch
        2 kB
        Arun C Murthy
      2. PIG-188_1_20080407.patch
        4 kB
        Arun C Murthy
      3. PIG-188_2_20080408.patch
        4 kB
        Arun C Murthy
      4. PigLoggingTest.cpp
        2 kB
        Xu Zhang

        Activity

        Alan Gates added a comment -

        Fix checked in revision 647997. Thanks Arun.

        Benjamin Reed added a comment -

        +1 looks good to me.

        Alan Gates added a comment -

        Changes look fine to me and all the tests pass.

        I'd like to get input from Ben or Charlie Groves, both of whom worked on the split stuff this modifies, to make sure the changes fit well with that interface.

        Arun C Murthy added a comment -

        Fixed for TestCustomSlicer too... my bad.

        Olga Natkovich added a comment -

        Changes look good; however, the test for CustomSlicer is failing after I applied the patch:

        java.lang.ArrayIndexOutOfBoundsException: 0
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SliceWrapper.makeReader(SliceWrapper.java:96)
        at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigInputFormat.getRecordReader(PigInputFormat.java:113)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:200)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:150)
        08/04/08 16:00:29 INFO mapreduceExec.MapReduceLauncher: Pig progress = 0%

        Arun C Murthy added a comment -

        Updated patch:

        • Fix the break caused by PIG-55
        • Ensures that we do not output information about input-splits for reduces in the stderr logs since it could confuse users...
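
        The second point can be illustrated with a small sketch: the header writer emits the input-split lines only when a split is actually available, so reduce tasks never print null/-1/-1. This is not the code from the patch (Pig itself is written in Java); `InputSplitInfo` and `formatTaskHeader` are hypothetical names, and the Start time line is left out for brevity.

```cpp
#include <sstream>
#include <string>

// Hypothetical description of the input split a map task reads.
struct InputSplitInfo {
    std::string file;
    long startOffset;
    long length;
};

// Build the task header. `split` is null for reduce tasks, which read map
// output rather than an input split, so the input-split lines are skipped.
std::string formatTaskHeader(const std::string& command,
                             const InputSplitInfo* split) {
    std::ostringstream out;
    out << "===== Task Information Header =====\n";
    out << "Command: " << command << '\n';
    if (split != nullptr) {
        out << "Input-split file: " << split->file << '\n';
        out << "Input-split start-offset: " << split->startOffset << '\n';
        out << "Input-split length: " << split->length << '\n';
    }
    out << "=====          * * *          =====\n";
    return out.str();
}
```

        Calling `formatTaskHeader("PigLoggingTest 10 t", nullptr)` for a reduce task produces a header with only the command line between the two banner lines.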
        Arun C Murthy added a comment -

        Looks like PIG-55 broke the feature where the InputSplit was displayed correctly in the logs... fixed now.

        Arun C Murthy added a comment -

        Xu,

        1. How did you get 20 reduces and 1 map for your first job?
        2. You should expect 21 logs (20 maps and 1 reduce) only on HDFS.
        3. The null/-1/-1 data for input-splits is due to the fact that reduces work on map-outputs and not on HDFS data.


          People

          • Assignee: Arun C Murthy
          • Reporter: Xu Zhang