Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.10.1
    • Component/s: None
    • Labels: newbie, simple

      Description

      Input data :
      -file 'a' starts-
      A|1
      B|2

      -file 'a' ends-
      (Note the empty line at the end)

      The following script fails on this input:
      a = load 'a' using PigStorage('|') as (x:chararray, y:double);
      b = foreach a generate *, ABS(y - 2*y) as test;
      dump b;

      The ABS function throws a NullPointerException instead of returning null for the last (empty) line of the input:
      java.lang.NullPointerException
      at org.apache.pig.builtin.DoubleAbs.exec(DoubleAbs.java:45)
      at org.apache.pig.builtin.DoubleAbs.exec(DoubleAbs.java:28)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:281)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:324)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
      at org.apache.hadoop.mapred.Child.main(Child.java:170)
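
      The behaviour the report asks for (null in, null out) can be sketched in plain Java. This is only an illustration, not the source of Pig's DoubleAbs: the class NullSafeAbs and both method names are hypothetical, and the unguarded variant is a guess at the kind of code that fails when a boxed Double is null.

```java
public class NullSafeAbs {
    // Hypothetical helper (not Pig's DoubleAbs): a null input yields a null result.
    public static Double absOrNull(Double value) {
        if (value == null) {
            return null;
        }
        return Math.abs(value);
    }

    // Unguarded variant: auto-unboxing a null Double throws NullPointerException.
    public static Double absUnguarded(Double value) {
        return Math.abs(value);
    }

    public static void main(String[] args) {
        System.out.println(absOrNull(-1.0));
        System.out.println(absOrNull(null));
        try {
            absUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded call threw NullPointerException");
        }
    }
}
```

      With the guard in place, the empty last line of the input would produce a null field in the output tuple rather than failing the map task.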

        Activity

        Shubham Chopra created issue -
        Daniel Dai made changes -
        Labels: (none) -> newbie
        Dmitriy V. Ryaboy made changes -
        Labels: newbie -> newbie simple
        Allan Avendaño added a comment -

        I tried to reproduce the same scenario, but when I ran the script on a file with spaces and/or line breaks at the end, the result was something like this:

        (1.0,A,1.0)
        (2.0,B,2.0)
        (,,) <-- break line
        (, ,) <-- empty space

        No exceptions were thrown. Maybe that is because I'm working on the latest version, but in any case, are these results acceptable?

        Prashant Kommireddi added a comment -

        Hi Allan,

        Thanks for looking at this. I ran the same script, but I do not see the tuple with an empty space.

        (A,1.0,1.0)
        (B,2.0,2.0)
        (,,)

        Here are the contents of my input file:

        grunt> cat data1
        A|1
        B|2
        <empty line>

        Allan Avendaño added a comment -

        Hi Prashant,

        My input file was this:

        grunt> cat data/abs
        A|1
        B|2
        <break line>
        <spaces>

        and the output was this:
        (A,1.0,1.0)
        (B,2.0,2.0)
        (,,)
        ( ,,)

        and no exception was thrown, unlike what the issue describes.
        If there are spaces or line breaks in the input, should they be shown in the output or not?

        Prashant Kommireddi added a comment -

        I think the <space> comes from projecting all fields with *:

        b = foreach a generate *, ABS(y - 2*y) as test;
        

        Since the script splits each line on '|', a line that holds only a space becomes a first field containing that space, so it makes sense that the space character is output.
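
        The splitting behaviour described above can be sketched as follows. This is a rough model rather than PigStorage's actual code: the class DelimitedLineSketch and its methods are hypothetical, and it assumes that an empty field becomes null, that missing trailing fields are filled with null, and that ABS returns null for a null input. Those assumptions are consistent with the tuples posted in this thread but were not checked against the Pig source.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimitedLineSketch {
    // Split a line on one delimiter character; empty fields become null
    // (assumption) and the result is padded with nulls up to fieldCount.
    public static List<String> parse(String line, char delimiter, int fieldCount) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= line.length(); i++) {
            if (i == line.length() || line.charAt(i) == delimiter) {
                String field = line.substring(start, i);
                fields.add(field.isEmpty() ? null : field);
                start = i + 1;
            }
        }
        while (fields.size() < fieldCount) {
            fields.add(null);
        }
        return fields;
    }

    // Mimic "generate *, ABS(y - 2*y)" for the schema (x:chararray, y:double).
    public static String toTuple(String line) {
        List<String> fields = parse(line, '|', 2);
        String x = fields.get(0);
        Double y = null;
        if (fields.get(1) != null) {
            try {
                y = Double.parseDouble(fields.get(1));
            } catch (NumberFormatException e) {
                y = null; // assumption: an unparsable number is treated as null
            }
        }
        Double test = (y == null) ? null : Math.abs(y - 2 * y);
        return "(" + show(x) + "," + show(y) + "," + show(test) + ")";
    }

    // Nulls are rendered as empty strings, as in the dump output above.
    private static String show(Object value) {
        return value == null ? "" : String.valueOf(value);
    }

    public static void main(String[] args) {
        for (String line : new String[] {"A|1", "B|2", "", " "}) {
            System.out.println(toTuple(line));
        }
    }
}
```

        Under these assumptions the empty line maps to (,,) and the line holding a single space maps to ( ,,), because only the first field, x, receives the space.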

        Gianmarco De Francisci Morales added a comment -

        It looks like this was fixed; I can't reproduce the bug.

        The spaces in the tuple are correct, as pointed out by Prashant.

        Thanks for having a look at this, Allan.

        Gianmarco De Francisci Morales made changes -
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Assignee: (none) -> Gianmarco De Francisci Morales [ azaroth ]
        Fix Version/s: (none) -> 0.10.1 [ 12320547 ]
        Resolution: (none) -> Cannot Reproduce [ 5 ]

          People

          • Assignee:
            Gianmarco De Francisci Morales
            Reporter:
            Shubham Chopra
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:
