Pig / PIG-3826

Outer join with PushDownForEachFlatten generates wrong result

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.1, 0.13.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The following script generates a wrong result:
      A = load 'A.txt' using PigStorage(',') as (id:chararray, value:double);
      B = load 'B.txt' using PigStorage(',') as (id:chararray, name:chararray);

      t1 = group A by id;
      t2 = foreach t1 {
          r1 = filter $1 by (value>1);
          r2 = limit r1 1;
          generate group as id, FLATTEN(r2.value) as value;
      };

      t3 = join B by id LEFT OUTER, t2 by id;
      dump t3;

      A.txt:
      1,1.5
      2,0
      3,-2.0
      4,8.9

      B.txt:
      1,Ofer
      2,Jordan
      3,Noa
      4,Daniel

      Expected output:
      (1,Ofer,1,1.5)
      (2,Jordan,,)
      (3,Noa,,)
      (4,Daniel,4,8.9)

      But we get:
      (1,Ofer,1,1.5)
      (4,Daniel,4,8.9)

      Disabling the rule with the option "-t PushDownForEachFlatten" makes the issue go away.
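
The report does not say why the rule produces the shorter result. One plausible reading, stated here as an assumption and not taken from the report or the patch, is that the rule moves the FLATTEN to after the join. Ids 2 and 3 then still match on the join key but carry an empty bag, and flattening an empty bag emits no row. The sketch below is a plain-Python model of both orderings on the data above, not Pig code; the function names are hypothetical.

```python
# Plain-Python model of the script's data flow (not Pig internals).
A = [("1", 1.5), ("2", 0.0), ("3", -2.0), ("4", 8.9)]
B = [("1", "Ofer"), ("2", "Jordan"), ("3", "Noa"), ("4", "Daniel")]


def nested_foreach(rows):
    """Model t2 before FLATTEN: id -> bag holding at most one value > 1."""
    bags = {}
    for id_, value in rows:
        bag = bags.setdefault(id_, [])   # every id gets a (possibly empty) bag
        if value > 1 and not bag:        # filter value > 1, then limit 1
            bag.append(value)
    return bags


def flatten_then_join(a, b):
    """Plan as written: FLATTEN first, then B LEFT OUTER JOIN t2."""
    # FLATTEN drops ids whose bag is empty, so t2 only holds ids with a value.
    t2 = {id_: bag for id_, bag in nested_foreach(a).items() if bag}
    out = []
    for id_, name in b:
        if id_ in t2:
            out.extend((id_, name, id_, v) for v in t2[id_])
        else:
            out.append((id_, name, None, None))  # outer join pads with nulls
    return out


def join_then_flatten(a, b):
    """Assumed faulty plan: join on unflattened bags, FLATTEN afterwards."""
    bags = nested_foreach(a)
    out = []
    for id_, name in b:
        if id_ in bags:
            # The key matches, but an empty bag flattens to zero rows.
            out.extend((id_, name, id_, v) for v in bags[id_])
        else:
            out.append((id_, name, None, None))
    return out


if __name__ == "__main__":
    print(flatten_then_join(A, B))
    print(join_then_flatten(A, B))
```

Under this assumption, flatten_then_join reproduces the expected four rows and join_then_flatten reproduces the two rows reported, which is consistent with the issue disappearing when the rule is turned off.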

        Attachments

        1. PIG-3826-1.patch
          2 kB
          Jianyong Dai

            People

            • Assignee:
              daijy Jianyong Dai
              Reporter:
              daijy Jianyong Dai