Pig / PIG-2535

Bug in new logical plan results in no output for join

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.1, 0.9.1, 0.10.0
    • Fix Version/s: 0.10.0, 0.9.3, 0.11
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The script below is a snippet of a much larger script. The join in the script produces no output on Pig 0.8, 0.9, and 0.10, even though matching records exist.

      event_serve = LOAD 'input1'   USING MyMapLoader() AS (s:map[], m:map[], l:map[]);
      raw = LOAD 'input2'  USING MyMapLoader() AS (s:map[], m:map[], l:map[]);
      
      SPLIT raw INTO
          serve_raw IF (( (chararray) (s#'type') == '0') AND ( (chararray) (s#'source') == '5')),
          cm_click_raw IF (( (chararray) (s#'type') == '1') AND ( (chararray) (s#'source') == '5'));
      
      cm_serve = FOREACH serve_raw GENERATE  s#'cm_serve_id' AS cm_event_guid,  s#'cm_serve_timestamp_ms' AS cm_receive_time, s#'p_url' AS ctx ;
      cm_serve_lowercase = stream cm_serve through `tr [:upper:] [:lower:]`;
      cm_serve_final = FOREACH cm_serve_lowercase GENERATE  $0 AS cm_event_guid, $1 AS cm_receive_time, $2 AS ctx;
      filtered = FILTER event_serve BY (chararray) (s#'filter_key') neq 'xxxx' AND (chararray) (s#'filter_key') neq 'yyyy';
      event_serve_project = FOREACH filtered GENERATE s#'event_guid' AS event_guid, s#'receive_time' AS receive_time;
      event_serve_join = join cm_serve_final by (cm_event_guid, cm_receive_time), event_serve_project by (event_guid, receive_time) PARALLEL 800;
      STORE event_serve_join INTO 'output' ;
      

      The script produces correct results if I disable the ColumnMapKeyPrune optimizer.
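
      For anyone reproducing the workaround, one way to turn the rule off is Pig's -optimizer_off command-line option (short form -t), which takes an optimizer rule name. This is only a sketch: the script file name below is a placeholder, not the reporter's actual file.

      ```shell
      # join_script.pig is a hypothetical file name for the script above.
      # -optimizer_off disables the named optimizer rule for this run only.
      pig -optimizer_off ColumnMapKeyPrune join_script.pig
      ```

      Disabling the rule also gives up column and map-key pruning for the whole script, so it works as a stopgap until a release containing the fix is available.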

      1. PIG-2535-2.patch (3 kB, Daniel Dai)
      2. PIG-2535-1.patch (3 kB, Daniel Dai)
      3. PIG-2535-0.patch (0.9 kB, Daniel Dai)

        Activity

        Field            Original Value    New Value

        Daniel Dai made changes -
        Status           Resolved [ 5 ]    Closed [ 6 ]

        Daniel Dai made changes -
        Status           Open [ 1 ]        Resolved [ 5 ]
        Hadoop Flags                       Reviewed [ 10343 ]
        Assignee                           Daniel Dai [ daijy ]
        Fix Version/s                      0.10 [ 12316246 ]
        Fix Version/s                      0.9.3 [ 12319456 ]
        Fix Version/s                      0.11 [ 12318878 ]
        Resolution                         Fixed [ 1 ]

        Daniel Dai made changes -
        Attachment                         PIG-2535-2.patch [ 12516655 ]

        Daniel Dai made changes -
        Attachment                         PIG-2535-1.patch [ 12515314 ]

        Daniel Dai made changes -
        Attachment                         PIG-2535-0.patch [ 12514965 ]

        Vivek Padmanabhan created issue -

          People

          • Assignee: Daniel Dai
          • Reporter: Vivek Padmanabhan
          • Votes: 0
          • Watchers: 2
