Pig / PIG-2535

Bug in new logical plan results in no output for join

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.1, 0.9.1, 0.10.0
    • Fix Version/s: 0.10.0, 0.9.3, 0.11
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The script below is a snippet of a much larger script. The join in it produces no output on Pig 0.8, 0.9, and 0.10, even though the two inputs contain matching records.

      event_serve = LOAD 'input1'   USING MyMapLoader() AS (s:map[], m:map[], l:map[]);
      raw = LOAD 'input2'  USING MyMapLoader() AS (s:map[], m:map[], l:map[]);
      
      SPLIT raw INTO
          serve_raw IF (( (chararray) (s#'type') == '0') AND ( (chararray) (s#'source') == '5')),
          cm_click_raw IF (( (chararray) (s#'type') == '1') AND ( (chararray) (s#'source') == '5'));
      
      cm_serve = FOREACH serve_raw GENERATE  s#'cm_serve_id' AS cm_event_guid,  s#'cm_serve_timestamp_ms' AS cm_receive_time, s#'p_url' AS ctx ;
      cm_serve_lowercase = stream cm_serve through `tr [:upper:] [:lower:]`;
      cm_serve_final = FOREACH cm_serve_lowercase GENERATE  $0 AS cm_event_guid, $1 AS cm_receive_time, $2 AS ctx;
      filtered = FILTER event_serve BY (chararray) (s#'filter_key') neq 'xxxx' AND (chararray) (s#'filter_key') neq 'yyyy';
      event_serve_project = FOREACH filtered GENERATE s#'event_guid' AS event_guid, s#'receive_time' AS receive_time;
      event_serve_join = join cm_serve_final by (cm_event_guid, cm_receive_time), event_serve_project by (event_guid, receive_time) PARALLEL 800;
      STORE event_serve_join INTO 'output' ;
      

      The script produces correct results if I disable the ColumnMapKeyPrune optimizer rule.
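
      A minimal sketch of that workaround, assuming the script is saved as `join_repro.pig` (a placeholder name that does not appear in this report): Pig's command line accepts `-t` (long form `-optimizer_off`) followed by the name of an optimizer rule to turn off for that run.

```shell
# Placeholder file name; substitute the path of the real script.
SCRIPT="join_repro.pig"
# Optimizer rule named in the report as the trigger for the empty join.
DISABLED_RULE="ColumnMapKeyPrune"

if command -v pig >/dev/null 2>&1; then
  # -t is the short form of -optimizer_off
  pig -t "$DISABLED_RULE" "$SCRIPT"
else
  echo "pig not found on PATH; would run: pig -t $DISABLED_RULE $SCRIPT"
fi
```

      Turning the rule off also gives up column and map-key pruning for the whole script, so this is probably best treated as a stopgap until a release containing the fix is in use.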

        Attachments

        1. PIG-2535-0.patch
          0.9 kB
          Daniel Dai
        2. PIG-2535-1.patch
          3 kB
          Daniel Dai
        3. PIG-2535-2.patch
          3 kB
          Daniel Dai

            People

            • Assignee: Daniel Dai (daijy)
            • Reporter: Vivek Padmanabhan (vivekp)