Pig / PIG-1277

Pig should give an error message when cogrouping on tuple keys with different inner types

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.9.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When we cogroup on tuple keys whose inner field types do not match, Pig treats them as different keys, even when the values print identically. This is confusing. Pig should give an error or a warning when this happens.

      Here is one example:
      UDF:

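      import java.io.IOException;
      import java.util.HashMap;
      import java.util.Map;

      import org.apache.pig.EvalFunc;
      import org.apache.pig.data.DataType;
      import org.apache.pig.data.Tuple;
      import org.apache.pig.impl.logicalLayer.schema.Schema;

      // UDF that returns a map whose "key" entry is the input tuple's field count.
      public class MapGenerate extends EvalFunc<Map> {
          @Override
          public Map exec(Tuple input) throws IOException {
              Map<String, Object> m = new HashMap<String, Object>();
              // The value stored under "key" is a java.lang.Integer.
              m.put("key", Integer.valueOf(input.size()));
              return m;
          }

          @Override
          public Schema outputSchema(Schema input) {
              // Declare the output as a map with no value type.
              return new Schema(new Schema.FieldSchema(null, DataType.MAP));
          }
      }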
      

      Pig script:

      a = load '1.txt' as (a0);
      b = foreach a generate a0, MapGenerate(*) as m:map[];
      c = foreach b generate a0, m#'key' as key;
      d = load '2.txt' as (c0, c1);
      e = cogroup c by (a0, key), d by (c0, c1);
      dump e;
      

      1.txt

      1
      

      2.txt

      1 1
      

      User expected result (which is not right):

      ((1,1),{(1,1)},{(1,1)})
      

      Real result:

      ((1,1),{(1,1)},{})
      ((1,1),{},{(1,1)})
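
      The split can be reproduced outside Pig with a small analogy. The sketch below uses plain Java lists in place of Pig tuples and a String in place of a bytearray field (both are assumptions for illustration, and CogroupKeyDemo is a hypothetical name, not Pig code). Two keys that print the same are still unequal when one field is an Integer and the other is text, so a grouping based on equality keeps them apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy, not Pig code: lists stand in for Pig tuples.
class CogroupKeyDemo {

    // Groups keys from two inputs by equals()/hashCode().
    // Returns key -> names of the inputs ("c" or "d") that contributed it.
    static Map<List<Object>, List<String>> group(List<List<Object>> keysFromC,
                                                 List<List<Object>> keysFromD) {
        Map<List<Object>, List<String>> groups = new LinkedHashMap<>();
        for (List<Object> k : keysFromC) {
            groups.computeIfAbsent(k, x -> new ArrayList<>()).add("c");
        }
        for (List<Object> k : keysFromD) {
            groups.computeIfAbsent(k, x -> new ArrayList<>()).add("d");
        }
        return groups;
    }

    public static void main(String[] args) {
        // Key from c: the second field is the Integer the UDF put in the map.
        List<Object> keyC = Arrays.<Object>asList("1", Integer.valueOf(1));
        // Key from d: the second field is text read from the file
        // (assumption: String stands in for Pig's bytearray).
        List<Object> keyD = Arrays.<Object>asList("1", "1");

        System.out.println(keyC + " vs " + keyD);   // both print as [1, 1]
        System.out.println(keyC.equals(keyD));      // false: Integer 1 is not "1"
        Map<List<Object>, List<String>> groups =
            group(Collections.singletonList(keyC), Collections.singletonList(keyD));
        System.out.println(groups.size());          // 2 groups instead of 1
    }
}
```

      This mirrors the two output rows above: each key lands in its own group with an empty bag for the other input.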
      

      We should tell the user that the keys cannot be merged because of the type mismatch.
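
      One possible shape for such a check is sketched below. It is a hypothetical illustration only: KeyTypeCheck and checkSameInnerTypes are invented names, the keys are modeled as plain Java lists with non-null fields, and nothing here is taken from the attached patches. Whether the actual fix detects the mismatch at plan time or at run time is not stated in this report.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not the code from the attached patches.
class KeyTypeCheck {

    // Throws if two cogroup keys differ in arity or in the type of any field.
    // Assumes no field is null.
    static void checkSameInnerTypes(List<Object> left, List<Object> right) {
        if (left.size() != right.size()) {
            throw new IllegalArgumentException(
                "Cannot merge cogroup keys: arity " + left.size()
                + " vs " + right.size());
        }
        for (int i = 0; i < left.size(); i++) {
            Class<?> l = left.get(i).getClass();
            Class<?> r = right.get(i).getClass();
            if (!l.equals(r)) {
                throw new IllegalArgumentException(
                    "Cannot merge cogroup keys: field " + i + " has type "
                    + l.getSimpleName() + " on one input and "
                    + r.getSimpleName() + " on the other");
            }
        }
    }

    public static void main(String[] args) {
        List<Object> keyC = Arrays.<Object>asList("1", Integer.valueOf(1));
        List<Object> keyD = Arrays.<Object>asList("1", "1");
        try {
            checkSameInnerTypes(keyC, keyD);
        } catch (IllegalArgumentException e) {
            // Reports the mismatch on field 1 (Integer vs String).
            System.out.println(e.getMessage());
        }
    }
}
```

      Failing loudly at the first mismatching field gives the user the position and both types, which is the information needed to add an explicit cast in the script.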

      Attachments

      1. PIG-1277-3.patch (18 kB, Daniel Dai)
      2. PIG-1277-2.patch (8 kB, Daniel Dai)
      3. PIG-1277-1.patch (8 kB, Daniel Dai)

        Issue Links

          Activity

          Changes are listed as "Field: Original Value -> New Value"; the original value is omitted when it was empty.

          Daniel Dai created issue
          Olga Natkovich made changes
            Fix Version/s: 0.9.0 [ 12315191 ]
          Alan Gates made changes
            Assignee: Alan Gates [ alangates ]
          Daniel Dai made changes
            Attachment: PIG-1277-1.patch [ 12466432 ]
          Daniel Dai made changes
            Link: This issue is duplicated by PIG-999 [ PIG-999 ]
          Daniel Dai made changes
            Link: This issue is duplicated by PIG-1065 [ PIG-1065 ]
          Daniel Dai made changes
            Attachment: PIG-1277-2.patch [ 12466517 ]
          Daniel Dai made changes
            Attachment: PIG-1277-3.patch [ 12466909 ]
          Daniel Dai made changes
            Status: Open [ 1 ] -> Resolved [ 5 ]
            Hadoop Flags: [Reviewed]
            Resolution: Fixed [ 1 ]
          Olga Natkovich made changes
            Status: Resolved [ 5 ] -> Closed [ 6 ]

            People

            • Assignee:
              Alan Gates
            • Reporter:
              Daniel Dai
            • Votes:
              0
            • Watchers:
              0
