  1. Pig
  2. PIG-664

Semantics of * is not consistent

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.2.0
    • Component/s: impl
    • Labels:
      None
    • Patch Info:
      Patch Available
    • Hadoop Flags:
      Reviewed

      Description

      The semantics of * are not consistent in Pig. Using * with generate flattens the record into all of its columns. However, using * as the input to a UDF produces a tuple wrapped in another tuple. For consistency, * should always expand to all the columns of the record (i.e., flattened). * can appear in:

      1. Foreach generate: E.g.: foreach input generate *;
      2. Input to UDFs: E.g.: foreach input generate myUDF(*);
      3. Order by: E.g.: order input by *;
      4. (Co)Group: E.g.: group a by *; cogroup a by *, b by *;

      In terms of implementation, this involves rolling back the fix introduced in PIG-597 and fixing the following builtin UDFs:

      1. ARITY - Should return the size of the input tuple instead of extracting the first column of the input tuple
      2. SIZE - Should return the size of the input tuple instead of extracting the first column of the input tuple
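
The difference between the two argument shapes, and why ARITY and SIZE have to change with them, can be sketched in a few lines of Python. This is only a model under stated assumptions: records are plain Python tuples rather than Pig's Tuple class, every function name here is hypothetical, and the "old" ARITY is assumed to measure the first column after extracting it.

```python
# Illustrative model only: a Pig record is represented as a plain Python tuple.
# None of these names come from the Pig code base.
record = ("alice", 30, "NYC")

def star_wrapped(rec):
    """Shape described in the report: * reaches the UDF as a tuple wrapped in another tuple."""
    return (rec,)

def star_flattened(rec):
    """Proposed shape: * expands to the record's own columns."""
    return tuple(rec)

def arity_old(udf_input):
    """Assumed old behavior: extract the first column, then measure it."""
    return len(udf_input[0])

def arity_new(udf_input):
    """Behavior the report asks for: return the size of the input tuple itself."""
    return len(udf_input)

print(arity_old(star_wrapped(record)))    # 3
print(arity_new(star_flattened(record)))  # 3
```

In this model both pairings report three columns, which is why the UDF change and the change to how * is passed have to land together.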

        Attachments

        1. PIG-664.patch
          8 kB
          Santhosh Muthur Srinivasan

            People

            • Assignee:
              sms Santhosh Muthur Srinivasan
              Reporter:
              sms Santhosh Muthur Srinivasan