Pig / PIG-664

Semantics of * is not consistent


Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.2.0
    • 0.2.0
    • impl
    • None
    • Patch Available
    • Reviewed

    Description

      The semantics of * are not consistent in Pig. Using * with generate results in all the columns of the record being flattened. However, using * as the input to a UDF results in a tuple wrapped in another tuple. For consistency, * should always result in all the columns of the record (i.e., flattened). The use of * occurs in:

      1. Foreach generate: E.g.: foreach input generate *;
      2. Input to UDFs: E.g.: foreach input generate myUDF(*);
      3. Order by: E.g.: order input by *;
      4. (Co)Group: E.g.: group a by *; cogroup a by *, b by *;
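
The inconsistency described above can be illustrated with a toy model. This is a minimal Python sketch, not Pig's implementation: a record is modeled as a plain tuple, and the function names (`star_in_generate`, `star_as_udf_input_before`, `star_as_udf_input_after`) and the sample record are hypothetical, introduced only for illustration.

```python
# Toy model only: a record is a Python tuple of column values.
# None of these names exist in Pig; they are assumptions for illustration.
record = ("alice", 30, "NYC")


def star_in_generate(rec):
    """Models `generate *`: every column of the record, flattened."""
    return tuple(rec)


def star_as_udf_input_before(rec):
    """Models the reported behavior: the UDF sees the record wrapped in another tuple."""
    return (tuple(rec),)


def star_as_udf_input_after(rec):
    """Models the proposed behavior: the UDF sees the same flattened columns."""
    return tuple(rec)


print(len(star_in_generate(record)))          # number of fields produced by generate *
print(len(star_as_udf_input_before(record)))  # number of fields the UDF sees today
print(len(star_as_udf_input_after(record)))   # number of fields the UDF would see
```

In this model, the wrapped form hands the UDF a single field that contains the whole record, while the flattened form hands it one field per column, which is what `generate *` already produces.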

      In terms of implementation, this involves rolling back the fix introduced in PIG-597 and fixing the following built-in UDFs:

      1. ARITY - Should return the size of the input tuple instead of extracting the first column of the input tuple
      2. SIZE - Should return the size of the input tuple instead of extracting the first column of the input tuple
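
Why the two UDFs need to change can be sketched as follows. This is a hedged Python model under the assumption that tuples stand in for Pig tuples; `arity_before` and `arity_after` are hypothetical names standing in for the old and new behavior described for ARITY and SIZE, and they are not Pig's actual code.

```python
# Toy model only: Python tuples stand in for Pig tuples.
# arity_before / arity_after are hypothetical names, not Pig APIs.
record = ("alice", 30, "NYC")
wrapped_input = (record,)   # what a UDF receives when * is wrapped in a tuple
flattened_input = record    # what a UDF receives when * is flattened


def arity_before(input_tuple):
    """Old behavior: extract the first column and return its size."""
    return len(input_tuple[0])


def arity_after(input_tuple):
    """New behavior: return the size of the input tuple itself."""
    return len(input_tuple)


# The old logic only gives the column count when the input is wrapped.
print(arity_before(wrapped_input))
# With flattened input, the new logic gives the column count directly.
print(arity_after(flattened_input))
# The old logic applied to flattened input measures the first column instead
# (here, the length of the string "alice"), which is not the arity.
print(arity_before(flattened_input))
```

The last call shows why the UDF fix has to accompany the change in semantics: once * is flattened, code that still unwraps the first column no longer measures the record.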

      Attachments

        1. PIG-664.patch
          8 kB
          Santhosh Muthur Srinivasan

        Activity


          People

            sms Santhosh Muthur Srinivasan
            sms Santhosh Muthur Srinivasan
