Pig / PIG-2421

EvalFuncs need to be redesigned

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.11
    • Fix Version/s: None
    • Component/s: impl
    • Labels: None

    Description

      The current EvalFunc interface and the associated Algebraic and Accumulator interfaces have grown unwieldy. In particular, people have noted the following issues:

      1. Writing a UDF requires a lot of boilerplate code.
      2. Since UDFs are always passed a tuple, users have to do their own type checking on the input.
      3. Declaring schemas for output data is confusing.
      4. Writing a UDF that accepts several different sets of parameter types (using getArgToFuncMapping) is confusing.
      5. Using Algebraic and Accumulator interfaces often entails duplicating code from the initial implementation.
      6. UDF implementors are exposed to the internals of Pig since they have to know when to return a tuple (Initial, Intermediate) and when not to (exec, Final).
      7. The separation of Initial, Intermediate, and Final into separate classes forces code duplication and makes it hard for UDFs in other languages to use those interfaces.
      8. There is unused code in the current interface that occasionally causes confusion (e.g. isAsynchronous).
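
To make issues 1, 2, and 6 concrete, here is a minimal, self-contained sketch. It does not use Pig's real classes: the names UdfSketch, TupleFunc, UPPER, and the count* methods are hypothetical stand-ins, and a tuple is modeled as a plain List<Object>. It is only meant to show the shape of the problem, not how Pig implements it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only: none of these types are Pig's real classes.
// A "tuple" is modeled as a plain List<Object> to keep the sketch self-contained.
public class UdfSketch {

    // Stand-in for a tuple-based UDF contract: one untyped tuple in, one value out.
    public interface TupleFunc<T> {
        T exec(List<Object> input);
    }

    // Upper-casing written against that contract. Only the last line is real
    // logic; the rest is hand-written input checking (issues 1 and 2).
    public static final TupleFunc<String> UPPER = input -> {
        if (input == null || input.isEmpty() || input.get(0) == null) {
            return null;
        }
        Object field = input.get(0);
        if (!(field instanceof String)) {
            throw new IllegalArgumentException(
                "expected a String, got " + field.getClass().getSimpleName());
        }
        return ((String) field).toUpperCase();
    };

    // Shared helper: adds up the partial counts carried in one-field tuples.
    static long sum(List<List<Object>> partials) {
        long total = 0;
        for (List<Object> partial : partials) {
            total += (Long) partial.get(0);
        }
        return total;
    }

    // Three-stage count. The first two stages have to wrap their result in a
    // tuple, while the last one has to return the bare value (issue 6).
    public static List<Object> countInitial(List<Object> values) {
        return Arrays.<Object>asList(Long.valueOf(values.size()));
    }

    public static List<Object> countIntermediate(List<List<Object>> partials) {
        return Arrays.<Object>asList(Long.valueOf(sum(partials)));
    }

    public static long countFinal(List<List<Object>> partials) {
        return sum(partials);
    }

    public static void main(String[] args) {
        System.out.println(UPPER.exec(Arrays.<Object>asList("pig"))); // prints PIG

        List<Object> a = countInitial(Arrays.<Object>asList("x", "y"));
        List<Object> b = countInitial(Arrays.<Object>asList("z"));
        List<Object> merged = countIntermediate(Arrays.asList(a, b));
        System.out.println(countFinal(Arrays.asList(merged))); // prints 3
    }
}
```

In this sketch the author of UPPER writes more checking than logic, and the author of the count has to remember which stages wrap their result and which do not. A typed signature, such as a function that takes a String directly, would presumably remove most of the first problem, but that is one possible direction and not something this issue specifies.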

      Any change must be made in a way that allows existing UDFs to continue working essentially forever.

      Attachments

        1. examples.patch (9 kB, Julien Le Dem)
        2. PIG-newudf.patch (87 kB, Alan Gates)

            People

              Assignee: Alan Gates (gates)
              Reporter: Alan Gates (gates)
              Votes: 0
              Watchers: 13