Pig / PIG-601

Add finalize() interface to UDF

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: impl
    • Labels:
      None

      Description

      I would like UDFs to have a finalize() method that is called when there is no more input and the UDF is about to be torn down. The finalize() method should be allowed to generate extra output, which in many cases would benefit aggregations.

      There are a couple of applications that can benefit from this feature.

      One example: in some UDFs, I need to open a resource (e.g. a local file), and when the task finishes, I need to close it.

      Another example: in one of my applications, I compute statistics for a list of categories and need to generate a summary category and append it to the end of the table. With a finalize() method, I could achieve this efficiently and neatly.

        Activity

        Olga Natkovich added a comment -

        Both EvalFunc and StoreFunc have a finish() method that is called by the framework when no more data will be given to the UDF. On the load side, the UDF itself decides when it is done, so such functionality is not needed. Please reopen if I misunderstood your intent.
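
        For illustration, the resource-cleanup case from the description could look roughly like the sketch below. It does not use the real Pig classes: SketchEvalFunc is a hypothetical stand-in that only mirrors the exec()/finish() shape described in the comment above, and LoggingUpper, FinishSketch, and runAll are made-up names. An in-memory StringWriter stands in for the local file.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in (NOT Pig's EvalFunc): exec() is called once per input,
// finish() is called once when no more input will be given to the UDF.
abstract class SketchEvalFunc<I, O> {
    abstract O exec(I input) throws IOException;

    void finish() {
        // Default: nothing to clean up.
    }
}

// Upper-cases each input and logs the original value to a resource that is
// opened lazily on first use and closed in finish().
class LoggingUpper extends SketchEvalFunc<String, String> {
    private final StringWriter sink = new StringWriter(); // stands in for a local file
    private BufferedWriter out;
    private boolean closed = false;

    @Override
    String exec(String input) throws IOException {
        if (out == null) {
            out = new BufferedWriter(sink); // open the resource on first call
        }
        out.write(input);
        out.write("\n");
        return input.toUpperCase(Locale.ROOT);
    }

    @Override
    void finish() {
        try {
            if (out != null) {
                out.close(); // flushes buffered lines into the sink
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    String logged() {
        return sink.toString();
    }
}

class FinishSketch {
    // Plays the role of the framework: feed every input, then call finish() once.
    static <I, O> List<O> runAll(SketchEvalFunc<I, O> udf, List<I> inputs) throws IOException {
        List<O> results = new ArrayList<>();
        try {
            for (I input : inputs) {
                results.add(udf.exec(input));
            }
        } finally {
            udf.finish();
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        LoggingUpper udf = new LoggingUpper();
        System.out.println(runAll(udf, Arrays.asList("a", "b")));
        System.out.println(udf.isClosed());
    }
}
```

        In this sketch finish() returns nothing, so it covers closing a resource but not emitting an extra summary row; that second use case would need the UDF to hand its summary back some other way.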


          People

          • Assignee:
            Unassigned
            Reporter:
            Yiping Han
          • Votes:
            0
            Watchers:
            0
