Pig / PIG-1471

inline UDFs in scripting languages

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None

      Description

      It should be possible to write UDFs in scripting languages such as Python and Ruby. This frees users from having to compile Java and build a jar, and it opens Pig to programmers who prefer scripting languages over Java. It should also be possible to write these UDFs inline, as part of a Pig script. This feature is an extension of https://issues.apache.org/jira/browse/PIG-928

        Activity

        Aniket Mokashi added a comment -

        The proposed syntax is

        define hellopig using org.apache.pig.scripting.jython.JythonScriptEngine as '@outputSchema("x:{t:(word:chararray)}")\ndef helloworld():\n\treturn ('Hello, World')';
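
        For readability, the escaped string in the statement above corresponds to the Python source sketched below. The outputSchema decorator here is only a stand-in, defined so the snippet runs outside Pig; how Pig's Jython engine actually supplies it is not shown in this thread and is an assumption.

```python
# Stand-in for the @outputSchema decorator (assumption: it only records
# the schema string on the function so a script engine can read it later).
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


# The UDF body from the proposed DEFINE statement, with \n and \t expanded.
@outputSchema("x:{t:(word:chararray)}")
def helloworld():
    # The parentheses do not create a tuple; this returns a plain string.
    return ('Hello, World')
```

        Calling helloworld() returns the string, and the schema string is available as helloworld.outputSchema in this sketch.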
        
        Julien Le Dem added a comment -

        If the function definition is inline in a DEFINE statement, then the @outputSchema decorator is not that useful anymore.
        Also the current syntax already enables doing something similar:

        DEFINE hellopig org.apache.pig.scripting.jython.JythonFunction('def helloworld():\n\treturn (\'Hello, World\')', 'x:{t:(word:chararray)}');
        

        so I'm not sure extending the syntax is necessary unless it lets the user type UDFs without escaping (\n, \', ...).
        Something like:

        DEFINE hellopig USING org.apache.pig.scripting.jython.JythonScriptEngine('x:{t:(word:chararray)}') AS
         
        def helloworld():
            return ('Hello, World')';
        
        ENDDEFINE
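
        To illustrate the escaping burden discussed above, here is a minimal sketch of a helper that turns plain Python source into the single-quoted form used with JythonFunction. The helper name escape_for_pig and the exact set of escapes are hypothetical; Pig's parser may treat string literals differently.

```python
def escape_for_pig(source):
    """Escape Python source for a single-quoted Pig string (assumed rules)."""
    return (source.replace("\\", "\\\\")   # backslashes first, so later escapes survive
                  .replace("'", "\\'")     # single quotes would end the Pig string
                  .replace("\n", "\\n")    # newlines become a literal \n
                  .replace("\t", "\\t"))   # tabs become a literal \t


# The UDF as a user would naturally type it.
udf_source = "def helloworld():\n\treturn ('Hello, World')"
schema = "x:{t:(word:chararray)}"

# Build the DEFINE statement in the existing JythonFunction form.
statement = ("DEFINE hellopig org.apache.pig.scripting.jython.JythonFunction("
             "'%s', '%s');" % (escape_for_pig(udf_source), schema))
print(statement)
```

        The printed statement matches the JythonFunction example above, which shows how much of the one-line form is escaping noise rather than UDF logic.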
        
        Daniel Dai added a comment -

        Yes, Jython works because JythonFunction is able to take the function definition inline. Here we want a general and consistent way to define inline UDFs in other languages as well.

        Alan Gates added a comment -

        This should have been resolved a while ago.


          People

          • Assignee: Aniket Mokashi
          • Reporter: Aniket Mokashi
          • Votes: 1
          • Watchers: 3