Pig
  1. Pig
  2. PIG-245

Need wrapper UDFs for all java.lang.Math functions

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      Attached is the code for all the wrapper UDFs for the math functions with unit tests. One exception is the random() function which does not take any argument. Since Pig doesn't support zero argument udfs, the current code takes a dummy argument which is not used in the call to java.lang.Math.random().

      1. MathUDF.patch
        103 kB
        Ajay Garg
      2. mathudfs.zip
        38 kB
        Ajay Garg

        Activity

        Hide
        Olga Natkovich added a comment -

        committed the changes. Thanks Ajay and Shravan for contributing

        Show
        Olga Natkovich added a comment - committed the changes. Thanks Ajay and Shravan for contributing
        Hide
        Olga Natkovich added a comment -

        I have applied and tested the patch. I also created javadoc and it looks really good! One issue that I saw is that 'see also' field does not seem to be correctly set. When I built javadoc I saw a bunch of warnings of the form:

        [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/toRadians.java:40: warning - Tag @see: reference not found:
        [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/ULP.java:40: warning - @see tag has no arguments.
        [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/ULP.java:40: warning - Tag @see: reference not found:

        I will commit the patch as is since it is already very useful. It would be nice if you get a chance to update the docs at some point.

        Show
        Olga Natkovich added a comment - I have applied and tested the patch. I also created javadoc and it looks really good! One issue that I saw is that 'see also' field does not seem to be correctly set. When I built javadoc I saw a bunch of warnings of the form: [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/toRadians.java:40: warning - Tag @see: reference not found: [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/ULP.java:40: warning - @see tag has no arguments. [javadoc] /home/olgan/src/pig-apache2/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/math/ULP.java:40: warning - Tag @see: reference not found: I will commit the patch as is since it is already very useful. It would be nice if you get a chance to update the docs at some point.
        Hide
        Alan Gates added a comment -

        Eventually what we want is for a UDF to be able to return 3 levels of errors:

        WARN - I couldn't process this record, so we should put a null here and keep going. This can be communicated by the UDF returning null once we have them. Ideally we'd like an explicit return code for this so that we can log a warning.

        ERROR - I couldn't process this record and we should give up on this execution attempt. That's the default behavior now, and what we should continue to do for now.

        FATAL - I couldn't process this record and we should give up the job entirely (ie no retries). This would only be used in cases where we know retries will be of no use, for example if we find we have an incompatible version of the UDF loaded or something. There's no way to do this now.

        All this is a long way to say that, given we don't yet have nulls, the best we can do right now is error out when we get an IOException.

        Show
        Alan Gates added a comment - Eventually what we want is for a UDF to be able to return 3 levels of errors: WARN - I couldn't process this record, so we should put a null here and keep going. This can be communicated by the UDF returning null once we have them. Ideally we'd like an explicit return code for this so that we can log a warning. ERROR - I couldn't process this record and we should give up on this execution attempt. That's the default behavior now, and what we should continue to do for now. FATAL - I couldn't process this record and we should give up the job entirely (ie no retries). This would only be used in cases where we know retries will be of no use, for example if we find we have an incompatible version of the UDF loaded or something. There's no way to do this now. All this is a long way to say that, given we don't yet have nulls, the best we can do right now is error out when we get an IOException.
        Hide
        Pi Song added a comment -

        I'm not sure about what exception should be thrown from UDFs. This would affect the way we handle it. The outcome could be 1) skip the record 2) stop the whole process.

        Alan, please clarify.

        Show
        Pi Song added a comment - I'm not sure about what exception should be thrown from UDFs. This would affect the way we handle it. The outcome could be 1) skip the record 2) stop the whole process. Alan, please clarify.
        Hide
        Ajay Garg added a comment -

        I have changed RuntimeException with IOException. I am attaching the patch ( MathUDF.patch ) which creates all the UDFs and JUnit test case for them.
        Thanks

        Show
        Ajay Garg added a comment - I have changed RuntimeException with IOException. I am attaching the patch ( MathUDF.patch ) which creates all the UDFs and JUnit test case for them. Thanks
        Hide
        Olga Natkovich added a comment -

        I have created the repository. For me to add this functions to it, I need the new patch that addresses the exception issue listed above. Ajay, could you please, post the new one.

        Show
        Olga Natkovich added a comment - I have created the repository. For me to add this functions to it, I need the new patch that addresses the exception issue listed above. Ajay, could you please, post the new one.
        Hide
        Olga Natkovich added a comment -

        One thing I would like to request is for UDFs to through IOException rather than RuntimeException since that what the expected interface is.

        Show
        Olga Natkovich added a comment - One thing I would like to request is for UDFs to through IOException rather than RuntimeException since that what the expected interface is.
        Hide
        Olga Natkovich added a comment -

        Looks good. I will create the repository sometimes next week and will commit the UDFs at that time

        Show
        Olga Natkovich added a comment - Looks good. I will create the repository sometimes next week and will commit the UDFs at that time
        Hide
        Pi Song added a comment -

        Very useful stuff! Good work!!

        Show
        Pi Song added a comment - Very useful stuff! Good work!!
        Hide
        Ajay Garg added a comment -

        code attached...

        Show
        Ajay Garg added a comment - code attached...

          People

          • Assignee:
            Shravan Matthur Narayanamurthy
            Reporter:
            Shravan Matthur Narayanamurthy
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development