Pig / PIG-3198

Let users use any function from PigType -> PigType as if it were builtin

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels:
      None

      Description

      This idea is an extension of PIG-2643. Ideally, someone should be able to call any function currently registered in Pig as if it were builtin.

      1. PIG-3198-apache_header.patch
        24 kB
        Cheolsoo Park
      2. PIG-3198-1.patch
        41 kB
        Jonathan Coveney
      3. PIG-3198-0.patch
        39 kB
        Jonathan Coveney

          Activity

          Cheolsoo Park added a comment -

          PIG-3198-apache_header.patch is committed. Closing the jira again.

          Jonathan Coveney added a comment -

          Dmitriy: https://issues.apache.org/jira/browse/PIG-3284

          Dmitriy V. Ryaboy added a comment -

          Please add docs!

          Cheolsoo Park added a comment -

          Jonathan Coveney, thank you for implementing this!

          You forgot to add the Apache header to the new files. Here is a patch that adds it. I also fixed some more whitespace while doing so.

          Jonathan Coveney added a comment -

          Awesome, Alan. Fixed the tabs and will commit shortly.

          Alan Gates added a comment -

          I looked through this. Other than spare tabs (rather than spaces) in some of the files it looks good. +1. I think this is exciting functionality. I'm glad to see it added.

          Jonathan Coveney added a comment -

          https://reviews.apache.org/r/9559/

          Jonathan Coveney added a comment -

          So I actually implemented this. You can check TestBuiltinInvoker for some examples, but the syntax is generally as follows:

          a = foreach @ generate invoke(x)concat(x);
          

          in the case of an instance method invoked on a value, and

          a = foreach @ generate invoke&Integer.valueOf(x);
          

          in the case of static methods.

          Currently it should support any function taking zero or more PigType arguments and returning a PigType value. In the future we could allow people to cast Object, or to chain together non-PigTypes, but that was a bit out of the scope of what I wanted to work on for this.

          I actually don't love the syntax and would love to evolve it, but that portion of the parser is really hairy and it is really difficult not to introduce ambiguities...after about 10 hours of banging my head on it I went with the above. I'd love to have some eyes on this for technical merit etc.

          Essentially, this turns any PigType->PigType method into a UDF without having to have a builtin, which I think is cool. This means people can have an arbitrary method on their classpath and don't have to go through the annoyance of wrapping it in a UDF. Ideally this cuts down on the number of lame builtin functions we need to add as people can just use this (it uses bytecode generation so is as performant as any code we'd write, though there are a couple of bytecode optimizations I could do down the line).
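
For readers unfamiliar with the idea, the two forms above can be pictured with a small plain-Java sketch. This is only an illustration using reflection, under the assumption that invoke(x)concat(x) corresponds to calling x.concat(x) and invoke&Integer.valueOf(x) to calling Integer.valueOf(x). The class InvokeSketch and its helpers are hypothetical names, not part of Pig, and the actual patch uses bytecode generation rather than reflection.

```java
import java.lang.reflect.Method;

// Hypothetical illustration only: not the Pig implementation.
public class InvokeSketch {

    // Looks up the runtime classes of the arguments so a matching
    // public method can be found by name and exact parameter types.
    private static Class<?>[] typesOf(Object[] args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        return types;
    }

    // Instance form: call target.name(args...).
    public static Object invokeInstance(Object target, String name, Object... args) {
        try {
            Method m = target.getClass().getMethod(name, typesOf(args));
            return m.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot invoke " + name, e);
        }
    }

    // Static form: call Owner.name(args...).
    public static Object invokeStatic(Class<?> owner, String name, Object... args) {
        try {
            Method m = owner.getMethod(name, typesOf(args));
            return m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot invoke " + name, e);
        }
    }

    public static void main(String[] args) {
        // Roughly what invoke(x)concat(x) would compute for x = "pig".
        System.out.println(invokeInstance("pig", "concat", "pig"));
        // Roughly what invoke&Integer.valueOf(x) would compute for x = "42".
        System.out.println(invokeStatic(Integer.class, "valueOf", "42"));
    }
}
```

Reflection keeps the sketch short, but it pays a lookup and dispatch cost on every call, which is presumably why the patch generates bytecode instead.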


            People

            • Assignee: Jonathan Coveney
            • Reporter: Jonathan Coveney
            • Votes: 1
            • Watchers: 5
