Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: Function/UDF
    • Labels:
      None

      Description

      We need to support Python UDAFs as well as UDFs (TAJO-1344).

      1. TAJO-1562.patch
        129 kB
        Jihoon Son

        Activity

        jihoonson Jihoon Son added a comment -

        I will share the interface design soon.

        jihoonson Jihoon Son added a comment - - edited

        Hi guys. This is the first proposal.
        Honestly, I'm not very familiar with Python, so this proposal may look odd. Any suggestions and comments are welcome.

        I investigated several features of Python and concluded that a Python class is an appropriate way to support UDAFs. That is, users can define a new UDAF by defining a Python class that inherits from a pre-defined AbstractUdaf class.
        Here is an example.

        AvgPy class

        from tajo_util import output_type
        
        
        class AvgPy:
            sum = 0
            cnt = 0
        
            def __init__(self, sum=0, cnt=0):
                self.sum = sum
                self.cnt = cnt
        
            # eval at the first stage
            def eval(self, item):
                self.sum += item
                self.cnt += 1
        
            # get intermediate result
            @output_type('int4', 'int4')
            def get_partial_result(self):
                return [self.sum, self.cnt]
        
            # merge intermediate results
            def merge(self, sum, cnt):
                self.sum += sum
                self.cnt += cnt
        
            # get final result
            @output_type('float4')
            def get_final_result(self):
                return self.sum / self.cnt
        

        To support this form of UDAF, we need a general way to maintain the aggregated values, e.g., those aggregated in AvgPy, between different stages. I think this can be solved by serializing/deserializing them as a tuple, as follows.

        message NamedDatum {
          required string name = 1;
          required Datum val = 2;
        }
        
        message NamedTuple {
          repeated NamedDatum datums = 1;
        }
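
As a rough illustration of this idea, the state of a UDAF instance could be flattened into name/value pairs that mirror NamedDatum and NamedTuple. The sketch below uses plain Python dicts in place of the protobuf messages; the helpers to_named_tuple and from_named_tuple are hypothetical names and are not part of Tajo.

```python
class AvgPy:
    """Stand-in for the UDAF above, reduced to its state."""

    def __init__(self, sum=0, cnt=0):
        self.sum = sum
        self.cnt = cnt


def to_named_tuple(instance):
    # One {"name", "val"} entry per instance attribute, like a NamedDatum.
    return [{"name": k, "val": v} for k, v in sorted(vars(instance).items())]


def from_named_tuple(cls, datums):
    # Rebuild an instance by passing the named values back to the constructor.
    return cls(**{d["name"]: d["val"] for d in datums})


state = to_named_tuple(AvgPy(sum=10, cnt=2))
restored = from_named_tuple(AvgPy, state)
```

Because the constructor accepts the same names that are stored in the tuple, the restored instance can continue aggregating at the next stage.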
        
        githubbot ASF GitHub Bot added a comment -

        GitHub user jihoonson opened a pull request:

        https://github.com/apache/tajo/pull/551

        TAJO-1562: Python UDAF support

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jihoonson/tajo-2 TAJO-1562

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/551.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #551


        commit 0a7c24685cedcaaf5be8cb899524a4838e335182
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-18T08:33:24Z

        TAJO-1562

        commit 4916724116b0be888ab1257e8133e562b2506957
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-18T09:59:41Z

        TAJO-1562

        commit 2d8e6e6ef4dc198951e1d830dca64bbb051e22d7
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T02:11:14Z

        Merge branch 'TAJO-1562' of https://github.com/jihoonson/tajo-2 into TAJO-1562

        commit 67f4d063415f0f2361cb9b9ba7a5c5e681af5c3d
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T05:02:54Z

        TAJO-1562

        commit 4dfa5a7fd37df753721b776305098a07ecbd803b
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T05:07:42Z

        TAJO-1562

        commit 4adb233652d8a7777e8177e623a2e5d7faa7aad3
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T07:19:59Z

        Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1562

        commit dd511ab93f9f555f1d1f21f609d104a6a1f729a8
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T10:08:22Z

        TAJO-1562

        commit 521dcbb27cd434dae76d2dee06d9963ad529d509
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T11:31:42Z

        TAJO-1562

        commit 8dbe9589aac142d8b3438e85efcd5932616aabd4
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T11:52:49Z

        TAJO-1562

        commit 3100949e2d7b8fff13e79a47cec73043a21c31af
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T14:36:40Z

        TAJO-1562

        commit e8597c2f8c1ccc7f046c92e0fa287df72b837b09
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T14:51:56Z

        TAJO-1562

        commit 404c5f6c547bc8e230b281376b46a2938c118497
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-20T21:37:46Z

        TAJO-1562

        commit 42902cadfafc86da01d7ff8872b14f3ccf4d2124
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-21T12:21:53Z

        TAJO-1562

        commit 73304722b7158d8cc92b6861f417be85d95d557a
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-21T14:26:42Z

        TAJO-1562

        commit 6aed1bd4166b70b05f02018125fa5518a19cc6fc
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-22T11:57:51Z

        TAJO-1562

        commit 1ae678dfef373ac40e3d6d58c32bfd9f95b9a158
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-22T17:22:57Z

        TAJO-1562

        commit 3b0ae9eac108abf0289e164d93632d57f21e123c
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T02:48:19Z

        Test success

        commit f12386b28dd059d5ad290ed6122c220e092234b2
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T03:04:59Z

        Output type cleanup

        commit 26edca2e81d8487574834711a189aac3628955f8
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T04:23:21Z

        Test added

        commit 1306db498bd569ef54547127c5282561888078e5
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T04:23:58Z

        Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1562

        commit f51e195629c34dce42173818a98f0b5489f7dd91
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T04:36:35Z

        Remove all commented out codes

        commit d3eba9892bacbd035ebdf0c483911187f8bafed2
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T04:54:57Z

        Fix window query test failure

        commit 4b5609becb777c77cf3b30109c7a5b203078f4b3
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T06:22:07Z

        Test on local machine

        commit 4b0f8aa5d832057be153493e55a1e405c2f8aa27
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-04-23T06:54:55Z

        Add a document


        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/551#issuecomment-96484878

        This patch is ready for review. Users can define their own Python UDAFs by defining a class containing some mandatory functions, i.e., reset(), eval(), merge(), get_partial_result(), and get_final_result(). Here is a typical example.

        ```
        from tajo_util import output_type


        class AvgPy:
            sum = 0
            cnt = 0

            def __init__(self):
                self.reset()

            def reset(self):
                self.sum = 0
                self.cnt = 0

            # eval at the first stage
            def eval(self, item):
                self.sum += item
                self.cnt += 1

            # get intermediate result
            def get_partial_result(self):
                return [self.sum, self.cnt]

            # merge intermediate results
            def merge(self, list):
                self.sum += list[0]
                self.cnt += list[1]

            # get final result
            @output_type('float8')
            def get_final_result(self):
                return self.sum / float(self.cnt)
        ```

        An additional restriction is that the return type of get_partial_result() and the input type of merge() must be the same.
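
To make the restriction concrete, here is a minimal sketch that drives an AvgPy-like class through both stages outside of Tajo. The output_type decorator below is only a stand-in for tajo_util.output_type, and the split of the input into two chunks is an assumption made for illustration.

```python
def output_type(*types):
    # Stand-in for tajo_util.output_type so the sketch runs outside Tajo;
    # it only records the declared types on the function.
    def decorator(func):
        func.output_types = types
        return func
    return decorator


class AvgPy:
    def __init__(self):
        self.reset()

    def reset(self):
        self.sum = 0
        self.cnt = 0

    def eval(self, item):
        self.sum += item
        self.cnt += 1

    def get_partial_result(self):
        return [self.sum, self.cnt]

    def merge(self, partial):
        self.sum += partial[0]
        self.cnt += partial[1]

    @output_type('float8')
    def get_final_result(self):
        return self.sum / float(self.cnt)


# First stage: each chunk of input is aggregated independently.
partials = []
for chunk in ([1, 2, 3], [4, 5]):
    agg = AvgPy()
    for item in chunk:
        agg.eval(item)
    partials.append(agg.get_partial_result())

# Second stage: merge() consumes exactly what get_partial_result() produced.
final = AvgPy()
for partial in partials:
    final.merge(partial)
result = final.get_final_result()
```

If get_partial_result() returned a different shape than merge() expects, the second stage would fail or silently compute a wrong result, which is why the two types must match.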

        I'd like to briefly explain how Python UDAFs are executed. During aggregation, Tajo currently keeps the intermediate aggregation state per key in each implementation of the aggregation operators. The same approach should be applied to Python functions to control resource usage and avoid a huge change. To do so, the intermediate aggregation state of a Python UDAF should be sent to Tajo workers whenever aggregation moves to a different key. However, I don't want users to have to implement serialization/deserialization of the intermediate aggregation state.

        So, I simply used JSON for that. A snapshot of the Python UDAF instance is sent to Tajo workers when necessary.
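
One way such a snapshot could look is sketched below with the standard json module. This is an assumption about the mechanism, not a copy of the actual controller code; snapshot and restore are hypothetical helper names, and the approach only works when every instance attribute is JSON-serializable.

```python
import json


class AvgPy:
    def __init__(self):
        self.reset()

    def reset(self):
        self.sum = 0
        self.cnt = 0

    def eval(self, item):
        self.sum += item
        self.cnt += 1


def snapshot(instance):
    # Serialize the instance attributes; sort_keys makes the output stable.
    return json.dumps(vars(instance), sort_keys=True)


def restore(instance, payload):
    # Overwrite the instance attributes with a previously taken snapshot.
    instance.__dict__.update(json.loads(payload))


agg = AvgPy()
agg.eval(10)
payload = snapshot(agg)  # '{"cnt": 1, "sum": 10}'
agg.reset()
restore(agg, payload)
```

The user-defined class needs no serialization code of its own, which matches the stated goal.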

        Let me walk through an example. When a tuple (1, 10) is read and the aggregation key is 1, Tajo creates a PythonAggFunctionContext for that key and calls AvgPy.eval() with the input 10. At this point, the intermediate aggregation state, i.e., AvgPy.sum = 10 and AvgPy.cnt = 1, is kept in the AvgPy instance rather than in the PythonAggFunctionContext. However, when a tuple (2, 5) is read as the next input, Tajo gets the intermediate aggregation state from the Python controller and stores it in the PythonAggFunctionContext. After that, the AvgPy instance is reset to compute the aggregation for the other key, 2.
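
The key switch described above can be simulated in plain Python. In this sketch a dict plays the role of the per-key PythonAggFunctionContext objects and a single AvgPy instance plays the Python side. Pushing saved state back into the instance when a key reappears is an assumption of the sketch; the description above only covers pulling the state out and resetting.

```python
class AvgPy:
    def __init__(self):
        self.reset()

    def reset(self):
        self.sum = 0
        self.cnt = 0

    def eval(self, item):
        self.sum += item
        self.cnt += 1


def aggregate(rows):
    udaf = AvgPy()       # the single Python-side instance
    contexts = {}        # per-key state, standing in for the Tajo-side contexts
    current_key = None
    for key, value in rows:
        if key != current_key:
            if current_key is not None:
                # Pull the state of the previous key out of the instance.
                contexts[current_key] = dict(vars(udaf))
            udaf.reset()
            # Assumed step: restore earlier state if this key was seen before.
            udaf.__dict__.update(contexts.get(key, {}))
            current_key = key
        udaf.eval(value)
    if current_key is not None:
        contexts[current_key] = dict(vars(udaf))
    return contexts


contexts = aggregate([(1, 10), (2, 5), (1, 20)])
```

With the input above, the state for key 1 is saved when the tuple (2, 5) arrives and picked up again when key 1 reappears.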

        The following are some highlights of the rest of the implementation.

        • I've improved the Python controller to support UDAFs as well as UDFs.
        • For UDAF execution, the Python controller executes some pre-defined functions (reset(), eval(), merge(), get_partial_result(), and get_final_result()).
        • I've improved AggregationFunctionCallEval to support script functions via AggFunctionInvoke.
        • AggFunctionInvoke is an abstract class to represent how the function is executed. There are two subclasses, ClassBasedAggFunctionInvoke and PythonAggFunctionInvoke.
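
On the Python side, calling the pre-defined functions by name could be sketched as follows. This is not the actual controller.py; call_udaf, UDAF_METHODS and the CountPy class are hypothetical and exist only to show name-based dispatch restricted to the five functions listed above.

```python
UDAF_METHODS = ("reset", "eval", "merge",
                "get_partial_result", "get_final_result")


class CountPy:
    """Hypothetical UDAF that counts its inputs."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.cnt = 0

    def eval(self, item):
        self.cnt += 1

    def merge(self, partial):
        self.cnt += partial

    def get_partial_result(self):
        return self.cnt

    def get_final_result(self):
        return self.cnt


def call_udaf(instance, method_name, *args):
    # Only the five pre-defined functions may be invoked by name.
    if method_name not in UDAF_METHODS:
        raise ValueError("unsupported UDAF function: %s" % method_name)
    return getattr(instance, method_name)(*args)


udaf = CountPy()
for request in (("eval", "a"), ("eval", "b"), ("merge", 3)):
    call_udaf(udaf, request[0], *request[1:])
total = call_udaf(udaf, "get_final_result")
```

Restricting dispatch to a fixed set of names keeps the contract between the caller and the user-defined class explicit.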

        Finally, I've added several test functions, but some of them are disabled. This is due to our rigid function syntax. Currently, function names are explicitly defined in our parser, and their types are identified while parsing SQL. This architecture should be improved to support user-defined functions.

        tajoqa Tajo QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12728406/TAJO-1562.patch
        against master revision release-0.9.0-rc0-276-g1971d85.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-TAJO-Build/763//console

        This message is automatically generated.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/551#issuecomment-96986771

        Great job! The interface looks nice to me. Some parts seem to need more optimization, but that would take more time. I think the current patch is ready to be committed because the interface and the code skeleton are great. We can do more optimization later.

        I have one comment. Could you rename isUdf to something like isScalarFunction or isAggregationFunction? I think it would be more intuitive.

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/551#issuecomment-96989855

        Thanks @hyunsik. I've changed isUdf to isScalarFunction.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/551#issuecomment-97081799

        +1 Great work!

        jihoonson Jihoon Son added a comment -

        Committed to master.

        jihoonson Jihoon Son added a comment -

        This issue is not committed yet. My mistake.

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/551

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/551#issuecomment-97285120

        @jinossy @hyunsik, thanks for your reviews!

        jihoonson Jihoon Son added a comment -

        Committed to master.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #330 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/330/)
        TAJO-1562: Python UDAF support. (jihoon) (jihoonson: rev 9540f16edb0de1a66b016b8a7b65568cc2d64709)

        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalContext.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf3.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/FunctionInvoke.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf.sql
        • tajo-core/src/test/resources/python/test_funcs.py
        • tajo-core/src/test/resources/results/TestGroupByQuery/testComplexTargetWithPythonUdaf.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineSerDe.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/PythonAggFunctionInvoke.java
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/FunctionInvokeContext.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/ExprAnnotator.java
        • tajo-core/src/test/java/org/apache/tajo/engine/function/TestPythonFunctions.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/function/FunctionInvocation.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testGroupbyWithPythonFunc.sql
        • tajo-core/src/test/resources/python/test_funcs2.py
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/function/PythonInvocationDesc.java
        • tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java
        • tajo-docs/src/main/sphinx/functions.rst
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/python/PythonScriptEngine.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineSerializer.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf.result
        • tajo-docs/src/main/sphinx/functions/python.rst
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/GeneralSetFunctionExpr.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf3.sql
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf2.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/AggFunction.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf2.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/InputHandler.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/PythonFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/AggFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/WindowFunctionEval.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testComplexTargetWithPythonUdaf.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextFieldSerializerDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/EvalNodeDeserializer.java
        • tajo-core/src/main/resources/python/controller.py
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testDistinctPythonUdafWithUnion1.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/BufferPool.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testDistinctPythonUdafWithUnion1.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/ClassBasedAggFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/OutputHandler.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdafWithHaving.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineSerializer.java
        • tajo-core/src/test/resources/queries/TestSelectQuery/testSelectPythonFuncs.sql
        • CHANGES
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/AggregationFunctionCallEval.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineSerDe.java
        • tajo-core/src/test/resources/queries/TestSelectQuery/testSelectWithPredicateOnPythonFunc.sql
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdafWithNullData.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/ClassBasedScalarFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/LegacyScalarFunctionInvoke.java
        • tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto
        • tajo-core/src/main/resources/python/tajo_util.py
        • tajo-core/src/test/resources/python/test_udaf.py
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdafWithHaving.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/python/TajoScriptEngine.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/FieldSerializerDeserializer.java
        • tajo-core/src/test/resources/queries/TestSelectQuery/testNestedPythonFunction.sql
        • tajo-core/src/test/resources/python/test_funcs.pyc
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/StreamingUtil.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdafWithNullData.sql
        Show
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #691 (See https://builds.apache.org/job/Tajo-master-build/691/)
        TAJO-1562: Python UDAF support. (jihoon) (jihoonson: rev 9540f16edb0de1a66b016b8a7b65568cc2d64709)

        • tajo-docs/src/main/sphinx/functions/python.rst
        • tajo-core/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java
        • tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/InputHandler.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/FieldSerializerDeserializer.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testComplexTargetWithPythonUdaf.sql
        • tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineDeserializer.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdafWithHaving.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/python/PythonScriptEngine.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf3.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/AggregationFunctionCallEval.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/LegacyScalarFunctionInvoke.java
        • tajo-core/src/main/resources/python/tajo_util.py
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf2.sql
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/function/PythonInvocationDesc.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf3.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/BufferPool.java
        • tajo-algebra/src/main/java/org/apache/tajo/algebra/GeneralSetFunctionExpr.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdafWithNullData.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/WindowFunctionEval.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineSerializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/FunctionInvoke.java
        • tajo-core/src/test/java/org/apache/tajo/engine/function/TestPythonFunctions.java
        • tajo-docs/src/main/sphinx/functions.rst
        • tajo-plan/src/main/java/org/apache/tajo/plan/serder/EvalNodeDeserializer.java
        • tajo-core/src/main/resources/python/controller.py
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineSerDe.java
        • tajo-core/src/test/resources/queries/TestSelectQuery/testNestedPythonFunction.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/ClassBasedScalarFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextLineSerDe.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/OutputHandler.java
        • tajo-core/src/test/resources/python/test_funcs2.py
        • CHANGES
        • tajo-plan/src/main/java/org/apache/tajo/plan/ExprAnnotator.java
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
        • tajo-core/src/test/resources/queries/TestSelectQuery/testSelectPythonFuncs.sql
        • tajo-core/src/test/resources/python/test_udaf.py
        • tajo-core/src/test/resources/queries/TestSelectQuery/testSelectWithPredicateOnPythonFunc.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/python/TajoScriptEngine.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf.result
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/FunctionInvokeContext.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdafWithNullData.sql
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/function/FunctionInvocation.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/PythonAggFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/CSVLineSerializer.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testPythonUdaf.sql
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testGroupbyWithPythonFunc.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/AggFunction.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testDistinctPythonUdafWithUnion1.result
        • tajo-core/src/test/resources/results/TestGroupByQuery/testComplexTargetWithPythonUdaf.result
        • tajo-core/src/test/resources/python/test_funcs.py
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/TextFieldSerializerDeserializer.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/expr/EvalContext.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/stream/StreamingUtil.java
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdaf2.result
        • tajo-core/src/test/resources/results/TestGroupByQuery/testPythonUdafWithHaving.result
        • tajo-core/src/test/resources/python/test_funcs.pyc
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/AggFunctionInvoke.java
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/ClassBasedAggFunctionInvoke.java
        • tajo-core/src/test/resources/queries/TestGroupByQuery/testDistinctPythonUdafWithUnion1.sql
        • tajo-plan/src/main/java/org/apache/tajo/plan/function/PythonFunctionInvoke.java

          People

          • Assignee:
            jihoonson Jihoon Son
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0
            Watchers:
            3

            Dates

            • Created:
              Updated:
              Resolved:

              Development