HIVE-45

Hive: GenericUDF and support of complex object

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Query Processor
    • Labels: None

      Description

      A GenericUDF is more powerful than a UDF in the following ways:
      1. It can accept arguments of complex types and return complex types.
      2. It can accept a variable number of arguments.
      3. It can accept an unbounded number of function signatures. For example, it is easy to write a GenericUDF that accepts array<int>, array<array<int>>, and so on (arbitrary levels of nesting).
      4. It can do short-circuit evaluation.
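
The first three points can be illustrated with a minimal, self-contained sketch in plain Java. It does not use Hive's GenericUDF interface (see HIVE-164 for that); the class name NestedSumSketch and the method sumAll are hypothetical names chosen for this example. The function takes any number of arguments, and each argument may be a number or a list nested to any depth.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; this is not Hive's GenericUDF API. */
final class NestedSumSketch {

    /** Accepts any number of arguments (point 2). */
    static long sumAll(Object... args) {
        long total = 0;
        for (Object arg : args) {
            total += sum(arg);
        }
        return total;
    }

    /**
     * Handles a number, a list of numbers, a list of lists of numbers, and so on,
     * by recursing on the structure of the value (points 1 and 3).
     */
    private static long sum(Object value) {
        if (value == null) {
            return 0; // treat NULL as contributing nothing
        }
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        if (value instanceof List<?>) {
            long total = 0;
            for (Object element : (List<?>) value) {
                total += sum(element);
            }
            return total;
        }
        throw new IllegalArgumentException(
                "Unsupported argument type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        // One scalar, one array<int>, one array<array<int>>.
        long result = sumAll(
                1,
                Arrays.asList(2, 3),
                Arrays.asList(Arrays.asList(4), Arrays.asList(5, 6)));
        System.out.println(result); // prints 21
    }
}
```

A single recursive definition covers every nesting level, which is what a fixed set of overloaded evaluate() signatures cannot do.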

          Activity

          Zheng Shao created issue -
          Zheng Shao made changes -
          Field Original Value New Value
          Component/s contrib/hive [ 12312455 ]
          Owen O'Malley made changes -
          Project Hadoop Core [ 12310240 ] Hadoop Hive [ 12310843 ]
          Key HADOOP-4229 HIVE-45
          Issue Type Improvement [ 4 ] Bug [ 1 ]
          Component/s contrib/hive [ 12312455 ]
          Assignee Zheng Shao [ zshao ]
          Ashish Thusoo made changes -
          Component/s Query Processor [ 12312586 ]
          Zheng Shao made changes -
          Assignee Zheng Shao [ zshao ]
          Zheng Shao made changes -
          Link This issue blocks HIVE-164 [ HIVE-164 ]
          Zheng Shao added a comment - - edited

          Please see HIVE-164 for the interface of GenericUDF.

          Zheng Shao added a comment -

          Some considerations for this design:

          • UDFTemplate authors will have to understand ObjectInspector. The alternative would be to pass Java Objects as parameters and return values and let authors use reflection, but then we could not pass objects that have no concrete Java class to these template functions.
          • I added the restriction "If the ObjectInspectors of the parameters do not change, then result.oi cannot change either," because we expect the result type of a template function call to stay the same as long as the parameter types do not change.
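
Both considerations can be sketched in plain Java. The names below (InspectorSketch, ListInspector, ListBacked, ArrayBacked, resultTypeOfLast, last) are hypothetical and are not Hive's classes; the sketch only assumes that an inspector knows how to navigate a piece of data, so the function body never depends on the concrete Java class of that data.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the inspector idea; these are not Hive's classes. */
final class InspectorSketch {

    /** Stand-in for an ObjectInspector that describes list-like data. */
    interface ListInspector {
        String elementType();               // type name of the elements, e.g. "int"
        int size(Object data);              // number of elements in data
        Object get(Object data, int index); // element at the given index
    }

    /** Inspector for data stored as java.util.List. */
    static final class ListBacked implements ListInspector {
        private final String elementType;
        ListBacked(String elementType) { this.elementType = elementType; }
        public String elementType() { return elementType; }
        public int size(Object data) { return ((List<?>) data).size(); }
        public Object get(Object data, int index) { return ((List<?>) data).get(index); }
    }

    /** Inspector for data stored as Object[]. */
    static final class ArrayBacked implements ListInspector {
        private final String elementType;
        ArrayBacked(String elementType) { this.elementType = elementType; }
        public String elementType() { return elementType; }
        public int size(Object data) { return ((Object[]) data).length; }
        public Object get(Object data, int index) { return ((Object[]) data)[index]; }
    }

    /** Type resolution: the result type depends only on the argument inspector. */
    static String resultTypeOfLast(ListInspector arg) {
        return arg.elementType();
    }

    /** Per-row evaluation: the data is read only through the inspector. */
    static Object last(Object data, ListInspector arg) {
        int n = arg.size(data);
        return n == 0 ? null : arg.get(data, n - 1);
    }

    public static void main(String[] args) {
        ListInspector listOi = new ListBacked("int");
        ListInspector arrayOi = new ArrayBacked("int");
        System.out.println(resultTypeOfLast(listOi));              // prints int
        System.out.println(last(Arrays.asList(1, 2, 3), listOi));  // prints 3
        System.out.println(last(new Object[] {4, 5, 6}, arrayOi)); // prints 6
    }
}
```

Because resultTypeOfLast looks only at the inspector, the same parameter inspectors always yield the same result type, which is the restriction described in the second bullet.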
          Zheng Shao added a comment -

          I plan to add a new package, ql.udf.template, for this.
          CASE and IF will be implemented as subclasses of UDFTemplate.

          This will also allow functions with a dynamic number of parameters.
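
As a rough illustration of why IF and CASE fit this design, the sketch below passes each argument as a deferred computation so that only the selected branch is evaluated. It is plain Java, not the UDFTemplate or GenericUDF interface; the class ShortCircuitSketch, the methods ifThenElse and caseWhen, and the use of java.util.function.Supplier as the deferred-argument type are assumptions made for this example.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of short-circuit functions; not Hive's API. */
final class ShortCircuitSketch {

    /** IF(condition, thenValue, elseValue): only the selected branch is evaluated. */
    @SafeVarargs
    static Object ifThenElse(Supplier<Object>... args) {
        if (args.length != 3) {
            throw new IllegalArgumentException("IF expects exactly 3 arguments");
        }
        return Boolean.TRUE.equals(args[0].get()) ? args[1].get() : args[2].get();
    }

    /**
     * CASE WHEN c1 THEN v1 [WHEN c2 THEN v2 ...] [ELSE d] END.
     * Arguments are (c1, v1, c2, v2, ..., [d]); the count is not fixed.
     */
    @SafeVarargs
    static Object caseWhen(Supplier<Object>... args) {
        int i = 0;
        for (; i + 1 < args.length; i += 2) {
            if (Boolean.TRUE.equals(args[i].get())) {
                return args[i + 1].get(); // later conditions are never evaluated
            }
        }
        // A trailing unpaired argument is the ELSE value; otherwise the result is NULL.
        return i < args.length ? args[i].get() : null;
    }

    public static void main(String[] args) {
        Object chosen = ifThenElse(
                () -> true,
                () -> "then",
                () -> { throw new IllegalStateException("never evaluated"); });
        System.out.println(chosen); // prints then

        Object matched = caseWhen(() -> false, () -> "a", () -> true, () -> "b", () -> "c");
        System.out.println(matched); // prints b
    }
}
```

The else branch in the first call throws if it is ever evaluated, so the program completing normally shows that the evaluation was short-circuited.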

          Zheng Shao made changes -
          Summary
            Original Value: Hive: UDFTemplate and support of complex object
            New Value: Hive: GenericUDF and support of complex object
          Description
            Original Value:
            We should allow users to define UDF templates (in the sense of C++ templates, which can take arguments of different types), and let UDFs take complex objects.
            This should be pretty simple, given that we have the ObjectInspector, which can navigate through the internal structure of the object.
            New Value:
            A GenericUDF is more powerful than a UDF in the following ways:
            1. It can accept arguments of complex types and return complex types.
            2. It can accept a variable number of arguments.
            3. It can accept an unbounded number of function signatures. For example, it is easy to write a GenericUDF that accepts array<int>, array<array<int>>, and so on (arbitrary levels of nesting).
            4. It can do short-circuit evaluation.
          Zheng Shao added a comment -

          This issue is fixed as part of HIVE-164.

          Zheng Shao made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 0.4.0 [ 12313714 ]
          Resolution Fixed [ 1 ]
          Zheng Shao made changes -
          Link This issue blocks HIVE-470 [ HIVE-470 ]
          Edward Capriolo added a comment -

          I just put in a Jira: https://issues.apache.org/jira/browse/HIVE-471. I realize this Jira adds support for some, if not all, of the things I was looking to do.

          Without getting overly generic, I would have liked to support:

          evaluate(String, String, Object[])

          As well as:

          evaluate(evaluate(String, String, Object[]), "getName", "this", "that")

          It seems like what you are doing here makes that possible.
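
One possible reading of these two signatures is a reflection-based call, sketched below in plain Java. The comment does not say what the two String parameters mean, so this sketch assumes they are a class name and a static method name, and that the nested form invokes a named method on the result of the inner call. The names ReflectSketch, invokeStatic, and invokeOn are hypothetical, and the method lookup is deliberately naive: it only finds methods whose declared parameter types equal the runtime classes of the arguments.

```java
import java.lang.reflect.Method;

/** Hypothetical reflection sketch; the parameter meanings are assumptions. */
final class ReflectSketch {

    /** Calls a public static method, e.g. invokeStatic("java.lang.Integer", "valueOf", {"42"}). */
    static Object invokeStatic(String className, String methodName, Object[] args) {
        try {
            Method method = Class.forName(className).getMethod(methodName, typesOf(args));
            return method.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Cannot call " + className + "." + methodName, e);
        }
    }

    /** Calls a public instance method on the given target object. */
    static Object invokeOn(Object target, String methodName, Object... args) {
        try {
            Method method = target.getClass().getMethod(methodName, typesOf(args));
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot call " + methodName, e);
        }
    }

    /** Naive lookup key: the exact runtime class of each (non-null) argument. */
    private static Class<?>[] typesOf(Object[] args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        return types;
    }

    public static void main(String[] args) {
        // Nested form: call a method on the result of another call.
        Object number = invokeStatic("java.lang.Integer", "valueOf", new Object[] {"42"});
        System.out.println(invokeOn(number, "toString"));          // prints 42
        System.out.println(invokeOn("hello", "concat", " world")); // prints hello world
    }
}
```

A function like this needs a variable number of arguments of arbitrary types, which is the kind of signature a GenericUDF can accept and a fixed-signature UDF cannot.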

          Carl Steinbach made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Link This issue blocks HIVE-164 [ HIVE-164 ]
          Gavin made changes -
          Link This issue is depended upon by HIVE-164 [ HIVE-164 ]
          Gavin made changes -
          Link This issue blocks HIVE-470 [ HIVE-470 ]
          Gavin made changes -
          Link This issue is depended upon by HIVE-470 [ HIVE-470 ]
          Transition           Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open -> Resolved     226d 23h 9m             1                 Zheng Shao       05/May/09 01:12
          Resolved -> Closed   955d 22h 54m            1                 Carl Steinbach   17/Dec/11 00:07

            People

            • Assignee: Zheng Shao
            • Reporter: Zheng Shao
            • Votes: 0
            • Watchers: 3
