IMPALA / IMPALA-9271

Provide UDF framework corresponding to Hive's GenericUDF

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Backend, Frontend
    • Labels: None

    Description

      Hive GenericUDFs are more capable than normal UDFs in the following ways:

      1. They can accept arguments of complex types and return complex types.
      2. They can accept a variable number of arguments.
      3. They can accept an unbounded number of function signatures. For example, it is easy to write a GenericUDF that accepts array<int>, array<array<int>>, and so on (arbitrary levels of nesting).
      4. They can do short-circuit evaluation using DeferredObject. Arguments can be of any type, and they can be evaluated lazily.
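To illustrate point 4, here is a minimal C++ sketch of deferred arguments. `DeferredArg` and `CoalesceLazy` are hypothetical names invented for this example; they only imitate the idea behind Hive's DeferredObject and are not part of Hive or Impala. Each argument is wrapped in a callable that runs only when the function asks for its value, so a COALESCE-style function can stop at the first non-NULL argument.

```cpp
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a deferred argument: the wrapped expression is
// evaluated only when Get() is called. std::nullopt models SQL NULL.
class DeferredArg {
 public:
  explicit DeferredArg(std::function<std::optional<int>()> thunk)
      : thunk_(std::move(thunk)) {}

  std::optional<int> Get() const { return thunk_(); }

 private:
  std::function<std::optional<int>()> thunk_;
};

// COALESCE-style function: returns the first non-NULL argument.
// Arguments after the first non-NULL one are never evaluated.
std::optional<int> CoalesceLazy(const std::vector<DeferredArg>& args) {
  for (const DeferredArg& arg : args) {
    std::optional<int> v = arg.Get();
    if (v.has_value()) return v;
  }
  return std::nullopt;
}
```

With eager evaluation every argument expression would be computed before the call; here the cost of an argument is paid only if the function actually reads it.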

      The masking functions added for Ranger column masking are important examples of GenericUDFs. For instance, there are hundreds of ways to call mask_show_first_n:

         mask_show_first_n(val)
         mask_show_first_n(val, 8)
         mask_show_first_n(val, 8, 'X', 'x', 'n')
         mask_show_first_n(val, 8, 'x', 'x', 'x', 'x', -1)
         mask_show_first_n(val, 8, 'x', -1, 'x', 'x', '9')
         ...

      Without such a framework, we have to implement hundreds of overloads to cover all possible combinations.
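As a rough illustration of how one variadic entry point could replace those overloads, the C++ sketch below takes the optional arguments as a list of typed values. Everything here is an assumption made for the example: the `Arg` variant, the `MaskShowFirstN` and `ReplacementAt` helpers, the default values (4, 'X', 'x', 'n'), and the reading of -1 as "leave this character class unmasked". It handles ASCII only, ignores any arguments past the fifth, and is not the Hive or Impala implementation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical argument value: an integer or a string literal.
using Arg = std::variant<int, std::string>;

// Assumed sentinel meaning "do not mask this character class".
constexpr int kUnmasked = -1;

// Resolves an optional replacement-character argument. A missing argument
// (or an empty string) yields the default, a string contributes its first
// character, and an integer is used as-is (so -1 means unmasked).
int ReplacementAt(const std::vector<Arg>& args, std::size_t idx, int dflt) {
  if (idx >= args.size()) return dflt;
  if (const auto* s = std::get_if<std::string>(&args[idx])) {
    return s->empty() ? dflt : static_cast<unsigned char>((*s)[0]);
  }
  return std::get<int>(args[idx]);
}

// Single entry point for every call shape: args holds whatever followed
// `val` in the SQL call, in order.
std::string MaskShowFirstN(const std::string& val, const std::vector<Arg>& args) {
  int n = 4;  // assumed default number of leading characters to show
  if (!args.empty()) {
    if (const int* p = std::get_if<int>(&args[0])) n = *p;
  }
  const int upper = ReplacementAt(args, 1, 'X');
  const int lower = ReplacementAt(args, 2, 'x');
  const int digit = ReplacementAt(args, 3, 'n');
  const int other = ReplacementAt(args, 4, kUnmasked);

  std::string out = val;
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Leave the first n characters untouched.
    if (n > 0 && i < static_cast<std::size_t>(n)) continue;
    const unsigned char c = static_cast<unsigned char>(out[i]);
    int repl = other;
    if (std::isupper(c)) repl = upper;
    else if (std::islower(c)) repl = lower;
    else if (std::isdigit(c)) repl = digit;
    if (repl != kUnmasked) out[i] = static_cast<char>(repl);
  }
  return out;
}
```

Because each optional argument is inspected at run time, a character argument and the integer -1 can share the same position without a separate overload for each combination.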

      Currently we don't support complex types in UDF arguments or return types, so we should at least provide a framework that supports UDFs with the following properties:

      1. They can accept a variable number of arguments.
      2. Arguments can be of any type. Their actual values are extracted inside the UDF (lazy evaluation).

      For item 2, adding a field to impala_udf::AnyVal that reflects the actual type may be enough.
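One possible shape for that idea is sketched below in C++. The types live in a `sketch` namespace and are simplified stand-ins, not the real impala_udf classes: the `type` field and the `ValType` enum are the hypothetical addition, `StringVal` uses std::string for brevity, and `SumNumeric` is an invented example function. The point is only that a UDF receiving an array of base-class pointers could inspect the tag and extract each value itself.

```cpp
#include <cstdint>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical runtime type tag.
enum class ValType { kInt, kDouble, kString };

// Simplified stand-in for impala_udf::AnyVal: the is_null flag plus the
// proposed (hypothetical) field recording the actual type.
struct AnyVal {
  bool is_null;
  ValType type;
  AnyVal(bool null_flag, ValType t) : is_null(null_flag), type(t) {}
};

struct IntVal : AnyVal {
  std::int32_t val;
  explicit IntVal(std::int32_t v) : AnyVal(false, ValType::kInt), val(v) {}
};

struct DoubleVal : AnyVal {
  double val;
  explicit DoubleVal(double v) : AnyVal(false, ValType::kDouble), val(v) {}
};

struct StringVal : AnyVal {
  std::string val;
  explicit StringVal(std::string v)
      : AnyVal(false, ValType::kString), val(std::move(v)) {}
};

// Example generic UDF body: accepts any number of arguments of any type and
// extracts each value by checking the tag. Returns false (leaving *result
// untouched) if an argument is NULL or not numeric.
bool SumNumeric(int num_args, const AnyVal* const* args, double* result) {
  double sum = 0.0;
  for (int i = 0; i < num_args; ++i) {
    const AnyVal& a = *args[i];
    if (a.is_null) return false;
    switch (a.type) {
      case ValType::kInt:
        sum += static_cast<const IntVal&>(a).val;
        break;
      case ValType::kDouble:
        sum += static_cast<const DoubleVal&>(a).val;
        break;
      default:
        return false;
    }
  }
  *result = sum;
  return true;
}

}  // namespace sketch
```

A single signature of this form covers every combination of argument count and type, at the cost of moving type checks from function resolution into the UDF body.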

            People

              Assignee: Unassigned
              Reporter: Quanlong Huang (stigahuang)