PHOENIX-4958: HBase does not load updated UDF class simultaneously on the whole cluster


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: Patch

    Description

      To update a UDF within the limitations described at https://phoenix.apache.org/udf.html, I take the following steps:

      1. Drop the existing function and JAR file:
        DROP FUNCTION my_function;
        DELETE JAR 'hdfs:/.../udf-v1.jar';
      2. Remove the JAR file from each node's local file system, e.g.:
        rm ${hbase.local.dir}/jars/udf-v1.jar
      3. Upload the updated JAR file and re-create the function:
        ADD JARS '/.../udf-v2.jar';
        CREATE FUNCTION my_function(...) ... USING JAR 'hdfs:/.../udf-v2.jar';

      The problem is that each RegionServer may keep the previously loaded function for an undefined period of time, until GC decides to collect the DynamicClassLoader instance that loaded the old UDF class. As a result, some RegionServers might execute the new function's code while others still execute the old one. There is no way to ensure that the function has been reloaded across the whole cluster.
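
      To illustrate the failure mode, here is a minimal, hypothetical Java sketch (the class and method names are invented; this is not the actual Phoenix UDFExpression code) of a class-loader cache keyed only by tenant id. Once a loader has been created for udf-v1.jar, re-creating the function against udf-v2.jar still resolves through the stale loader until the entry is evicted and the loader is garbage-collected:

        // Hypothetical sketch of the failure mode; names are invented and
        // this is not the actual Phoenix source.
        import java.io.File;
        import java.net.MalformedURLException;
        import java.net.URL;
        import java.net.URLClassLoader;
        import java.util.concurrent.ConcurrentHashMap;

        public class StaleLoaderSketch {
            // Cache keyed only by tenant id: the JAR path plays no part in the key.
            private static final ConcurrentHashMap<String, ClassLoader> LOADERS =
                    new ConcurrentHashMap<>();

            static Class<?> loadUdfClass(String tenantId, String jarPath,
                    String className) throws Exception {
                // The first call for a tenant creates a loader for udf-v1.jar;
                // every later call reuses it, even after the function has been
                // re-created against udf-v2.jar, until the entry is collected.
                ClassLoader cl = LOADERS.computeIfAbsent(tenantId,
                        t -> newJarLoader(jarPath));
                return cl.loadClass(className);
            }

            private static ClassLoader newJarLoader(String jarPath) {
                try {
                    URL jarUrl = new File(jarPath).toURI().toURL();
                    return new URLClassLoader(new URL[] { jarUrl });
                } catch (MalformedURLException e) {
                    throw new RuntimeException(e);
                }
            }
        }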

      As a proposed fix, I have updated UDFExpression to keep DynamicClassLoader instances keyed per tenant and per JAR. Since the JAR name must be changed to correctly update a UDF, this works for the described use case.
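
      A minimal sketch of the idea behind the patch (again with invented names; the attached PHOENIX-4958.patch is the authoritative change): key the loader cache by tenant id and JAR path together, so a CREATE FUNCTION pointing at a differently named JAR always gets a fresh class loader.

        // Hypothetical sketch of the proposed keying scheme, not the patch itself.
        import java.io.File;
        import java.net.MalformedURLException;
        import java.net.URL;
        import java.net.URLClassLoader;
        import java.util.concurrent.ConcurrentHashMap;

        public class PerJarLoaderSketch {
            // Composite key: tenant id plus JAR path.
            private static final ConcurrentHashMap<String, ClassLoader> LOADERS =
                    new ConcurrentHashMap<>();

            static Class<?> loadUdfClass(String tenantId, String jarPath,
                    String className) throws Exception {
                // Because the JAR name must change on every UDF update, a new
                // JAR always maps to a new key, and therefore to a new loader.
                String key = tenantId + "|" + jarPath;
                ClassLoader cl = LOADERS.computeIfAbsent(key,
                        k -> newJarLoader(jarPath));
                return cl.loadClass(className);
            }

            private static ClassLoader newJarLoader(String jarPath) {
                try {
                    URL jarUrl = new File(jarPath).toURI().toURL();
                    return new URLClassLoader(new URL[] { jarUrl });
                } catch (MalformedURLException e) {
                    throw new RuntimeException(e);
                }
            }
        }

      Under this scheme, stale loaders for dropped JARs still linger in the cache until evicted, but lookups for the new JAR name can never hit them, so every RegionServer resolves the updated class.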

      Attachments

        1. PHOENIX-4958.patch
          12 kB
          Volodymyr Kvych


          People

            Assignee: Unassigned
            Reporter: Volodymyr Kvych
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: