SPARK-31312

Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.5, 3.0.0
    • Fix Version/s: 2.4.6, 3.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      In SPARK-26560, we ensured that a Hive UDF loaded via USING JAR is executed regardless of the current thread context classloader.

      Wenchen Fan pointed out another potential issue in post-review of SPARK-26560 - quoting the comment:

      Found a potential problem: here we call HiveSimpleUDF.dataType (which is a lazy val) to force loading the class with the corrected class loader.

      However, if the expression gets transformed later, which copies HiveSimpleUDF, then calling HiveSimpleUDF.dataType will re-trigger the class loading, and at that time there is no guarantee that the corrected classloader is used.

      I think we should materialize the loaded class in HiveSimpleUDF.

      This JIRA issue tracks the effort to verify the potential issue and fix it.
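
      The concern in the quoted comment can be illustrated with a small, language-neutral model. This is only a sketch in Python, not Spark code: `context_loader`, `load_class`, `LazyUdf`, `MaterializedUdf`, and `run_demo` are hypothetical names, and the "loader" is just a string standing in for the thread context classloader. In Spark, the second lookup under the wrong loader would surface as a ClassNotFoundException; here the model only records which loader was active.

```python
# Stand-in for the thread context classloader; purely illustrative.
context_loader = "corrected"
load_log = []  # which loader was active at each class lookup


def load_class(name):
    """Hypothetical class lookup that depends on the ambient loader."""
    load_log.append(context_loader)
    return f"{name}@{context_loader}"


class LazyUdf:
    """Resolves the class on first access, similar to a lazy val."""

    def __init__(self, class_name):
        self.class_name = class_name
        self._resolved = None

    @property
    def data_type(self):
        if self._resolved is None:
            self._resolved = load_class(self.class_name)
        return self._resolved

    def copy(self):
        # A transformation rebuilds the node from its constructor
        # arguments only, so the cached lookup result is lost.
        return LazyUdf(self.class_name)


class MaterializedUdf:
    """Resolves the class once and carries the result through copies."""

    def __init__(self, class_name, resolved=None):
        self.class_name = class_name
        self.resolved = resolved if resolved is not None else load_class(class_name)

    @property
    def data_type(self):
        return self.resolved

    def copy(self):
        return MaterializedUdf(self.class_name, self.resolved)


def run_demo(udf_cls):
    """Force loading under the corrected loader, then copy and
    evaluate under a different loader."""
    global context_loader
    load_log.clear()
    context_loader = "corrected"
    udf = udf_cls("com.example.MyUdf")
    udf.data_type              # forced while the corrected loader is active
    context_loader = "other"   # later evaluation runs under another loader
    return udf.copy().data_type, list(load_log)
```

      In this model the lazy variant performs a second lookup after the copy, under whatever loader happens to be active, while the materialized variant performs exactly one lookup. Whether the real HiveSimpleUDF behaves this way is what this issue sets out to verify.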


            People

            • Assignee: Jungtaek Lim (kabhwan)
            • Reporter: Jungtaek Lim (kabhwan)
