Spark / SPARK-27786

SHA1, MD5, and Base64 expression codegen doesn't work when commons-codec is shaded


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      When running a custom build of Spark that shades commons-codec, the sha1Hex expression generates code that fails to compile:

      org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 47, Column 93: A method named "sha1Hex" is not declared in any enclosing class nor any supertype, nor through a static import
      

      This is caused by an interaction between Spark's code generator and the shading: the current code generator embeds the string "org.apache.commons.codec.digest.DigestUtils.sha1Hex" in a larger codegen template, which prevents JarJarLinks from replacing it with the shaded class's name. The generated code therefore uses the unshaded name, but the unshaded dependency isn't on our classpath, which triggers the compilation error above.

      To fix this problem and allow for proper shading, we can replace the hardcoded string literal with classOf[DigestUtils].getName, which is a real class reference that the shading tool can relocate.
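
      The difference between the two approaches can be sketched as follows. This is a minimal illustration, not Spark's actual codegen: the object name ShadingSafeCodegen and the methods hardcoded and shadingSafe are hypothetical, and it assumes commons-codec is on the classpath.

      ```scala
      import org.apache.commons.codec.digest.DigestUtils

      // Hypothetical stand-in for the part of an expression that emits Java source.
      object ShadingSafeCodegen {

        // Fragile: the class name is buried inside a larger string constant,
        // so a shading tool that rewrites class references will not touch it.
        def hardcoded(input: String): String =
          s"org.apache.commons.codec.digest.DigestUtils.sha1Hex($input)"

        // Shading-friendly: classOf[DigestUtils] compiles to a class reference.
        // After relocation, getName returns the shaded name at runtime.
        def shadingSafe(input: String): String = {
          val digestUtils = classOf[DigestUtils].getName
          s"$digestUtils.sha1Hex($input)"
        }

        def main(args: Array[String]): Unit = {
          println(hardcoded("value1"))
          println(shadingSafe("value1"))
        }
      }
      ```

      In an unshaded build both methods return the same string. In a shaded build only shadingSafe would be expected to pick up the relocated package name.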

            People

            • Assignee: Josh Rosen (joshrosen)
            • Reporter: Josh Rosen (joshrosen)
