Spark / SPARK-44743

Reflect function behavior different from Hive


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.1
    • Fix Version/s: 4.0.0
    • Component/s: PySpark, SQL
    • Labels: None

    Description

      Spark's reflect function fails if the underlying method call throws an exception, which causes the whole job to fail.

      In Hive, however, the exception is caught and null is returned. A simple test to reproduce the behavior:

      select reflect('java.net.URLDecoder', 'decode', '%') 

      The workaround would be to wrap this call in a try block:
      https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala#L136

      We could support this by adding a new function `try_reflect` that mimics Hive's behavior. Please share your thoughts on this.
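      To illustrate the proposed semantics, here is a minimal sketch in plain Java of what a Hive-style `try_reflect` would do: invoke the named static method reflectively, and return null instead of propagating any exception. The class and method names (`TryReflect`, `tryReflect`) are hypothetical and not part of Spark's implementation.

      ```java
      import java.lang.reflect.Method;

      public class TryReflect {
          // Sketch of Hive-style reflect(): swallow exceptions and return null
          static String tryReflect(String className, String methodName, String arg) {
              try {
                  Method m = Class.forName(className).getMethod(methodName, String.class);
                  return (String) m.invoke(null, arg); // static method, so receiver is null
              } catch (Exception e) {
                  // Hive catches the failure and yields NULL instead of failing the query
                  return null;
              }
          }

          public static void main(String[] args) {
              // Valid input decodes normally
              System.out.println(tryReflect("java.net.URLDecoder", "decode", "a%20b"));
              // '%' alone is malformed: URLDecoder.decode throws, so we get null
              System.out.println(tryReflect("java.net.URLDecoder", "decode", "%"));
          }
      }
      ```

      With current Spark semantics the second call would instead raise the wrapped IllegalArgumentException and fail the job.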

      Attachments

        Activity


          People

            Assignee: Jia Fan
            Reporter: Nikhil Goyal
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:
