Spark / SPARK-23517

Make pyspark.util._exception_message produce the trace from Java side for Py4JJavaError

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.1
    • Component/s: PySpark
    • Labels:
      None

      Description

      Currently, pyspark.util._exception_message does not return the message or the Java-side stack trace for a Py4JJavaError, as shown below:

      >>> from pyspark.util import _exception_message
      >>> try:
      ...     sc._jvm.java.lang.String(None)
      ... except Exception as e:
      ...     pass
      ...
      >>> e.message
      ''
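
The empty result can be illustrated without a running JVM. This is a rough sketch only: StandInJavaError is a hypothetical stand-in (not the py4j class), its empty `message` attribute is hard-coded to mirror the session above, the trace text is a placeholder, and message_attribute_first is an assumed shape for a helper that prefers the `message` attribute.

```python
class StandInJavaError(Exception):
    """Hypothetical stand-in for Py4JJavaError (not the py4j class)."""

    # Assumption: hard-coded to mirror the empty `message` seen above.
    message = ""

    def __init__(self, errmsg, java_trace):
        super().__init__(errmsg, java_trace)
        self.errmsg = errmsg
        self.java_trace = java_trace

    def __str__(self):
        # The useful text is only assembled when the error is stringified.
        return "%s: %s" % (self.errmsg, self.java_trace)


def message_attribute_first(excp):
    """Assumed helper shape: prefer `message`, fall back to str()."""
    if hasattr(excp, "message"):
        return excp.message
    return str(excp)


# Placeholder strings, not real Spark or JVM output.
err = StandInJavaError("placeholder error summary", "placeholder Java-side trace")

from_attribute = message_attribute_first(err)  # the empty attribute wins
from_str = str(err)                            # carries the Java-side text
```

Under these assumptions, any helper that checks `message` first never reaches the string form, which is where the Java-side details live.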
      

      This is a real problem in code paths where this error can be expected. For example:

      from pyspark.sql.functions import udf
      spark.conf.set("spark.sql.execution.arrow.enabled", True)
      spark.range(1).select(udf(lambda x: [[]])()).toPandas()
      
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/.../spark/python/pyspark/sql/dataframe.py", line 2009, in toPandas
          raise RuntimeError("%s\n%s" % (_exception_message(e), msg))
      RuntimeError:
      Note: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is set to true. Please set it to false to disable this.
      

            People

            • Assignee: Hyukjin Kwon (hyukjin.kwon)
            • Reporter: Hyukjin Kwon (hyukjin.kwon)
            • Votes: 0
            • Watchers: 2
