SPARK-45996

Show proper dependency requirement messages for Spark Connect


Details

    Description

      ./bin/pyspark --remote local
      

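      We should improve the error messages below. When a package required by Spark Connect is not installed, the shell fails with a raw ModuleNotFoundError traceback that does not say which package to install.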

      /.../pyspark/shell.py:57: UserWarning: Failed to initialize Spark session.
        warnings.warn("Failed to initialize Spark session.")
      Traceback (most recent call last):
        File "/.../pyspark/shell.py", line 52, in <module>
          spark = SparkSession.builder.getOrCreate()
        File "/.../pyspark/sql/session.py", line 476, in getOrCreate
          from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
        File "/.../pyspark/sql/connect/session.py", line 53, in <module>
          from pyspark.sql.connect.client import SparkConnectClient, ChannelBuilder
        File "/.../pyspark/sql/connect/client/__init__.py", line 22, in <module>
          from pyspark.sql.connect.client.core import *  # noqa: F401,F403
        File "/.../pyspark/sql/connect/client/core.py", line 51, in <module>
          import google.protobuf.message
      ModuleNotFoundError: No module named 'google'
      
      /.../pyspark/shell.py:57: UserWarning: Failed to initialize Spark session.
        warnings.warn("Failed to initialize Spark session.")
      Traceback (most recent call last):
        File "/.../pyspark/shell.py", line 52, in <module>
          spark = SparkSession.builder.getOrCreate()
        File "/.../pyspark/sql/session.py", line 476, in getOrCreate
          from pyspark.sql.connect.session import SparkSession as RemoteSparkSession
        File "/.../pyspark/sql/connect/session.py", line 53, in <module>
          from pyspark.sql.connect.client import SparkConnectClient, ChannelBuilder
        File "/.../pyspark/sql/connect/client/__init__.py", line 22, in <module>
          from pyspark.sql.connect.client.core import *  # noqa: F401,F403
        File "/.../pyspark/sql/connect/client/core.py", line 52, in <module>
          from grpc_status import rpc_status
      ModuleNotFoundError: No module named 'grpc_status'
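
One possible shape for a clearer message is sketched below. This is only an illustration, not the change made for this issue: the function names `find_missing_packages` and `check_connect_dependencies` and the `REQUIRED_MODULES` mapping are hypothetical, and the module-to-package mapping (`google.protobuf` from `protobuf`, `grpc` from `grpcio`, `grpc_status` from `grpcio-status`) is an assumption about which dependencies matter here. The idea is to probe for the modules up front and raise a single ImportError that names the missing packages, instead of letting the first failing import surface on its own.

```python
import importlib.util

# Hypothetical mapping: importable module name -> pip package that provides it.
REQUIRED_MODULES = {
    "google.protobuf": "protobuf",
    "grpc": "grpcio",
    "grpc_status": "grpcio-status",
}


def find_missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose modules cannot be found."""
    missing = []
    for module_name, pip_name in required.items():
        try:
            # find_spec raises ModuleNotFoundError (an ImportError) when the
            # parent package of a dotted name, e.g. "google", is absent.
            spec = importlib.util.find_spec(module_name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(pip_name)
    return missing


def check_connect_dependencies(required=REQUIRED_MODULES):
    """Raise one ImportError listing every missing package (hypothetical helper)."""
    missing = find_missing_packages(required)
    if missing:
        raise ImportError(
            "Spark Connect requires the following packages, which were not found: "
            + ", ".join(missing)
            + ". Install them with: pip install "
            + " ".join(missing)
        )
```

Calling `check_connect_dependencies()` before the Spark Connect client modules are imported would report all missing packages in one message rather than one traceback per missing module.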
      

          People

            gurwls223 Hyukjin Kwon