Description
If you set SPARK_PRINT_LAUNCH_COMMAND=1 to see what java command is being used to launch Spark and then try to run pyspark, it fails with an unhelpful error message:
Traceback (most recent call last):
File "/homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/shell.py", line 43, in <module>
sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
File "/homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/context.py", line 94, in __init__
SparkContext._ensure_initialized(self, gateway=gateway)
File "/homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/context.py", line 184, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway()
File "/homes/tgraves/test/hadoop2/y-spark-git/python/pyspark/java_gateway.py", line 51, in launch_gateway
gateway_port = int(proc.stdout.readline())
ValueError: invalid literal for int() with base 10: 'Spark Command: /home/gs/java/jdk/bin/java -cp :/home/gs/hadoop/current/share/hadoop/common/hadoop-gpl-compression.jar:/home/gs/hadoop/current/share/hadoop/hdfs/lib/YahooDNSToSwitchMapping-0.2.14020207'
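The traceback shows the root cause: launch_gateway() parses the first line of the launcher subprocess's stdout as the gateway port, but SPARK_PRINT_LAUNCH_COMMAND=1 makes the launcher print the "Spark Command: ..." diagnostic line to stdout first, so int() receives non-numeric text. A minimal sketch of the failure mode and a more tolerant parse (parse_gateway_port is a hypothetical helper, not the actual PySpark code):

```python
# Hypothetical sketch, not the real launch_gateway() implementation:
# skip non-numeric diagnostic lines until an integer port appears.
def parse_gateway_port(lines):
    for line in lines:
        line = line.strip()
        if line.isdigit():  # tolerate extra output like "Spark Command: ..."
            return int(line)
    raise ValueError("no gateway port found in launcher output")

# Simulated launcher stdout with SPARK_PRINT_LAUNCH_COMMAND=1 set
# (paths abbreviated; the port value is made up for illustration):
output = [
    "Spark Command: /usr/bin/java -cp ...",
    "========================================",
    "50123",
]
print(parse_gateway_port(output))  # -> 50123
```

The existing code (gateway_port = int(proc.stdout.readline())) has no such tolerance, which is why the raw "Spark Command: ..." line surfaces in the ValueError.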
Issue Links
- is related to SPARK-2313 PySpark should accept port via a command line argument rather than STDIN (Resolved)