Description
Using Spark 0.9.0, Python 2.6.6, and IPython 1.1.0.
The problem: if I want to run a Python script as a standalone app, the docs say I should execute the command "pyspark myscript.py". This works as long as IPYTHON=0, but it does not work if IPYTHON=1.
This problem arose for me because I tried to save myself some typing by setting IPYTHON=1 in my shell profile script, which then meant I was unable to execute standalone pyspark scripts.
My analysis:
In the pyspark script, command line arguments are simply ignored when IPython is used:
if [[ "$IPYTHON" = "1" ]] ; then exec ipython $IPYTHON_OPTS else exec "$PYSPARK_PYTHON" "$@" fi
I thought I could get around this by changing the script to pass "$@" to ipython as well. However, this doesn't work: doing so results in an error saying multiple SparkContexts can't be run at once.
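For illustration, the change I tried was roughly the following (a sketch of the edited branch, not an exact diff):

if [[ "$IPYTHON" = "1" ]]; then
  # forward the script and its arguments to ipython as well
  exec ipython $IPYTHON_OPTS "$@"
else
  exec "$PYSPARK_PYTHON" "$@"
fi

It still fails, because shell.py (run via PYTHONSTARTUP, see below) has already created a SparkContext by the time the user script tries to create its own.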
This is because of a quirk (feature? bug?) in IPython's handling of the PYTHONSTARTUP environment variable. The pyspark script sets this variable to point to the python/shell.py script, which initializes the SparkContext. Regular Python runs the PYTHONSTARTUP script ONLY when it is invoked in interactive mode; when run with a script, it ignores the variable. IPython, however, runs the PYTHONSTARTUP script every time, regardless, which means it always executes Spark's shell.py to initialize the SparkContext, even when it was invoked with a script.
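The difference is easy to see with a trivial startup file (hypothetical file names, just for illustration):

echo 'print("startup file ran")' > startup.py
echo 'print("script ran")' > myscript.py

PYTHONSTARTUP=startup.py python myscript.py    # prints only "script ran"; the variable is ignored
PYTHONSTARTUP=startup.py python                # interactive: prints "startup file ran" first
PYTHONSTARTUP=startup.py ipython myscript.py   # per the behavior described above, runs startup.py and then the script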
Proposed solution:
Short term: add this information to the Spark docs regarding IPython, something like: "Note: IPython can only be used interactively. Use regular Python to execute pyspark script files."
Long term: change the pyspark script to detect whether arguments were passed in; if so, just call regular Python instead of IPython, or don't set the PYTHONSTARTUP variable. Or maybe fix shell.py to detect when it's being invoked non-interactively and not initialize sc. See the sketch below for the first option.
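A minimal sketch of that first option (assuming the surrounding launcher already defines IPYTHON, IPYTHON_OPTS, and PYSPARK_PYTHON, as the current script does):

if [[ "$IPYTHON" = "1" && $# -eq 0 ]]; then
  # interactive use: keep the current behavior; shell.py (via PYTHONSTARTUP) sets up sc
  exec ipython $IPYTHON_OPTS
else
  # a script was passed: bypass ipython (and PYTHONSTARTUP) and run it with plain python
  exec "$PYSPARK_PYTHON" "$@"
fi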