Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-7688

PySpark + ipython throws port out of range exception

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Not A Problem
    • 1.4.0
    • None
    • PySpark
    • None

    Description

      Saw this error when I set PYSPARK_PYTHON to "ipython" and ran `bin/pyspark`:

      Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
      : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: port out of range:458964785
      	at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
      	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:185)
      	at java.net.Socket.<init>(Socket.java:241)
      	at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
      	at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
      	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
      	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
      	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:130)
      	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:72)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
      	at org.apache.spark.scheduler.Task.run(Task.scala:70)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:744)
      
      Driver stacktrace:
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
      	at scala.Option.foreach(Option.scala:236)
      	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
      	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
      	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
      	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
      

      My env (OS X 10.10):

       ipython
      Python 2.7.9 |Anaconda 2.2.0 (x86_64)| (default, Dec 15 2014, 10:37:34)
      Type "copyright", "credits" or "license" for more information.
      
      IPython 3.0.0 -- An enhanced Interactive Python.
      

      Attachments

        Activity

          People

            davies Davies Liu
            mengxr Xiangrui Meng
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: