Spark / SPARK-3772

RDD operation on IPython REPL failed with an illegal port number

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: PySpark
    • Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0

    Description

      To reproduce this issue, run the following commands on commit 6e27cb630de69fa5acb510b4e2f6b980742b1957:

      $ PYSPARK_PYTHON=ipython ./bin/pyspark
      ...
      In [1]: file = sc.textFile('README.md')
      In [2]: file.first()
      ...
      14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
      14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
      14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334
      14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) with 1 output partitions (allowLocal=true)
      14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:334)
      14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
      14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
      14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44), which has no missing parents
      14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with curMem=57388, maxMem=278019440
      14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 265.1 MB)
      14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44)
      14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
      14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1207 bytes)
      14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
      14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
      java.lang.IllegalArgumentException: port out of range:1027423549
      at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
      at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
      at java.net.Socket.<init>(Socket.java:244)
      at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
      at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
      at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
      at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
      at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
      at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
      at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
      at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
      at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
      at org.apache.spark.scheduler.Task.run(Task.scala:56)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:744)
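
The port in the exception, 1027423549, is far above the TCP maximum of 65535. The sketch below only inspects that number. It assumes the JVM obtains the worker daemon's port as a 4-byte big-endian integer read from the daemon's output; that is an assumption suggested by the `createThroughDaemon` frame in the trace, not something this report states. Under that assumption, unpacking the value into its four bytes shows what may have been read in place of a real port.

```python
import struct

# Port value copied from the IllegalArgumentException in the log above.
bad_port = 1027423549

# Valid TCP ports fit in an unsigned 16-bit integer (0..65535).
MAX_PORT = 0xFFFF
is_valid = 0 <= bad_port <= MAX_PORT

# Assumption: the value was read as a 4-byte big-endian signed int.
# Packing it the same way recovers the raw bytes that were consumed.
raw = struct.pack(">i", bad_port)
hex_repr = hex(bad_port)

print(is_valid)   # False
print(hex_repr)   # 0x3d3d3d3d
print(raw)        # b'====' on Python 3
```

All four bytes are 0x3d, the ASCII code for `=`. That would be consistent with text printed on standard output (for example a banner or separator line) being consumed where the port number was expected, although the report itself does not confirm this as the cause.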

          People

            Assignee: Unassigned
            Reporter: Tomohiko K. (cocoatomo)