Spark / SPARK-671

Spark runs out of memory on fork/exec (affects both pipes and python)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0, 0.5.1, 0.5.2, 0.6.0, 0.6.1
    • Fix Version/s: 0.7.3, 0.8.0
    • Component/s: PySpark, Spark Core
    • Labels: None

    Description

      Because the JVM uses fork/exec to launch child processes, every child process initially has the memory footprint of its parent, until the exec replaces the process image. When a large Spark JVM spawns many child processes (for Pipe or Python support), the kernel quickly runs out of memory it is willing to commit and the fork fails.
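
To get a feel for how much address space a fork() of the JVM would have to account for, one can look at the VmSize field in /proc/<pid>/status on Linux. The following is a minimal sketch in Python, not part of Spark; the function names vm_size_kb and read_vm_size_kb are hypothetical helpers introduced only for illustration.

```python
import re


def vm_size_kb(status_text):
    """Return the VmSize value (in kB) found in the text of
    /proc/<pid>/status, or None if the field is missing."""
    match = re.search(r"^VmSize:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_size_kb(pid="self"):
    """Read VmSize for a process on Linux.

    pid may be the string 'self' or a numeric PID, for example the PID
    of a Spark worker JVM (assumption: the caller may read that file).
    """
    with open(f"/proc/{pid}/status") as status_file:
        return vm_size_kb(status_file.read())
```

Calling read_vm_size_kb(<pid of the worker JVM>) on a slave shows the virtual size that each fork would nominally duplicate; with several tasks forking at once, that figure is multiplied accordingly.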

      This problem is discussed here:
      https://gist.github.com/1970815

      It results in errors like this:

      13/01/31 20:18:48 INFO cluster.TaskSetManager: Loss was due to java.io.IOException: Cannot run program "cat": java.io.IOException: error=12, Cannot allocate memory
             at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
             at spark.rdd.PipedRDD.compute(PipedRDD.scala:38)
             at spark.RDD.computeOrReadCheckpoint(RDD.scala:203)
             at spark.RDD.iterator(RDD.scala:192)
             at spark.scheduler.ResultTask.run(ResultTask.scala:76)
      

      I was able to work around this by allowing the kernel to overcommit memory on all slaves,

      echo 1 > /proc/sys/vm/overcommit_memory
      

      but we should try to include a more robust solution, such as the one here:
      https://github.com/axiak/java_posix_spawn
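
To illustrate the general idea behind a posix_spawn-based launcher, here is a minimal sketch in Python. It is not the API of the java_posix_spawn library linked above and not the fix that went into Spark; the function name spawn_and_pipe is hypothetical. It relies on os.posix_spawnp (Python 3.8+), which asks the C library to start the child directly. On common Linux C libraries this is typically done without duplicating the parent's address-space accounting the way a plain fork() does, though that is an implementation detail worth verifying on the target platform.

```python
import os


def spawn_and_pipe(argv, input_bytes):
    """Run argv via posix_spawn, write input_bytes to its stdin, and
    return (exit_code, stdout_bytes).

    Sketch only: the input is written in one call before any output is
    read, so this is suitable for small inputs that fit in a pipe buffer.
    """
    in_r, in_w = os.pipe()    # parent writes in_w, child reads in_r
    out_r, out_w = os.pipe()  # child writes out_w, parent reads out_r

    # Actions performed in the child before the program is executed.
    file_actions = [
        (os.POSIX_SPAWN_DUP2, in_r, 0),   # child's stdin  <- in_r
        (os.POSIX_SPAWN_DUP2, out_w, 1),  # child's stdout <- out_w
        (os.POSIX_SPAWN_CLOSE, in_w),     # child must not hold the write end
        (os.POSIX_SPAWN_CLOSE, out_r),    # child does not need the read end
    ]
    pid = os.posix_spawnp(argv[0], argv, os.environ, file_actions=file_actions)

    # The parent closes the ends that now belong to the child.
    os.close(in_r)
    os.close(out_w)

    os.write(in_w, input_bytes)
    os.close(in_w)  # signals end-of-input to the child

    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)

    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status), b"".join(chunks)
    return -os.WTERMSIG(status), b"".join(chunks)
```

For example, spawn_and_pipe(["cat"], b"hello\n") runs the same "cat" command that fails in the stack trace above and returns its exit code together with the echoed bytes. A real replacement for PipedRDD would need to stream input and output concurrently rather than writing everything up front.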


          People

            Assignee: Jey Kottalam
            Reporter: Patrick McFadin
            Votes: 0
            Watchers: 5
