SPARK-12759

Spark should fail fast if --executor-memory is too small for Spark to start


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.0.0
    • Component/s: Spark Shell
    • Labels: None

    Description

      With the UnifiedMemoryManager, the minimum memory for executor and driver JVMs was increased to 450MB. There is code in UnifiedMemoryManager to provide a helpful warning if less than that much memory is provided.
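A minimal sketch of the kind of check described above, written in Python for illustration rather than taken from Spark. Only the 450MB figure comes from this issue; the helper name `check_jvm_memory`, the message wording, and the choice to raise rather than log are assumptions.

```python
MIB = 1024 * 1024
# The 450MB minimum quoted in this issue, expressed in bytes.
MIN_SYSTEM_MEMORY_BYTES = 450 * MIB


def check_jvm_memory(max_heap_bytes: int) -> None:
    """Raise if the available heap is below the minimum (hypothetical helper)."""
    if max_heap_bytes < MIN_SYSTEM_MEMORY_BYTES:
        raise ValueError(
            f"System memory {max_heap_bytes} must be at least "
            f"{MIN_SYSTEM_MEMORY_BYTES}. Please increase the heap size."
        )
```

Because a check like this runs inside the JVM being started, its message only reaches whoever reads that process's own log.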

      However, if you set --executor-memory to something less than that, from the driver process you just see executor failures with no warning, since the more meaningful errors are buried in the executor logs. For example, on YARN you see:

      16/01/11 13:59:32 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1452548703600_0001_01_000002 on host: imran-adhoc-2.vpc.cloudera.com. Exit status: 1. Diagnostics: Exception from container-launch.
      Container id: container_1452548703600_0001_01_000002
      Exit code: 1
      Stack trace: ExitCodeException exitCode=1: 
      	at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
      	at org.apache.hadoop.util.Shell.run(Shell.java:478)
      	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
      	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
      	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      
      
      Container exited with a non-zero exit code 1
      

      There is already a message from UnifiedMemoryManager if there isn't enough memory for the driver, but as long as this is being changed, it would be nice if that message more clearly pointed to the --driver-memory configuration as well.
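The fail-fast behaviour requested here could look roughly like the following Python sketch, which validates both memory settings before anything is launched and names the offending flag in the error. It is an illustration under stated assumptions, not the fix that was merged: `parse_memory`, `validate_memory_flags`, the accepted suffixes, and the message text are all hypothetical.

```python
import re

MIN_MEMORY_BYTES = 450 * 1024 * 1024  # 450MB minimum from the description

# Suffix multipliers for strings such as "256m" or "1g" (binary units assumed).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_memory(value: str) -> int:
    """Convert a size string like '512m' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", value.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognized memory size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def validate_memory_flags(flags: dict) -> None:
    """Fail before launch if a memory flag is below the minimum."""
    for flag in ("--driver-memory", "--executor-memory"):
        if flag not in flags:
            continue  # unset flags fall back to defaults, not checked here
        requested = parse_memory(flags[flag])
        if requested < MIN_MEMORY_BYTES:
            raise ValueError(
                f"{flag} is {flags[flag]} ({requested} bytes) but must be at "
                f"least {MIN_MEMORY_BYTES} bytes. Please increase {flag}."
            )
```

Calling `validate_memory_flags({"--executor-memory": "256m"})` raises immediately with a message that mentions `--executor-memory`, so the problem surfaces in the driver instead of in the executor logs.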


          People

            Assignee: Daniel Jalova (djalova)
            Reporter: Imran Rashid (irashid)
            Votes: 0
            Watchers: 7
