SPARK-5162: Python yarn-cluster mode


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: PySpark, YARN

    Description

      Running PySpark on YARN is currently limited to ‘yarn-client’ mode. It would be great to be able to submit Python applications to the cluster and (just like Java classes) have the resource manager set up an AM on any node in the cluster. Does anyone know what issues are blocking this feature? I was snooping around with enabling Python apps by:

      Removing the logic that blocks Python with yarn-cluster in SparkSubmit.scala:

      ...
      // The following modes are not supported or applicable
      (clusterManager, deployMode) match {
        ...
        case (_, CLUSTER) if args.isPython =>
          printErrorAndExit("Cluster deploy mode is currently not supported for python applications.")
        ...
      }

      and submitting application via:

      HADOOP_CONF_DIR=<insert conf dir> ./bin/spark-submit --master yarn-cluster --num-executors 2 --py-files <insert location of egg here> --executor-cores 1 ../tools/canary.py
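For anyone wanting to script the same submission, here is a minimal sketch that assembles the command above as an argument list. The helper names `build_submit_command` and `build_submit_env` and the paths `/path/to/deps.egg` and `/etc/hadoop/conf` are hypothetical placeholders; only the flags that appear in the command above are used.

```python
import os
import shlex


def build_submit_command(app, py_files, num_executors=2, executor_cores=1,
                         spark_submit="./bin/spark-submit"):
    """Return the argv list for the spark-submit call (hypothetical helper)."""
    return [
        spark_submit,
        "--master", "yarn-cluster",
        "--num-executors", str(num_executors),
        "--py-files", py_files,
        "--executor-cores", str(executor_cores),
        app,
    ]


def build_submit_env(hadoop_conf_dir, base_env=None):
    """Copy the environment and point HADOOP_CONF_DIR at the cluster config."""
    env = dict(os.environ if base_env is None else base_env)
    env["HADOOP_CONF_DIR"] = hadoop_conf_dir
    return env


# Placeholder paths: substitute the real egg location and conf dir.
cmd = build_submit_command("../tools/canary.py", "/path/to/deps.egg")
env = build_submit_env("/etc/hadoop/conf")

# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list and environment can be handed to `subprocess.run(cmd, env=env)` from a Spark checkout; the sketch only prints the command so that it runs without a cluster.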

      Everything appears to run fine at first: PythonRunner is picked up as the main class, resources get set up, and the YARN client gets launched, but then it falls flat on its face:

      2015-01-08 18:48:03,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: DEBUG: FAILED { redacted/.sparkStaging/application_1420594669313_4687/canary.py, 1420742868009, FILE, null }, Resource redacted/.sparkStaging/application_1420594669313_4687/canary.py changed on src filesystem (expected 1420742868009, was 1420742869284

      and

      2015-01-08 18:48:03,446 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource redacted/.sparkStaging/application_1420594669313_4687/canary.py(->/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py) transitioned from DOWNLOADING to FAILED

      I tracked this down to the Apache Hadoop code (FSDownload.java, line 249) that handles container localization of files on download. At this point I thought it would be best to raise the issue here and get input.
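To make the failure concrete, here is a minimal sketch of the kind of check the log messages describe: the timestamp recorded for a resource is compared with the file's current modification time, and localization fails if they differ. This is an illustration in Python, not the Hadoop implementation; `check_unchanged`, `ResourceChangedError`, and `to_utc` are hypothetical names, and the two timestamps are the ones from the log above.

```python
from datetime import datetime, timedelta, timezone


class ResourceChangedError(IOError):
    """Raised when a resource's modification time differs from the recorded one."""


def check_unchanged(path, expected_ms, actual_ms):
    """Fail if the file was modified after its timestamp was recorded."""
    if actual_ms != expected_ms:
        raise ResourceChangedError(
            "Resource %s changed on src filesystem (expected %d, was %d)"
            % (path, expected_ms, actual_ms))


def to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return (datetime.fromtimestamp(ms // 1000, tz=timezone.utc)
            + timedelta(milliseconds=ms % 1000))


# Values taken from the ResourceLocalizationService log line.
expected_ms = 1420742868009
actual_ms = 1420742869284

drift_ms = actual_ms - expected_ms
print("expected:", to_utc(expected_ms).isoformat())
print("actual:  ", to_utc(actual_ms).isoformat())
print("drift:    %d ms" % drift_ms)

try:
    check_unchanged("canary.py", expected_ms, actual_ms)
    failed = False
except ResourceChangedError as err:
    failed = True
    print(err)
```

The two values differ by 1275 ms, so the copy of canary.py in .sparkStaging appears to have been modified a little over a second after its timestamp was recorded. That would be consistent with the file being written to the staging directory more than once, though the log alone does not confirm the cause.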


          People

            Assignee: Lianhui Wang (lianhuiwang)
            Reporter: Dana Klassen (dklassen)
            Votes: 1
            Watchers: 11
