Spark / SPARK-5838

Changing SPARK_LOCAL_DIRS option in spark-env.sh does not take effect without daemon restart


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.1.1
    • Fix Version/s: None
    • Component/s: Deploy, EC2, Spark Submit
    • Labels: None

    Description

      This issue has already been mentioned on the mailing list: http://apache-spark-user-list.1001560.n3.nabble.com/set-spark-local-dir-on-driver-program-doesn-t-take-effect-td11040.html

      The problem usually arises when Spark creates too many files during shuffles and fills up the small amount of disk space that most EC2 instances have for root on /mnt2.

      The workaround is to set SPARK_LOCAL_DIRS to a larger volume (e.g. to the /mnt/spark volume only, removing /mnt2).
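
Before switching directories, it can help to confirm that the new volume really has more room. The following is a minimal sketch, not part of Spark: it assumes SPARK_LOCAL_DIRS holds a comma-separated list of paths and that it is run on the node in question. The helper names `parse_local_dirs` and `free_bytes` are hypothetical.

```python
import os
import shutil


def parse_local_dirs(value):
    """Split a comma-separated SPARK_LOCAL_DIRS value into a list of paths."""
    return [part.strip() for part in value.split(",") if part.strip()]


def free_bytes(path):
    """Return free bytes on the filesystem holding path, or None if path is missing."""
    if not os.path.exists(path):
        return None
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    # Read the value from the current environment; empty if it is not set.
    for d in parse_local_dirs(os.environ.get("SPARK_LOCAL_DIRS", "")):
        free = free_bytes(d)
        label = "missing" if free is None else f"{free / 2**30:.1f} GiB free"
        print(f"{d}: {label}")
```

Note that the script only sees the environment of the shell it runs in, which may differ from the environment the already-running daemons were started with.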

      However, for these changes to take effect, the daemons need to be restarted with sbin/stop-all.sh followed by sbin/start-all.sh.
      Even more troubling, the Web UI's Environment tab reports that spark.local.dir is set to the new path, yet Spark still spills to /mnt2 as well.
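
Since the Web UI cannot be trusted here, one way to see where Spark actually writes is to look for its scratch directories on disk. This is a rough sketch under stated assumptions: the directory-name prefixes below are guesses that vary between Spark versions, and `scratch_dirs` and `unexpected_spill_roots` are hypothetical helpers, not Spark APIs.

```python
import os

# Assumed name prefixes for Spark scratch directories; these differ between
# Spark versions, so adjust them to match what appears on disk.
SCRATCH_PREFIXES = ("spark-local-", "blockmgr-", "spark-")


def scratch_dirs(root, prefixes=SCRATCH_PREFIXES):
    """Return sorted names of immediate subdirectories of root that match a prefix."""
    if not os.path.isdir(root):
        return []
    return sorted(
        name
        for name in os.listdir(root)
        if name.startswith(prefixes) and os.path.isdir(os.path.join(root, name))
    )


def unexpected_spill_roots(configured, candidates):
    """Return candidate roots that hold scratch dirs but are not configured."""
    configured_set = {os.path.normpath(p) for p in configured}
    return [
        root
        for root in candidates
        if os.path.normpath(root) not in configured_set and scratch_dirs(root)
    ]


if __name__ == "__main__":
    # Paths taken from the report: /mnt/spark is the intended directory,
    # /mnt2 is the one that should no longer be used.
    print(unexpected_spill_roots(["/mnt/spark"], ["/mnt/spark", "/mnt2"]))
```

Directories left over from earlier runs also match, so clear them or compare modification times before reading too much into the result.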

      To my knowledge, this is not mentioned anywhere in the documentation or in any mailing list reply other than the one linked above.

      I think the possible solutions are either to make the change take effect, so that actual behavior agrees with what the Web UI reports, or to add a section to the EC2 documentation covering this kind of problem.

            People

              Assignee: Unassigned
              Reporter: Theodore Vasiloudis (tvas)
              Votes: 0
              Watchers: 2
