Spark / SPARK-10975

Shuffle files left behind on Mesos without dynamic allocation

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.1
    • Fix Version/s: None
    • Component/s: Mesos
    • Labels: None

    Description

      (from mailing list)

      Running on Mesos in coarse-grained mode. No dynamic allocation or shuffle service.

      I see two types of temporary directories under /tmp associated with every executor: /tmp/spark-<UUID> and /tmp/blockmgr-<UUID>. When the job finishes, /tmp/spark-<UUID> is removed, but the blockmgr directory is left behind with gigabytes of data still in it.

      The reason is that the logic to clean up these files is only enabled when the shuffle service is running; see https://github.com/apache/spark/pull/7820

      The shuffle files should be placed in the Mesos sandbox or under `tmp/spark` unless the shuffle service is enabled.
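To check whether a Mesos agent is affected, the leftover directories described above can be inventoried with a small script. The following is a minimal sketch, not part of Spark: the function names `dir_size_bytes` and `find_leftover_blockmgr_dirs` are hypothetical, and it assumes the executors used the default /tmp location and the `blockmgr-` directory prefix mentioned in the report.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping entries that cannot be read."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def find_leftover_blockmgr_dirs(tmp_root="/tmp"):
    """Return (path, size_in_bytes) for each blockmgr-* directory directly under tmp_root.

    Assumption: block manager directories are named blockmgr-<UUID>,
    as described in the report above.
    """
    results = []
    try:
        entries = sorted(os.listdir(tmp_root))
    except OSError:
        return results
    for name in entries:
        path = os.path.join(tmp_root, name)
        if name.startswith("blockmgr-") and os.path.isdir(path):
            results.append((path, dir_size_bytes(path)))
    return results


if __name__ == "__main__":
    for path, size in find_leftover_blockmgr_dirs():
        print(f"{size / 1e6:10.1f} MB  {path}")
```

The script only reports directories and deletes nothing. A blockmgr directory may belong to an executor that is still running, so any cleanup should happen only after confirming that no Spark job is active on that agent.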

            People

              Assignee: Unassigned
              Reporter: Dragos Dascalita Haut (dragos)
              Votes: 0
              Watchers: 3
