Spark / SPARK-17233

Shuffle files are left behind and fill the disk when dynamic allocation is enabled in a long-running application.


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 1.5.2, 1.6.2, 2.0.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels:

      Description

      When I run SQL statements periodically in a long-running Thrift server, the disk fills up after about one week.
      Checking the files on Linux, I found many shuffle files left in the block manager directory, even though their shuffle stages had finished long ago.
      The cause is that when shuffle files need to be cleaned, the driver tells each executor to do the shuffle cleanup. But when dynamic allocation is enabled, idle executors shut themselves down and can no longer clean their shuffle files, so the files are left behind.
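
The behaviour described above can be illustrated with a toy model. This is only a sketch of the reporter's explanation, not Spark code: the `Cluster` and `Executor` classes and all method names are hypothetical, and the model assumes the driver only reaches executors that are still alive.

```python
# Toy model of the behaviour described in this report; not Spark code.
# All class and method names here are hypothetical.

class Executor:
    def __init__(self, executor_id):
        self.executor_id = executor_id
        self.alive = True


class Cluster:
    def __init__(self):
        self.executors = {}
        # Files on local disk, keyed by the executor that wrote them.
        self.disk = {}

    def add_executor(self, executor_id):
        self.executors[executor_id] = Executor(executor_id)
        self.disk[executor_id] = set()

    def write_shuffle(self, executor_id, shuffle_id):
        self.disk[executor_id].add(f"shuffle_{shuffle_id}_0_0.data")

    def release_idle_executor(self, executor_id):
        # The executor stops, but the files it wrote stay on disk.
        self.executors[executor_id].alive = False

    def clean_shuffle(self, shuffle_id):
        # The driver asks only the executors that are still alive.
        prefix = f"shuffle_{shuffle_id}_"
        for ex in self.executors.values():
            if ex.alive:
                self.disk[ex.executor_id] = {
                    f for f in self.disk[ex.executor_id]
                    if not f.startswith(prefix)
                }

    def leftover_files(self):
        return sorted(f for files in self.disk.values() for f in files)


cluster = Cluster()
cluster.add_executor("exec-1")
cluster.add_executor("exec-2")
cluster.write_shuffle("exec-1", 0)
cluster.write_shuffle("exec-2", 0)
cluster.release_idle_executor("exec-2")  # executor goes away before cleanup
cluster.clean_shuffle(0)
leftover = cluster.leftover_files()
print(leftover)  # the file written by exec-2 is still there
```

In this model the file written by `exec-1` is removed, while the one written by the released `exec-2` survives the cleanup, which matches the leak reported here.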

      I tested this on Spark 1.5, but the master branch should have the same issue.
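
To check whether a node is affected, something like the following could be used to list old shuffle files under a local directory. It is a minimal sketch: the function name `find_stale_shuffle_files` is made up for illustration, it assumes shuffle files have names starting with `shuffle_`, and it uses the file modification time as a rough proxy for "the stage finished long ago".

```python
import os
import time


def find_stale_shuffle_files(root, max_age_seconds, now=None):
    """Return (path, size_bytes, age_seconds) for old shuffle files under root."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: shuffle output files are named shuffle_*.
            if not name.startswith("shuffle_"):
                continue
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # the file disappeared while we were scanning
            age = now - st.st_mtime
            if age > max_age_seconds:
                stale.append((path, st.st_size, age))
    return stale
```

For example, `find_stale_shuffle_files("/tmp", 7 * 24 * 3600)` would report shuffle files older than a week, assuming the Spark local directory is `/tmp`; adjust the path to whatever `spark.local.dir` points to on the node. The function only lists files and deletes nothing, since a file that looks old may still belong to a running application.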

            People

            • Assignee: Unassigned
            • Reporter: carlmartin
