Spark / SPARK-1676

HDFS FileSystems continually pile up in the FS cache


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.9.1, 1.0.0
    • Fix Version/s: 0.9.2, 1.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Due to HDFS-3545, FileSystem.get() always creates (and caches) a new FileSystem when given a new UserGroupInformation (UGI), even if that UGI represents the same user as an existing one. This causes FileSystem objects to accumulate at an alarming rate, often one per task for something like sc.textFile(). The bug hits NativeS3FileSystem especially hard, since each instance also holds an open connection to S3, exhausting the system's file handles.

      The bug was introduced in https://github.com/apache/spark/pull/29, where doAs was made the default behavior.

      A general fix is not forthcoming, as UGIs do not cache well, but this problem can cause Spark clusters to enter a failed state that requires restarting the executors.
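      The cache behavior described above can be sketched with a self-contained model. Note that Ugi, Key, and FsCacheSketch below are hypothetical stand-ins for Hadoop's UserGroupInformation and the FileSystem cache key, not the real classes; the point is that when the cache key compares UGIs by object identity rather than by user, every per-task UGI creates a fresh cache entry.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the FileSystem-cache pileup (hypothetical classes,
// modeled on the behavior described in HDFS-3545).
public class FsCacheSketch {

    // Stand-in for UserGroupInformation: same user name each time, but no
    // equals()/hashCode() override, so equality is by object identity.
    static final class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    // Stand-in for the cache key: (filesystem URI, UGI). The UGI is
    // compared by identity, mirroring the problematic behavior.
    static final class Key {
        final String uri;
        final Ugi ugi;
        Key(String uri, Ugi ugi) { this.uri = uri; this.ugi = ugi; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return uri.equals(k.uri) && ugi == k.ugi; // identity compare
        }
        @Override public int hashCode() {
            return uri.hashCode() ^ System.identityHashCode(ugi);
        }
    }

    static final Map<Key, Object> CACHE = new HashMap<>();

    // Stand-in for FileSystem.get(): returns a cached "FileSystem" for the
    // key, creating one if absent.
    static Object get(String uri, Ugi ugi) {
        return CACHE.computeIfAbsent(new Key(uri, ugi), k -> new Object());
    }

    public static void main(String[] args) {
        // One fresh UGI per task, as with the default doAs code path,
        // even though every task runs as the same user:
        for (int task = 0; task < 100; task++) {
            get("hdfs://nn:8020/", new Ugi("alice"));
        }
        // Every call missed the cache, so 100 "FileSystems" now linger.
        System.out.println(CACHE.size()); // prints 100
    }
}
```

      Reusing a single UGI across tasks (so the identity-based key matches) collapses all of these lookups onto one cache entry, which is the direction the fix takes.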

People

Assignee: tgraves Thomas Graves
Reporter: ilikerps Aaron Davidson
Votes: 0
Watchers: 4

Dates

Created:
Updated:
Resolved: