Hadoop Common / HADOOP-12107

Long-running apps may have a huge number of StatisticsData instances under FileSystem


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1
    • Component/s: fs
    • Labels: None

    Description

      We observed that some of our apps (non-MapReduce apps that use filesystems) end up accumulating a huge memory footprint from FileSystem$Statistics$StatisticsData instances (held in the allData list of Statistics).

      Although the thread reference from StatisticsData is a weak reference, and thus can be cleared once a thread goes away, the StatisticsData instances themselves are not removed from the list until one of the following methods is called on Statistics:

      • getBytesRead()
      • getBytesWritten()
      • getReadOps()
      • getLargeReadOps()
      • getWriteOps()
      • toString()
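
The behavior described above can be illustrated with a simplified model. This is a sketch and not Hadoop's actual code: the class `StatsModel`, its `Data` entries, and the `trackedEntries()` helper are hypothetical names chosen for illustration. It assumes the write path only appends per-thread entries, while the read path is the only place where entries of dead threads are folded into a running total and removed.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified model of the pattern described above; NOT Hadoop's actual code. */
class StatsModel {
    /** Per-thread counters that hold only a weak reference to the owning thread. */
    static final class Data {
        final WeakReference<Thread> owner;
        volatile long bytesRead; // written by the owner thread, read by aggregators
        Data(Thread t) { owner = new WeakReference<>(t); }
    }

    private final List<Data> allData = new ArrayList<>();
    private final ThreadLocal<Data> threadData = new ThreadLocal<>();
    private long rootBytesRead; // totals folded in from entries of dead threads

    /** Write path: registers one entry per thread and never prunes the list. */
    void incrementBytesRead(long n) {
        Data d = threadData.get();
        if (d == null) {
            d = new Data(Thread.currentThread());
            threadData.set(d);
            synchronized (this) { allData.add(d); }
        }
        d.bytesRead += n;
    }

    /** Read path: the only place where entries of dead threads are removed. */
    synchronized long getBytesRead() {
        long total = rootBytesRead;
        for (Iterator<Data> it = allData.iterator(); it.hasNext();) {
            Data d = it.next();
            total += d.bytesRead;
            if (d.owner.get() == null) { // owner thread has been garbage collected
                rootBytesRead += d.bytesRead; // keep its contribution
                it.remove();                  // drop the stale entry
            }
        }
        return total;
    }

    /** Hypothetical helper that exposes the list size for inspection. */
    synchronized int trackedEntries() { return allData.size(); }
}
```

In this model, an application that creates many short-lived threads and only ever calls `incrementBytesRead` sees `trackedEntries()` grow without bound, because nothing on the write path shrinks the list.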

      It is quite possible to have an application that interacts with a filesystem but does not call any of these methods on the Statistics. If such an application runs for a long time and has a large amount of thread churn, the memory footprint will grow significantly.

      The current workaround is either to limit the thread churn or to invoke these operations occasionally to pare down the memory. However, this is still a deficiency with FileSystem$Statistics itself in that the memory is controlled only as a side effect of those operations.
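
The second workaround, invoking one of these operations occasionally, might be sketched as a small background task. The helper `StatisticsPruner` is hypothetical and not part of Hadoop; it accepts any `LongSupplier` so that the example stays self-contained, and the period is an arbitrary choice left to the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical helper: periodically calls an aggregating getter for its pruning side effect. */
class StatisticsPruner implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "statistics-pruner");
            t.setDaemon(true); // do not keep the JVM alive just for pruning
            return t;
        });

    StatisticsPruner(LongSupplier aggregatingGetter, long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                // The result is ignored; the call itself triggers the cleanup.
                aggregatingGetter.getAsLong();
            } catch (RuntimeException e) {
                // Swallow so that one failure does not cancel future runs.
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Assuming the application already holds a `FileSystem.Statistics` instance named `stats`, usage could look like `new StatisticsPruner(stats::getBytesRead, 1, TimeUnit.MINUTES)`. This only bounds the growth between runs; memory is still reclaimed as a side effect of the getter, which is the deficiency this issue describes.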

      Attachments

        1. HADOOP-12107.001.patch (7 kB, Sangjin Lee)
        2. HADOOP-12107.002.patch (11 kB, Sangjin Lee)
        3. HADOOP-12107.003.patch (11 kB, Sangjin Lee)
        4. HADOOP-12107.004.patch (11 kB, Sangjin Lee)
        5. HADOOP-12107.005.patch (11 kB, Sangjin Lee)


          People

            Assignee: Sangjin Lee (sjlee0)
            Reporter: Sangjin Lee (sjlee0)
            Votes: 0
            Watchers: 16
