HBase / HBASE-18203

Intelligently manage a pool of open references to store files

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels: None

    Description

      When bringing a region online we open every store file and keep it open, to avoid further round trips to the HDFS namenode during reads. Naively keeping open every store file we encounter is a bad idea. There should be an upper bound on the number of open store files. Once we are above that bound, we should close files and reopen them as needed, choosing candidates to close on an LRU basis. Otherwise, if the aggregate number of store files is too large, we can overrun even high (~64k) open file handle limits on the server, as some users have in production.
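
      A minimal sketch of the bounded, LRU-evicting pool proposed above, for illustration only. It is not HBase code: `BoundedOpenFilePool`, `FakeReader`, `acquire`, and the `opener` callback are hypothetical names, and the sketch assumes a reference is never evicted while a read is still using it (a real pool would need reference counting or leases to guarantee that).

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for an open store file reference. */
class FakeReader implements AutoCloseable {
    final String path;
    boolean closed = false;

    FakeReader(String path) { this.path = path; }

    @Override
    public void close() { closed = true; }
}

/** Hypothetical LRU-bounded pool of open file references (sketch, not HBase API). */
class BoundedOpenFilePool<R extends AutoCloseable> {
    private final int maxOpen;
    private final Function<String, R> opener;
    // accessOrder=true: iteration runs from least to most recently accessed.
    private final LinkedHashMap<String, R> open = new LinkedHashMap<>(16, 0.75f, true);

    BoundedOpenFilePool(int maxOpen, Function<String, R> opener) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be >= 1");
        }
        this.maxOpen = maxOpen;
        this.opener = opener;
    }

    /** Returns an open reference, reopening the file if it was evicted earlier. */
    synchronized R acquire(String path) {
        R ref = open.get(path); // get() marks the entry as most recently used
        if (ref == null) {
            evictToMakeRoom();
            ref = opener.apply(path);
            open.put(path, ref);
        }
        return ref;
    }

    /** Closes least recently used references until one more can be opened. */
    private void evictToMakeRoom() {
        Iterator<Map.Entry<String, R>> it = open.entrySet().iterator();
        while (open.size() >= maxOpen && it.hasNext()) {
            R victim = it.next().getValue();
            it.remove();
            try {
                victim.close();
            } catch (Exception e) {
                // Sketch only: a real pool would log or propagate this.
            }
        }
    }

    synchronized int openCount() { return open.size(); }

    /** containsKey() does not change the access order. */
    synchronized boolean isOpen(String path) { return open.containsKey(path); }
}
```

      With a limit of 2, acquiring "a", "b", then "a" again, then "c" closes "b" (the least recently used) and leaves "a" and "c" open. A later acquire of "b" reopens it at the cost of one more open call, which is the trade-off the upper bound accepts.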

      Note that 'open files' here refers to open/active references to files at the HDFS level. How this maps to active file descriptors at the OS level depends on concurrency of access (block transfers, short circuit reads). The more open files we have at the HDFS level, the more OS level file handles we can expect to consume.

          People

            Assignee: Unassigned
            Reporter: Andrew Kyle Purtell (apurtell)
            Votes: 0
            Watchers: 15

            Dates

              Created:
              Updated: