CASSANDRA-13049

Too many open files during bootstrapping

Details

    • Type: Improvement
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • None
    • None

    Description

      We just upgraded from 2.2.5 to 3.0.10 and hit an issue during bootstrapping, so this is likely something that got worse alongside the IO performance improvements in Cassandra 3.

      On our side, the problem is that we have lots of small sstables. When a new node bootstraps, it receives lots of files during streaming, and Cassandra keeps all of them open for an unpredictable amount of time. Eventually we hit a "Too many open files" error; around that time, lsof shows ~1M open files, almost all of them *-Data.db and *-Index.db. We should definitely use a better compaction strategy to reduce the number of sstables, but I see a few possible improvements in Cassandra:

      1. We use memory mapping when reading data from sstables. Every time we create a new memory map, one more file descriptor is open. Memory mapping improves IO performance for large files; should we set a file size threshold below which we skip it?

      2. Whenever we finish receiving a file from a peer, we create an SSTableReader/BigTableReader, which opens the data file and the index file and keeps them open until some unpredictable later time. See StreamReceiveTask#L110, BigTableWriter#openFinal and SSTableReader#InstanceTidier. Would it be better to open the data/index files lazily, or to close them more often, to reclaim the file descriptors?
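
      To make the two suggestions concrete, here is a minimal sketch in plain Java NIO. It is not Cassandra code: LazyFileSegment and MMAP_THRESHOLD_BYTES are hypothetical names, and the 1 MiB cutoff is an assumed value chosen only for illustration. The file is not opened until the first read, and it is memory-mapped only when it is at least as large as the threshold. Small files are copied into a heap buffer, which is a simplification of what a real buffered reader would do.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not a Cassandra class: opens its file only on first read. */
final class LazyFileSegment {
    // Assumed cutoff for illustration; the issue does not propose a concrete value.
    static final long MMAP_THRESHOLD_BYTES = 1L << 20;

    private final Path path;
    private ByteBuffer contents; // stays null until the first read

    LazyFileSegment(Path path) {
        this.path = path;
    }

    /** True once the file has been opened and read or mapped. */
    synchronized boolean isOpened() {
        return contents != null;
    }

    /** True if a file of this size would be memory-mapped rather than copied. */
    boolean wouldMmap() throws IOException {
        return Files.size(path) >= MMAP_THRESHOLD_BYTES;
    }

    /** Opens the file on first use; the channel is closed again before returning. */
    synchronized ByteBuffer contents() throws IOException {
        if (contents == null) {
            try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
                long size = ch.size();
                if (size >= MMAP_THRESHOLD_BYTES) {
                    // Large file: map it read-only.
                    contents = ch.map(FileChannel.MapMode.READ_ONLY, 0, size);
                } else {
                    // Small file: copy it into a heap buffer instead of mapping it.
                    ByteBuffer buf = ByteBuffer.allocate((int) size);
                    while (buf.hasRemaining() && ch.read(buf) >= 0) {
                        // keep reading until the buffer is full or EOF is reached
                    }
                    buf.flip();
                    contents = buf;
                }
            }
        }
        return contents;
    }
}
```

      With this shape, a freshly streamed file costs nothing until something actually reads it, and small files never create a mapping. Whether either behaviour is acceptable for SSTableReader is exactly the open question above.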

      I searched the known issues in JIRA, and this looks like a new issue in Cassandra 3. cc Stefania for comments.
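
      For anyone trying to reproduce this, descriptor usage can also be sampled from inside the JVM rather than through lsof. The sketch below is an illustration, not part of Cassandra; FdUsage is a hypothetical name. It relies on com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like platforms with HotSpot-style JVMs, so it returns null elsewhere. Note that this counter reports descriptors only, while lsof output also lists memory-mapped regions, so the two numbers may differ.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical diagnostic helper, not a Cassandra class. */
final class FdUsage {
    /** Returns {open, max} descriptor counts, or null if the JVM does not expose them. */
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(), // descriptors currently open in this process
                unix.getMaxFileDescriptorCount()   // per-process limit
            };
        }
        return null; // platform or JVM without descriptor counters
    }
}
```

      Logging this pair periodically while a node bootstraps would show how quickly usage approaches the limit.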

          People

            szhou Simon Zhou