Apache Cassandra / CASSANDRA-2988

Improve SSTableReader.load() when loading index files


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • 1.0.0, 1.1.0
    • None
    • None

    Description

      • When we create the BufferedRandomAccessFile, we pass skipCache=true. This hurts read performance because we always process the index files sequentially. A simple fix would be to set it to false.
      • Multiple index files of a single column family can be loaded in parallel. This helps a lot when there are several very large index files.
      • We may also change how we buffer. With BufferedRandomAccessFile, every read incurs a number of checks, such as:
          • do we need to rebuffer?
          • isEOF()?
          • assertions
        These can be simplified to some extent. We can blindly buffer the index file in chunks and process the buffer until a key straddles a chunk boundary. Then we rebuffer and restart from the beginning of the partially read key. Conceptually, this is the same as what BRAF does, but without the overhead in the read*() methods in BRAF.
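
The second and third points can be sketched together as follows. This is only an illustration, not the attached patches: the class and method names (ChunkedIndexLoader, readIndex, readAll) are hypothetical, and the entry layout (2-byte key length, key bytes, 8-byte data file position) is a simplifying assumption rather than the exact on-disk index format. Each file is read in fixed-size chunks, whole entries are parsed straight out of the buffer, and when an entry straddles the chunk boundary the next chunk is read starting at the beginning of that entry.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedIndexLoader {
    /** One index entry under the assumed layout: row key plus data file position. */
    public static final class Entry {
        public final byte[] key;
        public final long position;
        Entry(byte[] key, long position) { this.key = key; this.position = position; }
    }

    /** Reads one index file chunk by chunk, without per-read EOF/rebuffer checks. */
    public static List<Entry> readIndex(Path indexFile, int chunkSize) throws IOException {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize must be positive");
        List<Entry> entries = new ArrayList<>();
        try (FileChannel ch = FileChannel.open(indexFile, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(chunkSize);
            long size = ch.size();
            long offset = 0; // file offset of the first entry not yet parsed
            while (offset < size) {
                buf.clear();
                // Blindly fill the chunk, starting at the first unparsed entry.
                while (buf.hasRemaining() && ch.read(buf, offset + buf.position()) > 0) { }
                buf.flip();
                int consumed = parseWholeEntries(buf, entries);
                if (consumed == 0) {
                    // Not even one entry fit. Either the file ends mid-entry,
                    // or a single entry is larger than the chunk, so grow it.
                    if (buf.limit() < buf.capacity())
                        throw new IOException("truncated entry at offset " + offset);
                    buf = ByteBuffer.allocate(buf.capacity() * 2);
                }
                offset += consumed; // the next chunk restarts at the partial entry
            }
        }
        return entries;
    }

    /** Parses complete entries only; returns the number of bytes consumed. */
    private static int parseWholeEntries(ByteBuffer buf, List<Entry> out) {
        int consumed = 0;
        while (buf.remaining() >= 2) {
            int keyLen = buf.getShort(buf.position()) & 0xFFFF; // peek without advancing
            if (buf.remaining() < 2 + keyLen + 8) break;        // entry crosses the boundary
            buf.getShort();
            byte[] key = new byte[keyLen];
            buf.get(key);
            long position = buf.getLong();
            out.add(new Entry(key, position));
            consumed = buf.position();
        }
        return consumed;
    }

    /** Loads several index files in parallel on a fixed-size thread pool. */
    public static List<List<Entry>> readAll(List<Path> indexFiles, int chunkSize, int threads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Entry>>> futures = new ArrayList<>();
            for (Path f : indexFiles)
                futures.add(pool.submit(() -> readIndex(f, chunkSize)));
            List<List<Entry>> result = new ArrayList<>();
            for (Future<List<Entry>> fut : futures) {
                try {
                    result.add(fut.get());
                } catch (ExecutionException e) {
                    throw new IOException(e.getCause());
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

The chunk size and thread count are tuning parameters here. A larger chunk means fewer rebuffers, and the thread count would likely be bounded by how many index files the disks can stream concurrently.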

      Attachments

        1. 2988-2-cleaned.txt
          7 kB
          Jonathan Ellis
        2. 2988-2-v2.txt
          3 kB
          Jonathan Ellis
        3. 2988-parallel-v2.txt
          8 kB
          Jonathan Ellis
        4. c2988-2-v2
          7 kB
          Michael Wu
        5. c2988-modified-buffer.patch
          8 kB
          Michael Wu
        6. c2988-parallel-load-sstables.patch
          7 kB
          Michael Wu


          People

            Michael Wu
            Jonathan Ellis
            Votes: 1
            Watchers: 1

