CASSANDRA-10520

Compressed writer and reader should support non-compressed data.


Details

    Description

      Compressing incompressible data, as is done, for instance, when writing SSTables during stress tests, results in chunks larger than 64k, which are a problem for the buffer pooling mechanism employed by CompressedRandomAccessReader. This causes non-negligible performance issues due to excessive memory allocation.

      To solve this problem and to avoid decompression delays in cases where compression provides no benefit, I think we should allow compressed files to store uncompressed chunks as an alternative to compressed data. Such a chunk would be written whenever compression returns a buffer larger than, for example, 90% of the input, and would add no delay to writing. On reads it could be recognized by its size (using a single global threshold constant stored in the compression metadata), and the data could be transferred directly into the decompressed buffer, skipping the decompression step and ensuring that a 64k buffer for compressed data always suffices.
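
      A rough sketch of the proposed write and read paths is shown below. This is only an illustration of the idea, not the actual writer/reader code; the class, method, and constant names (ChunkCompressionSketch, chunkToWrite, readChunk, MAX_COMPRESSED_RATIO) are hypothetical, and the 90% threshold is just the example figure from above.

{code:java}
import java.nio.ByteBuffer;

// Minimal sketch of the proposed chunk handling. All names here are
// hypothetical and do not refer to actual Cassandra classes.
public class ChunkCompressionSketch
{
    // Hypothetical global threshold, recorded once in the compression metadata:
    // a chunk whose compressed form is not smaller than this fraction of the
    // input is stored uncompressed instead.
    static final double MAX_COMPRESSED_RATIO = 0.9;

    interface Compressor
    {
        /** Compresses input into output and returns the compressed length. */
        int compress(ByteBuffer input, ByteBuffer output);

        /** Decompresses input into output. */
        void decompress(ByteBuffer input, ByteBuffer output);
    }

    /**
     * Write path: compress the chunk, but fall back to the raw bytes when the
     * result is larger than MAX_COMPRESSED_RATIO of the input. The stored chunk
     * therefore never exceeds the uncompressed chunk size (e.g. 64k).
     */
    static ByteBuffer chunkToWrite(ByteBuffer uncompressed, ByteBuffer compressed, Compressor compressor)
    {
        int rawLength = uncompressed.remaining();
        int compressedLength = compressor.compress(uncompressed.duplicate(), compressed);
        if (compressedLength >= rawLength * MAX_COMPRESSED_RATIO)
            return uncompressed;                            // store the chunk as-is
        compressed.position(0).limit(compressedLength);     // expose only the compressed bytes
        return compressed;
    }

    /**
     * Read path: the on-disk length alone tells whether the chunk was stored
     * raw; if so, it is copied straight into the destination buffer and the
     * decompression step is skipped.
     */
    static void readChunk(ByteBuffer onDisk, ByteBuffer destination, int uncompressedChunkSize, Compressor compressor)
    {
        if (onDisk.remaining() >= uncompressedChunkSize * MAX_COMPRESSED_RATIO)
            destination.put(onDisk);                        // recognized as uncompressed by size
        else
            compressor.decompress(onDisk, destination);
    }
}
{code}

      Because a compressed chunk is only ever stored when it is below the threshold, the length check on the read path is unambiguous: any chunk at or above the threshold must have been stored uncompressed.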


          People

            Assignee: Branimir Lambov
            Reporter: Branimir Lambov
            Authors: Branimir Lambov
            Reviewers: Robert Stupp
            Votes: 0
            Watchers: 10
