  Kafka / KAFKA-3776 Unify store and downstream caching in streams / KAFKA-3973

Investigate feasibility of caching bytes vs. records


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.1.0
    • Component/s: streams
    • Labels: None

    Description

      Currently the cache stores and accounts for records, not bytes or objects. This investigation is about measuring the performance overhead of the two alternatives: storing bytes versus storing objects. As an outcome we should know whether 1) we should store bytes or 2) we should store objects.

      If we store objects, the cache still needs to know their size (so that it can tell whether an object fits in the allocated cache space, e.g., if the cache is 100 MB and an object is 10 MB, there is space for 10 such objects). The investigation needs to determine how to measure an object's size efficiently in Java.
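      One way to get an object's size in Java, offered here only as a sketch and not as anything prescribed by this ticket, is a java.lang.instrument agent. The class name ObjectSizeAgent is illustrative; the class must be registered via -javaagent (with a Premain-Class manifest entry) so the JVM supplies an Instrumentation instance, and getObjectSize only reports the shallow size, so a deep estimate would have to walk the object graph.

      import java.lang.instrument.Instrumentation;

      // Hypothetical agent for probing object sizes; loaded with -javaagent.
      public class ObjectSizeAgent {
          private static volatile Instrumentation instrumentation;

          // Called by the JVM before main() when the agent is registered.
          public static void premain(String agentArgs, Instrumentation inst) {
              instrumentation = inst;
          }

          // Shallow size in bytes of a single object; does not follow references.
          public static long shallowSizeOf(Object obj) {
              return instrumentation.getObjectSize(obj);
          }
      }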

      If we store bytes, then we serialise an object into bytes before caching it, i.e., we pay a serialisation cost. The investigation needs to measure how large this cost can be, especially for the case when all objects fit in the cache (and thus any extra serialisation cost would show).
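      A rough probe of that serialisation cost, assuming Kafka's built-in String serde, might look like the sketch below. The class name SerializationCostProbe, the topic string, and the iteration count are illustrative assumptions, and a proper measurement would use a harness such as JMH to account for JIT warm-up.

      import org.apache.kafka.common.serialization.Serde;
      import org.apache.kafka.common.serialization.Serdes;

      public class SerializationCostProbe {
          public static void main(String[] args) {
              Serde<String> serde = Serdes.String();
              String value = "example-record-value";
              int iterations = 1_000_000;

              long start = System.nanoTime();
              long totalBytes = 0;
              for (int i = 0; i < iterations; i++) {
                  // Serialise the value as the cache would before storing bytes.
                  byte[] bytes = serde.serializer().serialize("dummy-topic", value);
                  totalBytes += bytes.length;
              }
              long elapsedNs = System.nanoTime() - start;

              System.out.printf("avg serialize time: %.1f ns, avg size: %d bytes%n",
                      (double) elapsedNs / iterations, totalBytes / iterations);
          }
      }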

      Attachments

        Activity


          People

            Assignee: Bill Bejeck (bbejeck)
            Reporter: Eno Thereska (enothereska)
            Votes: 0
            Watchers: 6

            Dates

              Created:
              Updated:
              Resolved:
