  1. Ratis
  2. RATIS-1931 Support Zero-Copy in ratis-grpc
  3. RATIS-2093

Decouple metadata and configuration entries from appendEntries buffer for stateMachineCache


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: None
    • Labels: None

    Description

      While testing zero-copy in Ozone with stateMachineCache enabled, we saw hundreds of thousands of ServerProtocol messages left unreleased, even though the number of entries cached in the Ozone StateMachine was small (<500). Netty's direct memory utilization was also high and did not drop after the test run finished.

       

      It turns out that a single appendEntries request can contain multiple log entries. Some are metadata or configuration entries, which are small (~10-20 bytes). Others are StateMachine entries, which are much larger (up to 4 MB).

      Today, when stateMachineCache is enabled, the StateMachine entries stored in LogCache do not hold a counted reference to the original appendEntries request, but metadata and configuration entries do. Because metadata and configuration entries are so small, they almost never fill the LogCache enough to trigger a cache eviction. Their references to the original appendEntries request therefore keep the request buffer from being released, even after the StateMachine cache has evicted the StateMachine entries.
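
      The retention described above can be modeled with a plain reference counter. This is only a minimal sketch: `RequestBuffer`, `releasedAfterStateMachineEviction`, and the class name are hypothetical and are not Ratis or Ozone APIs. It assumes one request carrying one large StateMachine entry and one small metadata entry.

```java
/** Hypothetical model of the retention problem; none of these names are Ratis APIs. */
public class AppendEntriesLeakSketch {

  /** Stands in for the zero-copy buffer backing one appendEntries request. */
  static final class RequestBuffer {
    private int refCount;

    void retain() { refCount++; }

    void release() { refCount--; }

    boolean isReleased() { return refCount == 0; }
  }

  /**
   * Simulates one request with a large StateMachine entry and a small metadata entry.
   * Returns true if the buffer is free once the StateMachine cache evicts its entry.
   */
  public static boolean releasedAfterStateMachineEviction(boolean smallEntryHoldsReference) {
    RequestBuffer buffer = new RequestBuffer();

    // The StateMachine cache pins the buffer for the large entry.
    buffer.retain();

    // The LogCache pins the same buffer for the small metadata entry (current behavior).
    if (smallEntryHoldsReference) {
      buffer.retain();
    }

    // The StateMachine cache evicts the large entry and drops its reference.
    buffer.release();

    // The small entry never fills the LogCache, so no LogCache eviction happens here.
    return buffer.isReleased();
  }

  public static void main(String[] args) {
    System.out.println("small entry holds a reference -> released: "
        + releasedAfterStateMachineEviction(true));
    System.out.println("small entry holds no reference -> released: "
        + releasedAfterStateMachineEviction(false));
  }
}
```

      In this model, a single entry of a few bytes is enough to keep the whole request buffer alive.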

       

      When stateMachineCache is enabled, the metadata and configuration entries should not hold a reference to the original appendEntries request.

      I did a quick test and compared the direct memory utilization and the number of unreleased messages before and after making the change.
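
      One possible way to drop that reference, shown here only as an assumption and not as the actual change made for this issue, is to copy the few bytes of a small entry out of the request buffer so the cached entry no longer depends on it. The class and method names below are hypothetical; only standard `java.nio.ByteBuffer` calls are used.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of decoupling a small entry by copying; not the Ratis implementation. */
public class DecoupleSmallEntrySketch {

  /**
   * Copies length bytes starting at offset into a fresh heap array.
   * The result shares no memory with the request buffer.
   */
  public static byte[] copyOut(ByteBuffer requestBuffer, int offset, int length) {
    byte[] copy = new byte[length];
    // duplicate() has its own position and limit, so the source buffer's state is untouched.
    ByteBuffer view = requestBuffer.duplicate();
    view.position(offset);
    view.get(copy, 0, length);
    return copy;
  }

  public static void main(String[] args) {
    // Stands in for the direct buffer that backs an appendEntries request.
    ByteBuffer request = ByteBuffer.allocateDirect(64);
    request.put(0, (byte) 7);

    // Copy a small (16-byte) entry out of the request buffer.
    byte[] metadata = copyOut(request, 0, 16);

    // Simulate the buffer being released and reused for other data.
    request.put(0, (byte) 0);

    // The copied entry still holds the original byte.
    System.out.println(metadata[0]);
  }
}
```

      Copying costs little for entries of ~10-20 bytes, while the large StateMachine entries can stay zero-copy.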

       


          People

            duongnguyen Duong


