Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-3036

RPC size multiplication for DDL operations might hit maximum RPC size limit

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1, 1.1.0, 1.2.0, 1.3.0, 1.3.1, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.11.1
    • Fix Version/s: 1.12.0
    • Component/s: master, rpc

      Description

      When a table uses multi-tier partitioning scheme, with large number of partitions created, an AlterTable request that affects many partitions/tablets turns into a much larger UpdateConsensus RPC when leader master pushes the corresponding update on the system tablet to follower masters.

      I did some testing for this use case. With AlterTable RPC adding new range partitions, I observed the following:

      • With range x 2 hash partitions, with the incoming AlterTable RPC request size is 37070 bytes, the size for the corresponding UpdateConsensus is 274278 bytes (~ 7x multiplication factor).
      • With range x 10 hash partitions, with the incoming AlterTable RPC request size is 37070 bytes, the size for the corresponding UpdateConsensus when leader master pushes the updates on the system tablet to followers is 1365438 bytes (~ 36x multiplication factor).

      With that, it's easy to hit the limit on the maximum PRC size (controlled via the --rpc_max_message_size flag) in case of larger Kudu clusters. If that happens, Kudu masters start continuous leader re-election cycle since follower masters don't receive any Raft heartbeats from their leader: the heartbeats are rejected at the lower RPC layer due to the maximum RPC size limit.

        Attachments

        Issue Links

          Activity

            People

            • Assignee:
              aserbin Alexey Serbin
              Reporter:
              aserbin Alexey Serbin

              Dates

              • Created:
                Updated:
                Resolved:

                Issue deployment