KUDU-1806: Creating a list of scan tokens should retrieve tablets in larger batches

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: client
    • Labels: None

    Description

      In a test on a 200-node cluster with 40 concurrent query streams, the Impala planner sometimes took minutes to fetch the list of scan tokens. The tables in the query had several thousand tablets, so at the default batch size of 10 tablets per GetTableLocations RPC, planning required hundreds of round trips, each of which had some chance of being bumped from the queue due to backpressure or similar causes.

      A local hack that raised the batch size to 1000 tablets per RPC reduced planning times to under a second.
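
To make the round-trip arithmetic concrete, here is a minimal Python sketch of a paged lookup. It is a toy model, not the Kudu client's lookup code: `round_trips`, `fetch_all_tablets`, and the figure of 3000 tablets (standing in for "several thousand") are assumptions made purely for illustration.

```python
from math import ceil


def round_trips(num_tablets: int, batch_size: int) -> int:
    """Lookup RPCs needed when each response carries at most batch_size tablets."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return ceil(num_tablets / batch_size)


def fetch_all_tablets(tablets: list, batch_size: int):
    """Simulate paging through a table's tablets; returns (fetched, rpc_count).

    Each loop iteration stands in for one location-lookup RPC.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    fetched, rpcs, start = [], 0, 0
    while start < len(tablets):
        batch = tablets[start:start + batch_size]  # one simulated RPC response
        fetched.extend(batch)
        rpcs += 1
        start += len(batch)
    return fetched, rpcs


NUM_TABLETS = 3000  # assumed stand-in for "several thousand tablets"

for size in (10, 1000):
    _, rpcs = fetch_all_tablets(list(range(NUM_TABLETS)), size)
    print(f"batch size {size}: {rpcs} round trips")
```

Under these assumed numbers the batch size of 10 needs 300 round trips and the batch size of 1000 needs 3, so far fewer requests are exposed to queueing and backpressure.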

          People

            Assignee: Todd Lipcon (tlipcon)
            Reporter: Todd Lipcon (tlipcon)