IMPALA / IMPALA-2567 KRPC milestone 1 / IMPALA-6685

Improve profile in KrpcDataStreamRecvr and KrpcDataStreamSender


    Description

      The existing profiles in KrpcDataStreamRecvr and KrpcDataStreamSender make it hard to diagnose slow queries such as the one in IMPALA-6657. In particular, the receiver's profile sometimes shows a lot of time spent waiting for row batches to arrive while the sender's profile also shows a lot of time spent waiting for responses to the TransmitData() RPC.

      A few improvements would make the problem somewhat easier to diagnose:

      • track the number of deferred row batches over time in KrpcDataStreamRecvr
      • track the number of bytes dequeued over time in KrpcDataStreamRecvr
      • track the amount of time row batches spend in the deferred queue in KrpcDataStreamRecvr
      • track the number of bytes sent from KrpcDataStreamSender over time
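
      The receiver-side items above can be sketched with a toy model. This is only an illustration under stated assumptions, not Impala's implementation: the class ToyReceiver, its method names, and the rule that a batch is deferred when the queue has no room for it are all hypothetical, and timestamps are passed in explicitly so the example stays deterministic.

```python
from collections import deque


class ToyReceiver:
    """Hypothetical stand-in for a receiver; not Impala's KrpcDataStreamRecvr."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.queued_bytes = 0
        self.queue = deque()      # sizes (bytes) of batches ready for the consumer
        self.deferred = deque()   # (arrival_ms, num_bytes) of batches that did not fit
        self.bytes_dequeued = 0   # running total of bytes handed to the consumer
        self.total_deferred_ms = 0  # total time batches spent in the deferred queue
        # Time series: one entry is appended per call to sample().
        self.deferred_batches_samples = []
        self.bytes_dequeued_samples = []

    def _enqueue(self, num_bytes):
        self.queue.append(num_bytes)
        self.queued_bytes += num_bytes

    def add_batch(self, now_ms, num_bytes):
        """Queue the batch, or defer it if it would exceed the capacity."""
        if self.queued_bytes + num_bytes > self.capacity_bytes:
            self.deferred.append((now_ms, num_bytes))
        else:
            self._enqueue(num_bytes)

    def get_batch(self, now_ms):
        """Consumer takes one batch, then deferred batches that now fit are admitted."""
        num_bytes = self.queue.popleft()
        self.queued_bytes -= num_bytes
        self.bytes_dequeued += num_bytes
        while (self.deferred and
               self.queued_bytes + self.deferred[0][1] <= self.capacity_bytes):
            arrival_ms, deferred_bytes = self.deferred.popleft()
            self.total_deferred_ms += now_ms - arrival_ms
            self._enqueue(deferred_bytes)
        return num_bytes

    def sample(self):
        """Record the current values; a real profile would do this periodically."""
        self.deferred_batches_samples.append(len(self.deferred))
        self.bytes_dequeued_samples.append(self.bytes_dequeued)


recvr = ToyReceiver(capacity_bytes=100)
recvr.add_batch(now_ms=0, num_bytes=100)    # fits in the queue
recvr.add_batch(now_ms=10, num_bytes=100)   # queue is full, so deferred
recvr.add_batch(now_ms=20, num_bytes=100)   # deferred as well
recvr.sample()
recvr.get_batch(now_ms=500)                 # slow consumer finally takes a batch
recvr.sample()
```

      In this toy run the deferred-batch series starts high and only drops once the consumer dequeues, which is the kind of signal the proposed counters are meant to expose.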

      The above items help identify cases in which one fragment instance containing an exchange node is slow for a period of time (e.g. the parent of the exchange node spills heavily), causing all senders to that fragment instance to block waiting for responses. Because all senders are blocked waiting for their previous RPC to complete, they do not produce more rows, so all other fragment instances are starved, which leads to the high wait time shown in their receivers' profiles. The time series counter for the number of deferred row batches in a receiver helps identify the cases described above.
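
      As a rough illustration of how such a time series might be read, the sketch below flags receivers whose deferred-batch count stays nonzero for most of the sampled period. The function name, the instance ids, the threshold, and the sample values are all made up for this example; none of them come from Impala or from IMPALA-6657.

```python
def find_backlogged_receivers(deferred_series, min_fraction=0.5):
    """Return ids whose deferred-batch samples are nonzero in at least
    min_fraction of the samples (a hypothetical heuristic)."""
    flagged = []
    for instance_id, samples in deferred_series.items():
        if not samples:
            continue  # no samples recorded, nothing to judge
        nonzero = sum(1 for s in samples if s > 0)
        if nonzero / len(samples) >= min_fraction:
            flagged.append(instance_id)
    return sorted(flagged)


# Invented sample data: deferred row batches per sampling interval.
series = {
    "instance-0": [0, 0, 1, 0, 0, 0],
    "instance-1": [0, 4, 9, 12, 11, 3],   # persistent backlog: likely the slow one
    "instance-2": [0, 0, 0, 0, 0, 0],
}
slow = find_backlogged_receivers(series)
```

      A receiver flagged this way is a candidate for the slow fragment instance, while receivers with empty deferred queues and long wait times are more likely the starved ones.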


          People

            kwho Michael Ho