IMPALA / IMPALA-5473

Make diagnosing network issues easier


    Details

      Description

      With our current metrics in the profile, it's hard to debug queries that get slow throughput from their exchanges.

      The following cases have different causes, but similar symptoms (e.g. a high InactiveTimer in the xchg profile):

      1. Downstream sender does not produce rows quickly (perhaps because its child instances do not produce rows quickly).

      2. Downstream sender cannot send rows quickly, perhaps because of network congestion.

      3. Downstream sender does not start producing rows until some time after the upstream has started (captured by FirstBatchArrivalWaitTime).

      4. Downstream sender does not close stream until some time after all rows are sent.

      We should improve these metrics so that the runtime profile clearly shows who is slow, and why. Distinguishing cases 1 and 2 is particularly important.
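
To make the four cases concrete, here is a minimal sketch of how separate per-sender timings could tell them apart. It is only an illustration: the field names (`produce_wait_ms`, `send_wait_ms`, `close_delay_ms`, `first_batch_wait_ms`) and the `dominant_cause` helper are hypothetical and are not existing Impala counters. Only FirstBatchArrivalWaitTime and InactiveTimer are named in this issue.

```python
from dataclasses import dataclass


@dataclass
class SenderTimings:
    """Hypothetical per-sender timings in milliseconds (not real Impala counters)."""
    first_batch_wait_ms: float  # delay before the first batch arrived (cf. FirstBatchArrivalWaitTime)
    produce_wait_ms: float      # sender waited on its child instances for rows (case 1)
    send_wait_ms: float         # sender was blocked trying to send rows (case 2)
    close_delay_ms: float       # gap between the last row sent and the stream close (case 4)


def dominant_cause(t: SenderTimings) -> str:
    """Return the label of the case that accounts for the most time."""
    buckets = {
        "1: slow row production": t.produce_wait_ms,
        "2: slow network send": t.send_wait_ms,
        "3: late first batch": t.first_batch_wait_ms,
        "4: late stream close": t.close_delay_ms,
    }
    # Pick the label with the largest time; ties go to the first label listed.
    return max(buckets, key=buckets.get)


if __name__ == "__main__":
    # Both senders would show up as a high InactiveTimer on the receiving side,
    # but the split timings point at different causes.
    slow_child = SenderTimings(first_batch_wait_ms=20, produce_wait_ms=900,
                               send_wait_ms=30, close_delay_ms=5)
    congested = SenderTimings(first_batch_wait_ms=20, produce_wait_ms=30,
                              send_wait_ms=900, close_delay_ms=5)
    print(dominant_cause(slow_child))
    print(dominant_cause(congested))
```

The point of the sketch is that cases 1 and 2 can only be separated if time spent waiting for rows and time spent blocked on the send are recorded separately on the sender side. A single inactivity timer on the receiver cannot distinguish them.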

              People

              • Assignee: Unassigned
              • Reporter: Henry Robinson (henryr)