IMPALA-5473

Make diagnosing network issues easier

Details

    Description

      With the metrics currently in the runtime profile, it's hard to debug queries whose exchanges have low throughput.

      The following cases have different causes, but similar symptoms (e.g. a high InactiveTimer in the xchg profile):

      1. Downstream sender does not produce rows quickly (perhaps because its child instances do not produce rows quickly).

      2. Downstream sender cannot send rows quickly, perhaps because of network congestion.

      3. Downstream sender does not start producing rows until some time after the upstream has started (captured by FirstBatchArrivalWaitTime).

      4. Downstream sender does not close stream until some time after all rows are sent.

      We should try to improve these metrics so that all the information about who is slow, and why, is available clearly in the runtime profile. Distinguishing cases 1 and 2 is particularly important.
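
      As a rough illustration of what "distinguishing the cases" could look like, the sketch below assumes each sender reported one timer per case. The class, field names, and labels are hypothetical and are not existing Impala counters; only FirstBatchArrivalWaitTime is mentioned above, and the mapping to it is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SenderTimings:
    """Hypothetical per-sender timers, in seconds (not actual Impala counters)."""
    produce_wait_s: float      # case 1: waiting on child instances for rows
    send_blocked_s: float      # case 2: blocked while sending over the network
    first_batch_wait_s: float  # case 3: delay before the first batch arrives
    close_delay_s: float       # case 4: delay between last row and stream close


def dominant_cause(t: SenderTimings) -> str:
    """Return the label of the largest timer, i.e. the most likely bottleneck."""
    timers = {
        "slow producer": t.produce_wait_s,
        "slow network send": t.send_blocked_s,
        "late start": t.first_batch_wait_s,
        "late close": t.close_delay_s,
    }
    # On a tie, max() returns the first label in insertion order.
    return max(timers, key=timers.get)


# A sender that spent most of its time blocked in the network layer.
print(dominant_cause(SenderTimings(0.2, 4.5, 0.1, 0.0)))
```

      With separate timers like these, cases 1 and 2 no longer collapse into a single inactive-time figure on the receiving side.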

            People

              Assignee: Unassigned
              Reporter: Henry Robinson (henryr)