Details
- Type: Bug
- Status: Open
- Priority: Critical
- Resolution: Unresolved
- Affects Version/s: Impala 2.11.0
- Fix Version/s: None
- Component/s: None
- Labels: ghx-label-4
Description
While running a highly concurrent spilling workload on a large cluster, queries start running slower; even lightweight queries that are not spilling are affected by the slowdown.
EXCHANGE_NODE (id=9):(Total: 3m1s, non-child: 3m1s, % non-child: 100.00%)
   - ConvertRowBatchTime: 999.990us
   - PeakMemoryUsage: 0
   - RowsReturned: 108.00K (108001)
   - RowsReturnedRate: 593.00 /sec
  DataStreamReceiver:
     BytesReceived(4s000ms): 254.47 KB, 338.82 KB, 338.82 KB, 852.43 KB, 1.32 MB, 1.33 MB, 1.50 MB, 2.53 MB, 2.99 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.00 MB, 3.16 MB, 3.49 MB, 3.80 MB, 4.15 MB, 4.55 MB, 4.84 MB, 4.99 MB, 5.07 MB, 5.41 MB, 5.75 MB, 5.92 MB, 6.00 MB, 6.00 MB, 6.00 MB, 6.07 MB, 6.28 MB, 6.33 MB, 6.43 MB, 6.67 MB, 6.91 MB, 7.29 MB, 8.03 MB, 9.12 MB, 9.68 MB, 9.90 MB, 9.97 MB, 10.44 MB, 11.25 MB
     - BytesReceived: 11.73 MB (12301692)
     - DeserializeRowBatchTimer: 957.990ms
     - FirstBatchArrivalWaitTime: 0.000ns
     - PeakMemoryUsage: 644.44 KB (659904)
     - SendersBlockedTimer: 0.000ns
     - SendersBlockedTotalTimer(*): 0.000ns

DataStreamSender (dst_id=9):(Total: 1s819ms, non-child: 1s819ms, % non-child: 100.00%)
   - BytesSent: 234.64 MB (246033840)
   - NetworkThroughput(*): 139.58 MB/sec
   - OverallThroughput: 128.92 MB/sec
   - PeakMemoryUsage: 33.12 KB (33920)
   - RowsReturned: 108.00K (108001)
   - SerializeBatchTime: 133.998ms
   - TransmitDataRPCTime: 1s680ms
   - UncompressedRowBatchSize: 446.42 MB (468102200)
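As a sanity check on the counters above, the reported rates can be re-derived from the raw values. This is a hedged sketch: the variable names below are local labels, not Impala counter APIs, and the reproduced figures are approximate because the profile rounds its timers.

```python
# Cross-check of the profile counters quoted above (values copied from the report).

exchange_total_s = 3 * 60 + 1          # EXCHANGE_NODE Total: 3m1s
rows_returned = 108001                 # RowsReturned: 108.00K (108001)

# RowsReturnedRate ~= RowsReturned / Total time
rows_per_sec = rows_returned / exchange_total_s
print(f"{rows_per_sec:.0f} rows/sec")  # ~597/sec, close to the reported 593.00 /sec

bytes_sent = 246033840                 # BytesSent: 234.64 MB (246033840)
sender_total_s = 1.819                 # DataStreamSender Total: 1s819ms
rpc_time_s = 1.680                     # TransmitDataRPCTime: 1s680ms

# OverallThroughput divides by the sender's total time; NetworkThroughput(*)
# divides by only the time spent inside the TransmitData RPC, so it is higher.
overall = bytes_sent / (1024 ** 2) / sender_total_s   # ~128.99 MB/sec (reported 128.92)
network = bytes_sent / (1024 ** 2) / rpc_time_s       # ~139.67 MB/sec (reported 139.58)
print(f"overall {overall:.2f} MB/sec, network {network:.2f} MB/sec")
```

The gap between the two throughput figures is the sender-side overhead (serialization, queuing) outside the RPC itself; here it is small, so the slowdown is dominated by time on the wire or in the receiver.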
The timeouts seen in IMPALA-6285 are caused by this issue:
I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client foo-17.domain.com:22000 timed-out during recv call.
@ 0x957a6a impala::Status::Status()
@ 0x11dd5fe impala::DataStreamSender::Channel::DoTransmitDataRpc()
@ 0x11ddcd4 impala::DataStreamSender::Channel::TransmitDataHelper()
@ 0x11de080 impala::DataStreamSender::Channel::TransmitData()
@ 0x11e1004 impala::ThreadPool<>::WorkerThread()
@ 0xd10063 impala::Thread::SuperviseThread()
@ 0xd107a4 boost::detail::thread_data<>::run()
@ 0x128997a (unknown)
@ 0x7f68c5bc7e25 start_thread
@ 0x7f68c58f534d __clone
Similar behavior was also observed with KRPC enabled (IMPALA-6048).
Attachments
Issue Links
- duplicates
  - IMPALA-6048 Queries make very slow progress and report WaitForRPC() stuck for too long (Resolved)
- relates to
  - IMPALA-6285 Avoid printing the stack as part of DoTransmitDataRpc as it leads to burning lots of kernel CPU (Resolved)
  - IMPALA-6048 Queries make very slow progress and report WaitForRPC() stuck for too long (Resolved)