Details
-
Sub-task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 3.0, Impala 2.12.0
-
None
-
Description
There is currently no way to manage the network bandwidth usage of a query. As a result, a query that shuffles a huge amount of data can slow down other concurrent queries. The following shows the observed bandwidth of a query when it runs alone (good case) and when it runs concurrently with another query that shuffles a lot of data across the network (bad case). We should consider extending the resource pool concept to also manage network usage.
Good case:
DataStreamSender (dst_id=4)
  - BytesSent: 828.3 MiB (868564531)
  - InactiveTotalTime: 0ns (0)
  - NetworkThroughput(*): 706.4 MiB/s (740751383)

Bad case:
DataStreamSender (dst_id=4)
  - BytesSent: 828.3 MiB (868564531)
  - InactiveTotalTime: 0ns (0)
  - NetworkThroughput(*): 182.3 MiB/s (191106930)
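To put a number on the slowdown, the two NetworkThroughput(*) counters can be compared directly. The following is a minimal sketch, not part of Impala: the helper name `throughput_bytes_per_sec` and the regular expression are hypothetical and assume the counter line looks exactly like the profile excerpts in this description.

```python
import re

# Profile excerpts copied from the description above.
GOOD_CASE = """DataStreamSender (dst_id=4)
  - BytesSent: 828.3 MiB (868564531)
  - InactiveTotalTime: 0ns (0)
  - NetworkThroughput(*): 706.4 MiB/s (740751383)"""

BAD_CASE = """DataStreamSender (dst_id=4)
  - BytesSent: 828.3 MiB (868564531)
  - InactiveTotalTime: 0ns (0)
  - NetworkThroughput(*): 182.3 MiB/s (191106930)"""


def throughput_bytes_per_sec(profile_text):
    """Return the raw value in parentheses after NetworkThroughput(*).

    Hypothetical helper: assumes the raw counter follows the
    human-readable value on the same line, as in the excerpts above.
    """
    match = re.search(r"NetworkThroughput\(\*\):[^()]*\((\d+)\)", profile_text)
    if match is None:
        raise ValueError("NetworkThroughput(*) counter not found")
    return int(match.group(1))


good = throughput_bytes_per_sec(GOOD_CASE)
bad = throughput_bytes_per_sec(BAD_CASE)

# How many times lower the sender's throughput is under contention.
slowdown = good / bad
# Fraction of the uncontended throughput that was lost.
loss = 1 - bad / good

print(f"slowdown: {slowdown:.2f}x, throughput lost: {loss:.1%}")
```

With the counters above, the sender's throughput under contention is roughly 3.9 times lower than when the query runs alone, even though BytesSent is identical in both runs.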