Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Cannot Reproduce
Description
If TPCH Q1 is run 10 times in a row on the same cluster with the IO elevator, the perf stays relatively consistent (or improves due to JIT and the like). Without it, there's a large perf dip in the middle (usually around DAG #3 for me), followed by recovery.
This dip is not caused directly by GC time, and at least in my case I cannot see any particular part becoming slower in the YK profiles, or any obvious correlates. Tasks visibly slow down from 3-7 seconds to 10-30 seconds (up to 60 with slow HDFS reads). One thing that does happen is that kernel CPU time goes up to 9-12% on all the daemons while the slowdown is ramping up and occurring, compared to the usual 0-3%. I didn't investigate much where that comes from, since this is not a mainline scenario.
Still, it would be interesting to learn what causes this.
YK dumps are available upon request.
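For anyone trying to reproduce the kernel CPU observation, here is a minimal sketch of one way to measure it, assuming a Linux host. It computes the share of CPU ticks spent in kernel mode between two samples of the aggregate `cpu` line in `/proc/stat`. The helper names are hypothetical and not part of Hive or LLAP. The figure is host-wide rather than per-daemon, and it counts only the `system` field (not `irq` or `softirq`), so it only approximates the numbers quoted above.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def kernel_cpu_percent(before, after):
    """Percentage of CPU ticks spent in kernel mode between two samples.

    Uses the first eight fields (user, nice, system, idle, iowait, irq,
    softirq, steal); guest time is skipped because it is already
    included in user time.
    """
    b = parse_cpu_line(before)[:8]
    a = parse_cpu_line(after)[:8]
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[2] / total  # index 2 is the 'system' field


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (Linux only)."""
    with open(path) as f:
        return f.readline()


def sample_kernel_cpu(interval_seconds=5.0):
    """Take two samples interval_seconds apart and return the kernel CPU %."""
    before = read_cpu_line()
    time.sleep(interval_seconds)
    return kernel_cpu_percent(before, read_cpu_line())
```

Calling `sample_kernel_cpu()` in a loop on each node while the 10 runs execute would give a time series to line up against the DAG start times.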
Issue Links
- is duplicated by HIVE-10474 LLAP: investigate why TPCH Q1 1k is slow (Resolved)