Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Looks exactly like HIVE-10744. Last comment there has internal app IDs. Logs upon request.
6 tasks (the number of slots) from one machine are stuck.
jstack for target daemon sayeth:
Found one Java-level deadlock:
=============================

"IPC Server handler 4 on 15001":
  waiting to lock Monitor@0x00007f3cb0005cb8 (Object@0x000000008cc3ce98, a java/lang/Object),
  which is held by "Wait-Queue-Scheduler-0"
"Wait-Queue-Scheduler-0":
  waiting to lock Monitor@0x00007f3cb0004d98 (Object@0x000000009234cf58, a org/apache/hadoop/hive/llap/daemon/impl/QueryInfo$FinishableStateTracker),
  which is held by "IPC Server handler 4 on 15001"
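For anyone not used to reading these reports: two threads each holding one monitor while waiting for the other's is the usual signature of a lock-order inversion. Below is a minimal Java sketch of that pattern, not the actual LLAP daemon code. The class name, thread names, and the two lock objects are hypothetical stand-ins for the scheduler lock and the FinishableStateTracker monitor. It also shows how the same condition can be detected in-process with the standard `ThreadMXBean.findDeadlockedThreads()` call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionSketch {

    /**
     * Starts two daemon threads that take two locks in opposite order,
     * then polls the JVM until it reports them as deadlocked.
     * Returns the number of deadlocked threads found (0 if none was seen).
     */
    public static int reproduceAndDetect() {
        // Hypothetical stand-ins for the two monitors in the jstack report.
        final Object schedulerLock = new Object();
        final Object trackerLock = new Object();
        // Makes sure each thread holds its first lock before trying the second.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread handler = new Thread(() -> {
            synchronized (trackerLock) {          // first tracker ...
                awaitBoth(bothHoldFirstLock);
                synchronized (schedulerLock) { }  // ... then scheduler
            }
        }, "ipc-handler-sketch");

        Thread scheduler = new Thread(() -> {
            synchronized (schedulerLock) {        // first scheduler ...
                awaitBoth(bothHoldFirstLock);
                synchronized (trackerLock) { }    // ... then tracker
            }
        }, "wait-queue-scheduler-sketch");

        // Daemon threads so the stuck pair does not keep the JVM alive.
        handler.setDaemon(true);
        scheduler.setDaemon(true);
        handler.start();
        scheduler.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void awaitBoth(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduceAndDetect());
    }
}
```

The usual remedy for this pattern is to take the two locks in one consistent order, or to avoid calling into the other component while holding one of them. Whether that applies to the fix for this issue is not stated here.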
Oh, this time it is not q1; I was running a bunch of TPCDS queries in sequence for some cache test. No parallel queries. There may have been task failures before.
The query that got stuck had lots and lots of reducers:
Map 1: 1/1 Map 10: 1/1 Map 11: 85/85 Map 13: 1/1 Map 14: 1/1 Map 15: 1/1 Map 16: 1/1 Map 17: 94/94 Map 19: 1/1 Map 2: 1/1 Map 20: 1/1 Map 3: 91/91 Map 7: 1/1 Map 8: 1/1 Map 9: 1/1 Reducer 12: 391/391 Reducer 18: 197/197 Reducer 4: 1009/1009 Reducer 5: 1003(+6)/1009 Reducer 6: 0(+1)/1
I think it's query 58.