Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: llap
    • Component/s: None
    • Labels: None

    Description

      Looks exactly like HIVE-10744. The last comment there has the internal app IDs. Logs are available upon request.
      Six tasks (the number of slots) from one machine are stuck.
      jstack for target daemon sayeth:

         Found one Java-level deadlock:
         =============================

         "IPC Server handler 4 on 15001":
           waiting to lock Monitor@0x00007f3cb0005cb8 (Object@0x000000008cc3ce98, a java/lang/Object),
           which is held by "Wait-Queue-Scheduler-0"
         "Wait-Queue-Scheduler-0":
           waiting to lock Monitor@0x00007f3cb0004d98 (Object@0x000000009234cf58, a org/apache/hadoop/hive/llap/daemon/impl/QueryInfo$FinishableStateTracker),
           which is held by "IPC Server handler 4 on 15001"
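
      The jstack output above shows the classic lock-order inversion: each thread holds one monitor and waits for the other. A minimal Java sketch of that pattern follows. It is not the LLAP daemon code; the class, method, and thread names are hypothetical, and the two plain objects only stand in for the scheduler lock and the FinishableStateTracker monitor. It also shows how the JVM reports such a cycle through ThreadMXBean, which is the same information jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Illustrative only: reproduces a two-monitor deadlock and detects it.
class LockOrderDeadlockSketch {
    // Hypothetical stand-ins for the two monitors named in the jstack output.
    private static final Object schedulerLock = new Object();
    private static final Object trackerLock = new Object();

    // Starts a daemon thread that locks `first`, waits until its peer also
    // holds a lock, and then tries to lock `second`.
    static Thread lockInOrder(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached when the peer locks in the opposite order.
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though the thread is stuck
        t.start();
        return t;
    }

    // Polls the JVM for monitor deadlocks; returns how many threads are
    // involved, or 0 if none is found before the timeout.
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        CountDownLatch latch = new CountDownLatch(2);
        // Opposite acquisition orders, as in the two stack entries above.
        lockInOrder("ipc-handler-sketch", trackerLock, schedulerLock, latch);
        lockInOrder("wait-queue-scheduler-sketch", schedulerLock, trackerLock, latch);
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

      The general remedy for this pattern is to have every code path take the two locks in the same order, or to avoid calling out to other components while holding a lock. Whether the attached patch does either is not stated here.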
      

      Oh, this time it is not q1; I was running a bunch of TPC-DS queries in sequence for some cache test. No parallel queries. There may have been task failures before.
      The query that got stuck had lots and lots of reducers:

      Map 1: 1/1    Map 10: 1/1    Map 11: 85/85    Map 13: 1/1    Map 14: 1/1    Map 15: 1/1    Map 16: 1/1    Map 17: 94/94    Map 19: 1/1    Map 2: 1/1    Map 20: 1/1    Map 3: 91/91    Map 7: 1/1    Map 8: 1/1    Map 9: 1/1    Reducer 12: 391/391    Reducer 18: 197/197    Reducer 4: 1009/1009    Reducer 5: 1003(+6)/1009    Reducer 6: 0(+1)/1
      

      I think it's query 58.

      Attachments

        1. HIVE-10842.1.txt
          4 kB
          Siddharth Seth

          People

            Assignee: sseth Siddharth Seth
            Reporter: sershe Sergey Shelukhin