Hive / HIVE-15529

LLAP: TaskSchedulerService can get stuck when scheduling tasks as disabled node is not re-enabled in NodeEnablerCallable

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: llap
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      An easy way to reproduce the issue:
      1. Start the Hive CLI with "--hiveconf hive.execution.mode=llap".
      2. Run a SQL script file (e.g. a script containing TPC-DS queries).
      3. In the middle of the run, press Ctrl+C to interrupt the current job. This should not exit the Hive CLI.
      4. After some time, launch the same SQL script in the same CLI session. It gets stuck indefinitely, waiting for the splits to be computed.

      Even after the CLI is quit, the AM keeps running until it is explicitly killed.

      The issue appears to be in LlapTaskSchedulerService::schedulePendingTasks, in how its loop handles a DELAYED_RESOURCES result from task scheduling.
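
The failure mode named in the title can be illustrated with a small, self-contained Java sketch. This is not the Hive code and not the contents of the attached patch: the class SchedulerStallSketch, its fields, and its methods are hypothetical names chosen for illustration, and the real LlapTaskSchedulerService is multi-threaded and considerably more involved. The sketch only assumes what the report states: a scheduling attempt returns DELAYED_RESOURCES while no node is usable, and a node-enabler step is expected to re-enable disabled nodes. A bounded retry count stands in for the real, unbounded loop.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, single-threaded model of the stall; not the actual Hive classes.
public class SchedulerStallSketch {
    enum ScheduleResult { SCHEDULED, DELAYED_RESOURCES }

    static final class Node {
        final String host;
        boolean enabled = true;
        Node(String host) { this.host = host; }
    }

    private final List<Node> nodes = new ArrayList<>();
    private final Queue<Node> disabledNodes = new ArrayDeque<>();
    // Assumption: models whether the enabler step actually flips the node back on.
    private final boolean enablerReenablesNodes;

    SchedulerStallSketch(boolean enablerReenablesNodes, String... hosts) {
        this.enablerReenablesNodes = enablerReenablesNodes;
        for (String host : hosts) {
            nodes.add(new Node(host));
        }
    }

    // Marks every node as disabled and queues it for the enabler step.
    void disableAll() {
        for (Node node : nodes) {
            node.enabled = false;
            disabledNodes.add(node);
        }
    }

    // One pass of the enabler: dequeue a disabled node and, if the enabler
    // behaves correctly, mark it usable again.
    void runNodeEnablerOnce() {
        Node node = disabledNodes.poll();
        if (node != null && enablerReenablesNodes) {
            node.enabled = true;
        }
    }

    // A task can be placed only if at least one node is enabled.
    ScheduleResult scheduleTask() {
        for (Node node : nodes) {
            if (node.enabled) {
                return ScheduleResult.SCHEDULED;
            }
        }
        return ScheduleResult.DELAYED_RESOURCES;
    }

    // Retries while resources are delayed; maxAttempts bounds the loop so the
    // sketch terminates even in the stuck case.
    static ScheduleResult schedulePendingTask(SchedulerStallSketch scheduler, int maxAttempts) {
        ScheduleResult result = ScheduleResult.DELAYED_RESOURCES;
        for (int i = 0; i < maxAttempts && result == ScheduleResult.DELAYED_RESOURCES; i++) {
            scheduler.runNodeEnablerOnce();
            result = scheduler.scheduleTask();
        }
        return result;
    }

    public static void main(String[] args) {
        SchedulerStallSketch stuck = new SchedulerStallSketch(false, "node1", "node2");
        stuck.disableAll();
        System.out.println("enabler skips re-enable: " + schedulePendingTask(stuck, 10));

        SchedulerStallSketch healthy = new SchedulerStallSketch(true, "node1", "node2");
        healthy.disableAll();
        System.out.println("enabler re-enables:      " + schedulePendingTask(healthy, 10));
    }
}
```

In this model, when the enabler dequeues a node without re-enabling it, every scheduling attempt keeps returning DELAYED_RESOURCES, which in an unbounded loop would look like the indefinite hang described above. When the enabler does re-enable the node, the first retry succeeds.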

      Attachments

        1. HIVE-15529.1.patch
          1 kB
          Rajesh Balamohan

          People

            Assignee: Rajesh Balamohan (rajesh.balamohan)
            Reporter: Rajesh Balamohan (rajesh.balamohan)