Apache Drill / DRILL-3030

Foreman hangs trying to cancel non-root fragments


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.2.0
    • Component/s: Execution - Flow
    • Labels: None

    Description

      Steps to reproduce:

      1. Ran a long-running query after a clean Drill restart.
      2. Killed a non-foreman node.
      3. Restarted the drillbits using clush.

      One of the drillbits (coincidentally, always a foreman node) refused to shut down.

      jstack shows that the foreman is waiting:

        at org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:105)
        at org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:81)
        - locked <0x000000073878aaa8> (a org.apache.drill.exec.rpc.control.ControlConnectionManager)
        at org.apache.drill.exec.rpc.control.ControlTunnel.cancelFragment(ControlTunnel.java:57)
        at org.apache.drill.exec.work.foreman.QueryManager.cancelExecutingFragments(QueryManager.java:192)
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:824)
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:768)
        at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73)
        at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:770)
        at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:871)
        at org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:107)
        at org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1132)
        at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:460)
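
      The trace shows the thread inside waitAndRun while holding the monitor on a ControlConnectionManager, reached from cancelFragment. One plausible reading, not confirmed in this issue (it was closed as Cannot Reproduce), is that the foreman waits without a bound for a connection to the node that was killed. The sketch below illustrates only the general pattern of bounding such a wait. It is not Drill's code or its fix: BoundedCancelSketch, CancelOutcome and cancelWithTimeout are hypothetical names, and a plain CompletableFuture stands in for the connection future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative sketch only; none of these names exist in Drill. */
public class BoundedCancelSketch {

    /** Hypothetical outcome of trying to deliver a cancel request to a peer. */
    public enum CancelOutcome { SENT, PEER_UNREACHABLE, INTERRUPTED }

    /**
     * Waits at most timeoutMillis for the connection future to complete.
     * An unbounded connection.get() would block forever if the peer is dead,
     * which is the kind of hang the thread dump above suggests.
     */
    public static CancelOutcome cancelWithTimeout(CompletableFuture<String> connection,
                                                  long timeoutMillis) {
        try {
            // Bounded wait: returns the connection or throws TimeoutException.
            connection.get(timeoutMillis, TimeUnit.MILLISECONDS);
            // A real system would send the cancel message over the connection here.
            return CancelOutcome.SENT;
        } catch (TimeoutException | ExecutionException e) {
            // The peer never connected (or connecting failed): give up instead of hanging.
            return CancelOutcome.PEER_UNREACHABLE;
        } catch (InterruptedException e) {
            // Preserve the interrupt so a shutdown request is not swallowed.
            Thread.currentThread().interrupt();
            return CancelOutcome.INTERRUPTED;
        }
    }

    public static void main(String[] args) {
        // A future that is already complete models a reachable node.
        CompletableFuture<String> livePeer =
                CompletableFuture.completedFuture("connection-to-live-node");
        // A future that is never completed models the killed node.
        CompletableFuture<String> deadPeer = new CompletableFuture<>();

        System.out.println(cancelWithTimeout(livePeer, 200)); // prints SENT
        System.out.println(cancelWithTimeout(deadPeer, 200)); // prints PEER_UNREACHABLE
    }
}
```

      With a bounded wait, the caller can treat an unreachable node's fragments as already terminated and continue shutting down, rather than blocking while it holds the connection manager's lock.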
      

      Attachments

        1. threadstack (605 kB), Ramana Inukonda Nagaraj


          People

            cchang@maprtech.com Chun Chang
            inramana Ramana Inukonda Nagaraj
            Votes: 0
            Watchers: 4
