IMPALA-10477 (sub-task of IMPALA-7872: Extended health checks to mark node as down)

Mark executor node as down if it repeatedly fails to start up fragment instances


Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      If an executor node detects that it is unhealthy, for example because fragment instance startup fails repeatedly, it should report its unhealthy state to the statestore so that the node can be removed from its executor group and coordinators stop scheduling new tasks on it.
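
      The behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not Impala's implementation: the class `StartupHealthTracker`, the `max_consecutive_failures` threshold, and the `report_unhealthy` callback (a stand-in for whatever update the executor would send to the statestore) are all hypothetical names, and the issue does not say how "repeatedly" should be measured. The sketch assumes a simple count of consecutive failures.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StartupHealthTracker:
    """Hypothetical tracker for fragment instance startup results."""

    # Assumed threshold; the issue does not specify a value or a policy.
    max_consecutive_failures: int
    # Stand-in for reporting the unhealthy state to the statestore.
    report_unhealthy: Callable[[], None]
    consecutive_failures: int = 0
    healthy: bool = True

    def record_startup(self, succeeded: bool) -> None:
        """Record one startup attempt and report once when the threshold is hit."""
        if succeeded:
            # A successful startup resets the failure streak.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.healthy and self.consecutive_failures >= self.max_consecutive_failures:
            self.healthy = False
            self.report_unhealthy()


reports = []
tracker = StartupHealthTracker(
    max_consecutive_failures=3,
    report_unhealthy=lambda: reports.append("unhealthy"),
)
# Two failures, one success (resets the streak), then four failures in a row.
for outcome in [False, False, True, False, False, False, False]:
    tracker.record_startup(outcome)

print(tracker.healthy, reports)  # prints: False ['unhealthy']
```

      The callback fires only once, when the streak first reaches the threshold, so a node that keeps failing does not flood the statestore with repeated reports. How a node would return to the healthy state is not covered by the issue text, so the sketch leaves it out.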

       


          People

            wzhou Wenzhe Zhou

