IMPALA-8677

Removing an unused node does not leave consistent remote scheduling unchanged


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: Impala 3.2.0
    • Fix Version: Impala 3.3.0
    • Component: Backend
    • Labels: None
    • ghx-label-8

    Description

      When working on IMPALA-8630, I discovered that SchedulerTest::RemoteExecutorCandidateConsistency works mostly by happenstance.

      The root of the issue is that Scheduler::GetRemoteExecutorCandidates() wants to avoid returning duplicates, so it puts all the IpAddrs in a set:

      set<IpAddr> distinct_backends;
      ...
      distinct_backends.insert(*executor_addr);
      ...
      for (const IpAddr& addr : distinct_backends) {
        remote_executor_candidates->push_back(addr);
      }

      Because std::set is an ordered container, this sorts the IpAddrs, so remote_executor_candidates no longer returns executors in the order in which they were encountered on the hash ring.

      Suppose that we are running with num_remote_executor_candidates=2 and random replica selection turned off. There is exactly one file. GetRemoteExecutorCandidates() returns these executor candidates (IpAddrs):

      {192.168.1.2, 192.168.1.3}

      The first entry, 192.168.1.2, is chosen because it is first. Nothing was scheduled on 192.168.1.3, yet removing it can still change the scheduling outcome, because of the sort. Suppose 192.168.1.3 is gone and the next closest executor on the hash ring is 192.168.1.1 (or any address that sorts before 192.168.1.2). Even though it is farther away in the context of the hash ring, GetRemoteExecutorCandidates() would return:

      {192.168.1.1, 192.168.1.2}

      and the first entry would be chosen.

      To eliminate this inconsistency, it might be useful to retain the order in which elements are encountered on the hash ring.

      In terms of impact, the sorting increases the number of files whose scheduling could change when a node leaves, and some of those changes are unnecessary. If random replica selection is set to true, the order doesn't matter. It is unclear how large the impact is otherwise.

      People

        Assignee: Joe McDonnell
        Reporter: Joe McDonnell
        Votes: 0
        Watchers: 3
