  1. Apache IoTDB
  2. IOTDB-5883

Refactor redirection and dispatching target


Details

    Description

      The current redirection mechanism has the following issues:

      1. A redirection from a lower level (e.g., the consensus layer) is overwritten by QueryExecution. Even if the TSStatus from the lower level is already REDIRECTION_RECOMMEND, QueryExecution still recalculates the redirection. Even worse, the recalculated redirection may point to a wrong node (see the second issue for an explanation), even though the client may already be sending requests to the right node.

      2. The dispatching target and redirection target can be stale. For each FragmentInstance, the dispatching target and redirection target are derived from the PartitionCache: the very first node in the associated ReplicaSet is chosen as both.
      However, as the PartitionCache is not updated after a leadership change, the first node in a ReplicaSet may not be the leader/primary/master node.
      As a result, the FragmentInstance may be dispatched/redirected to a non-leader node, which will incur further redirection.

      Solutions:

      1. QueryExecution will calculate the redirection only when the TSStatus from the lower level is REDIRECTION_RECOMMEND and does not include a redirection node.
      Such a situation is rather rare, since most REDIRECTION_RECOMMEND statuses returned by the lower level include a redirection node.

      2. In each ReplicaSet, an optional preferred location is recorded. When the preferred location is set, it will be chosen as the dispatching target and redirection target.
      When REDIRECTION_RECOMMEND is returned from the lower level and a redirection node is included, the preferred location of the ReplicaSet will be updated to that node.
      Furthermore, if the node that generates the MPP plan is in the ReplicaSet, the FragmentInstance will not be dispatched to another node. This is because the consensus layer is more likely than the PartitionCache to know who the leader is. Consequently, a consensus-layer redirection is more accurate than an MPP-level redirection.
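
      The two solutions above can be sketched roughly as follows. This is a minimal illustration, not IoTDB's actual code: the class RedirectionSketch, its nested ReplicaSet and Status types, and the methods target, dispatchTarget, setPreferred, and resolveRedirection are all hypothetical names, nodes are represented as plain strings, and Status is a simplified stand-in for TSStatus.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names are taken from the IoTDB code base. */
final class RedirectionSketch {
  private RedirectionSketch() {}

  /** A replica set with an optional preferred location (solution 2). */
  static final class ReplicaSet {
    private final List<String> nodes; // order as stored in the partition cache
    private String preferred;         // null until a lower-level redirection names a node

    ReplicaSet(List<String> nodes) {
      this.nodes = nodes;
    }

    /** Dispatching/redirection target: the preferred location if set, else the first node. */
    String target() {
      return preferred != null ? preferred : nodes.get(0);
    }

    /** If the node that generated the plan is itself a replica, do not dispatch elsewhere. */
    String dispatchTarget(String planningNode) {
      return nodes.contains(planningNode) ? planningNode : target();
    }

    void setPreferred(String node) {
      this.preferred = node;
    }
  }

  /** Simplified stand-in for a TSStatus returned by the lower level. */
  static final class Status {
    final boolean redirectionRecommended;
    final String redirectNode; // null when no redirection node is included

    Status(boolean redirectionRecommended, String redirectNode) {
      this.redirectionRecommended = redirectionRecommended;
      this.redirectNode = redirectNode;
    }
  }

  /**
   * Solution 1: keep a redirection supplied by the lower level and compute one only
   * when the status is REDIRECTION_RECOMMEND without a redirection node.
   */
  static Optional<String> resolveRedirection(Status status, ReplicaSet replicaSet) {
    if (!status.redirectionRecommended) {
      return Optional.empty();
    }
    if (status.redirectNode != null) {
      // Remember the node named by the lower level as the preferred location.
      replicaSet.setPreferred(status.redirectNode);
      return Optional.of(status.redirectNode);
    }
    // No node supplied: fall back to the replica set's current target.
    return Optional.of(replicaSet.target());
  }

  public static void main(String[] args) {
    ReplicaSet replicaSet = new ReplicaSet(Arrays.asList("node-1", "node-2", "node-3"));
    System.out.println(replicaSet.target());
    System.out.println(resolveRedirection(new Status(true, "node-3"), replicaSet));
    System.out.println(replicaSet.target());
  }
}
```

      In this sketch, a redirection node supplied by the lower level is returned unchanged and recorded as the preferred location, so later dispatching and redirection for the same ReplicaSet use that node instead of the first cached one.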

      Attachments

        1. image-2023-05-16-12-22-29-563.png
          145 kB
          Tian Jiang
        2. image-2023-05-16-12-42-50-397.png
          21 kB
          Tian Jiang


            People

              jt2594838 Tian Jiang