Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5098

Improving fault tolerance for connection between client and foreman node.

    XMLWordPrintableJSON

    Details

      Description

      With DRILL-5015 we allowed support for specifying multiple Drillbits in connection string and randomly choosing one out of it. Over time some of the Drillbits specified in the connection string may die and the client can fail to connect to Foreman node if random selection happens to be of dead Drillbit.
      Even if ZooKeeper is used for selecting a random Drillbit from the registered one there is a small window when client selects one Drillbit and then that Drillbit went down. The client will fail to connect to this Drillbit and error out.

      Instead if we try multiple Drillbits (configurable tries count through connection string) then the probability of hitting this error window will reduce in both the cases improving fault tolerance. During further investigation it was also found that if there is Authentication failure then we throw that error as generic RpcException. We need to improve that as well to capture this case explicitly since in case of Auth failure we don't want to try multiple Drillbits.

      Connection string example with new parameter:
      jdbc:drill:drillbit=<node name>[:<port>][,<node name2>[:<port>]...;tries=5

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                shamirwasia Sorabh Hamirwasia
                Reporter:
                shamirwasia Sorabh Hamirwasia
                Reviewer:
                Chun Chang
              • Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: