Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
The current DrillClient does not recover from a Drillbit failure when the connection is initiated via a ZooKeeper quorum.
This JIRA covers adding this capability to the C++ DrillClient.
The key question to consider: how far does the DrillClient go in recovering the connection? One interesting example to consider is when the Drillbit fails in the middle of a query.
Scenario:
1. An app connects via ODBC to a Drill cluster through the ZK quorum. ZK assigns a Drillbit, node1, to the connection.
2. The app issues a query; node1 starts processing it and returns 2 RecordBatches.
3. node1 fails.
4. DrillClient detects the loss of node1 and negotiates with the quorum for a replacement, node2.
5. Question: does DrillClient try to re-execute the query and skip over the first two RecordBatches (since DrillClient knows that was the state of the connection to node1)?
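To make the question in step 5 concrete, here is a minimal sketch of the "re-execute and skip" option. It does not use the real DrillClient API: `FakeDrillbit`, `FetchStatus`, and `runWithFailover` are hypothetical names invented for illustration, and batches are modeled as plain strings. It also assumes that re-running the query on another node yields the same batches in the same order, which may not hold for a real query.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Drill client API.
enum class FetchStatus { Batch, Done, Failed };

struct FakeDrillbit {
    std::string name;
    std::vector<std::string> batches;  // batches the query produces on this node
    std::size_t failAfter;             // node dies after sending this many batches

    // Writes batch `index` to `out`, or reports completion or node failure.
    FetchStatus fetch(std::size_t index, std::string& out) const {
        if (index >= batches.size()) return FetchStatus::Done;
        if (index >= failAfter) return FetchStatus::Failed;
        out = batches[index];
        return FetchStatus::Batch;
    }
};

// Runs the query on each node in turn, failing over when a node dies.
// Each new node re-executes the query from batch 0; batches the application
// has already received are read and discarded instead of delivered twice.
std::vector<std::string> runWithFailover(const std::vector<FakeDrillbit>& quorum) {
    assert(!quorum.empty());  // precondition: at least one node to try
    std::vector<std::string> delivered;
    for (const FakeDrillbit& node : quorum) {
        std::size_t index = 0;
        std::string batch;
        FetchStatus status;
        while ((status = node.fetch(index, batch)) == FetchStatus::Batch) {
            if (index >= delivered.size()) {
                delivered.push_back(batch);  // new batch: hand it to the app
            }
            ++index;  // otherwise: already delivered before the failure, skip
        }
        if (status == FetchStatus::Done) return delivered;
        // FetchStatus::Failed: try the next node in the quorum.
    }
    throw std::runtime_error("all Drillbits failed before the query completed");
}
```

Calling `runWithFailover` with a node1 that fails after 2 of 4 batches and a healthy node2 delivers each of the 4 batches exactly once. The sketch only works because of the determinism assumption: if a re-executed query can return rows in a different order or batch layout, skipping by batch count would silently drop or duplicate data, and surfacing an error to the application may be the safer choice.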