[CASSANDRA-19598] advanced.resolve-contact-points: unresolved hostname being clobbered during reconnection - ASF JIRA

Details

Type: Bug
Status: Triage Needed
Priority: Normal
Resolution: Unresolved
Fix Version/s: None
Component/s: Client/java-driver
Labels: None
Platform: All
Impacts: None

Description

Hello, this is a bug report against version 4.18.0 of the Java driver.

I am running in an environment with 3 Cassandra nodes. Our use case requires redeploying the cluster from the ground up at midnight every day. All 3 nodes become unavailable for a short period, and 3 new nodes with 3 new IP addresses are spun up and placed behind the contact point hostname. We provide a single hostname as the contact point. When `advanced.resolve-contact-points` is set to `FALSE`, the Java driver should re-resolve the hostname for every new connection to that node. This works before and during the first redeployment, but the unresolved hostname is clobbered during the reconnection process and replaced with a resolved IP address, so the driver cannot follow any later redeployment.
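To illustrate the distinction this report relies on, here is a minimal sketch using only the JDK's `InetSocketAddress`. It is not the driver's implementation; the class name, helper names, and the hostname `cassandra.example.internal` are hypothetical. It shows that an unresolved address keeps the hostname, so a fresh DNS lookup can be done at each connection attempt, while a resolved address stays pinned to whatever IP it was created with.

```java
import java.net.InetSocketAddress;

public class ContactPointResolution {

    // Keeps the hostname as-is; no DNS lookup happens here.
    static InetSocketAddress unresolved(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // If the address is still unresolved, attempt a fresh DNS lookup now.
    // An already-resolved address is returned unchanged, so it can never
    // pick up a new IP behind the same hostname.
    static InetSocketAddress resolveNow(InetSocketAddress address) {
        return address.isUnresolved()
            ? new InetSocketAddress(address.getHostString(), address.getPort())
            : address;
    }

    public static void main(String[] args) {
        // Hypothetical contact point hostname, used for illustration only.
        InetSocketAddress contactPoint = unresolved("cassandra.example.internal", 9042);
        System.out.println("unresolved? " + contactPoint.isUnresolved());
        System.out.println("hostname:   " + contactPoint.getHostString());

        // An address built from an IP literal is resolved immediately.
        InetSocketAddress pinned = new InetSocketAddress("127.0.0.1", 9042);
        System.out.println("pinned unresolved? " + pinned.isUnresolved());
        System.out.println("same object after resolveNow? " + (resolveNow(pinned) == pinned));
    }
}
```

Calling `resolveNow` on the unresolved contact point before each connection attempt is the behavior expected when the option is `FALSE`; once the stored address is a resolved one, that re-resolution can no longer happen.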

In our case, all 3 nodes become unavailable while our CI/CD process destroys the existing cluster and replaces it with a new one. During this window of unavailability, the Java driver attempts to reconnect to each node. Internally, the driver holds resolved IP addresses for two of the nodes and the unresolved hostname for the third. Here is a screenshot that captures the internal state of the 3 nodes within `PoolManager` before the redeployment of the cluster finishes. Note that there are 2 resolved IP addresses and 1 unresolved hostname.

This 2:1 ratio of resolved IP:unresolved hostname is the correct internal state for a 3 node cluster when `advanced.resolve-contact-points` is set to `FALSE`.

Eventually, the hostname points to one of the 3 new valid nodes, and the Java driver reconnects and discovers the new peers. However, as part of this reconnection process, the internal Node that held the unresolved hostname is overwritten with a Node that has the resolved IP address:

Note that we no longer have 2 resolved IP addresses and 1 unresolved hostname; rather, we have 3 resolved IP addresses, which is an incorrect internal state when `advanced.resolve-contact-points` is set to `FALSE`. One of the nodes should have retained the unresolved hostname.
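The two internal states described above can be modeled with a small check. This is a sketch under stated assumptions, not driver code: `EndpointStateCheck`, its methods, the hostname, and the `10.0.0.x` addresses are all hypothetical, and the node endpoints are represented as plain `InetSocketAddress` values rather than the driver's own types.

```java
import java.net.InetSocketAddress;
import java.util.List;

public class EndpointStateCheck {

    // Number of endpoints that still carry an unresolved hostname.
    static long countUnresolved(List<InetSocketAddress> endpoints) {
        return endpoints.stream().filter(InetSocketAddress::isUnresolved).count();
    }

    // True if the given contact point hostname is still present, unresolved.
    static boolean retainsContactPoint(List<InetSocketAddress> endpoints, String hostname) {
        return endpoints.stream()
            .anyMatch(e -> e.isUnresolved() && e.getHostString().equals(hostname));
    }

    public static void main(String[] args) {
        String host = "cassandra.example.internal"; // hypothetical hostname

        // Expected state: 2 resolved IPs and 1 unresolved hostname.
        List<InetSocketAddress> expected = List.of(
            InetSocketAddress.createUnresolved(host, 9042),
            new InetSocketAddress("10.0.0.2", 9042),
            new InetSocketAddress("10.0.0.3", 9042));

        // Reported state after reconnection: 3 resolved IPs.
        List<InetSocketAddress> clobbered = List.of(
            new InetSocketAddress("10.0.0.4", 9042),
            new InetSocketAddress("10.0.0.5", 9042),
            new InetSocketAddress("10.0.0.6", 9042));

        System.out.println("expected retains hostname:  " + retainsContactPoint(expected, host));
        System.out.println("clobbered retains hostname: " + retainsContactPoint(clobbered, host));
    }
}
```

A check of this shape, run against the state visible in a debugger before and after the first redeployment, would distinguish the correct 2:1 state from the clobbered 3:0 state.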

At this stage, the Java driver no longer queries the hostname for new connections. Further redeployments fail because the hostname is no longer among the nodes that the driver tries during reconnection, and we have to restart the application to recover.

Attachments

- image-2024-04-29-20-13-56-161.png (30/Apr/24 01:13, 166 kB, Andrew Orlowski)
- image-2024-04-29-22-57-26-910.png (30/Apr/24 03:57, 46 kB, Andrew Orlowski)

People

Assignee: Unassigned
Reporter: Andrew Orlowski
Votes: 0
Watchers: 2

Dates

Created: 30/Apr/24 01:41
Updated: 30/Apr/24 04:03