Details
Improvement
Status: Resolved
Normal
Resolution: Fixed
None
Operability
Normal
All
None
Description
Problem Statement
Cassandra exchanges important topology and token-ownership details over Gossip. Internally, it maintains token-ownership information in two separate caches: 1) the Gossip cache and 2) the Storage Service cache. On a node, the Gossip cache is updated first, followed by the Storage Service cache. In the hot path, ownership is calculated from the Storage Service cache. Because two separate caches maintain the same information, inconsistencies are bound to happen. It is quite possible for the Gossip cache to have up-to-date ownership for the cluster while the Storage Service cache does not, and in that scenario inconsistent data will be served to the user.
Currently, Cassandra has no mechanism that detects and fixes inconsistencies between these two caches.
Long-term solution
The long-term fix is transactional metadata (https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21), which is the right way to handle such inconsistencies.
Short-term solution
CEP-21 might take some time, however, and until then there is a need to detect such inconsistencies. Once an inconsistency is detected, there are two options: 1) restart the node, or 2) fix the inconsistency on the fly.
This JIRA provides a short-term solution. Please review the pull request (on 4.1): https://github.com/apache/cassandra/pull/3548
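
As a rough illustration of the detection idea described above, the following sketch compares two views of token ownership and reports the endpoints on which they disagree. It is not the code from the pull request: the class name `OwnershipCacheCheck`, the method `findMismatches`, and the representation of each cache as a map from endpoint address to a set of `long` tokens are all assumptions made for this example, and it does not use any Cassandra API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: both caches are modelled as plain maps of
// endpoint address -> owned tokens. Real Cassandra structures differ.
public class OwnershipCacheCheck {

    // Returns, per endpoint, a description of the disagreement between the
    // two views. An empty result means the views agree on every endpoint.
    public static Map<String, String> findMismatches(Map<String, Set<Long>> gossipView,
                                                     Map<String, Set<Long>> storageServiceView) {
        Map<String, String> mismatches = new TreeMap<>();

        // Check every endpoint known to either view, so that an endpoint
        // missing from one side is also reported.
        Set<String> endpoints = new TreeSet<>(gossipView.keySet());
        endpoints.addAll(storageServiceView.keySet());

        for (String endpoint : endpoints) {
            Set<Long> fromGossip = gossipView.getOrDefault(endpoint, Collections.<Long>emptySet());
            Set<Long> fromStorage = storageServiceView.getOrDefault(endpoint, Collections.<Long>emptySet());
            if (!fromGossip.equals(fromStorage)) {
                mismatches.put(endpoint,
                               "gossip=" + new TreeSet<>(fromGossip)
                               + " storageService=" + new TreeSet<>(fromStorage));
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, Set<Long>> gossip = new HashMap<>();
        gossip.put("10.0.0.1", new HashSet<>(Arrays.asList(100L, 200L)));
        gossip.put("10.0.0.2", new HashSet<>(Arrays.asList(300L)));

        // The storage-service view is stale here: it is missing token 200.
        Map<String, Set<Long>> storageService = new HashMap<>();
        storageService.put("10.0.0.1", new HashSet<>(Arrays.asList(100L)));
        storageService.put("10.0.0.2", new HashSet<>(Arrays.asList(300L)));

        Map<String, String> mismatches = findMismatches(gossip, storageService);
        if (mismatches.isEmpty()) {
            System.out.println("Caches agree");
        } else {
            // At this point an operator-facing check could alert, restart
            // the node, or trigger an on-the-fly repair of the stale cache.
            mismatches.forEach((endpoint, detail) ->
                System.out.println("Mismatch on " + endpoint + ": " + detail));
        }
    }
}
```

A check along these lines would run periodically rather than in the hot path, since it walks every endpoint. What happens on a mismatch (restart versus on-the-fly fix) is left as a comment because the description leaves that choice open.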