Fix Version/s: None
MacOS 10.8.4, Java 1.7.0_25, JBoss community 7.1.1, Datastax Java Driver 1.0.3
When multiple clients/threads issue an UPDATE whose IF clause fails and then retry with a correct IF clause, updates for that primary key eventually stop working. This behaves like a race condition and only reproduces for me when IF clauses fail because of a stale previous value, as in an update race. To force this, I hard-coded my LWT retry logic so that its first UPDATE ... IF attempt always carried an incorrect previous value, making it iterate and retry (see attached code). The second attempt (the retry) used the current value returned by the failed update and would generally win. When this pattern ran under load (JMeter driving a test servlet with many parallel requests), I eventually could not update the row with that primary key at all, even from cqlsh.
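The attached code is not reproduced here, but the retry pattern described above can be sketched as follows. Since a driver call needs a live cluster, `AtomicLong.compareAndSet` stands in for `UPDATE id_pools SET last_value = ? WHERE name = ? IF last_value = ?`; the class and method names are illustrative, not from the attachment:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-in for the LWT retry loop: compareAndSet plays the
// role of the conditional UPDATE, and a failed attempt "returns" the
// current value, which the next attempt retries with.
class IdPoolRetry {
    private final AtomicLong lastValue;

    IdPoolRetry(long initial) {
        this.lastValue = new AtomicLong(initial);
    }

    // Claim the next id: read the current value, then conditionally bump it.
    // On a lost race the "IF" fails and we retry with the value the failed
    // update reported back, mirroring the retry logic in the report.
    long nextId() {
        long expected = lastValue.get();
        while (true) {
            long next = expected + 1;
            if (lastValue.compareAndSet(expected, next)) {
                return next;            // conditional update was applied
            }
            expected = lastValue.get(); // retry with the current value
        }
    }

    long current() {
        return lastValue.get();
    }
}
```

Under contention every thread eventually wins one round, so each id is handed out exactly once; in the bug, the Cassandra-backed version of this loop instead wedges the partition.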
The Java driver complains with the following, which I assume is a red herring:
javax.ejb.EJBException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1 ([/127.0.0.1] Unexpected exception triggered (org.apache.cassandra.transport.messages.ErrorMessage$WrappedException: org.apache.cassandra.transport.ProtocolException: Unknown code 8 for a consistency level)))
Nothing is printed to the Cassandra console except for this:
INFO 16:45:50,645 GC for ParNew: 224 ms for 1 collections, 66767104 used; max is 1046937600
And cqlsh ends up behaving like this in my one-node environment (single keyspace, replication_factor = 1):
cqlsh:formula11> select last_value from id_pools where name='jae';
cqlsh:formula11> update id_pools set last_value=262 where name='jae' if last_value=261;
Request did not complete within rpc_timeout.
It is worth noting that other PKs in this id_pools table continue to function. Please note that we only use these "id pools" where low-volume, strictly ascending ids are required, and use UUIDs for other unique ids.