Description
Motivation
Currently, all implicit read operations start RW transactions, thus it's possible to catch "Failed to acquire a lock due to a conflict" exception. Generally speaking, given issue should be resolved by substituting RW with RO for all implicit read transactions, however such approach will decrease linearizability so it's required to verify it with product management. It's still however possible to have special case RO for implicit single-key get operation that will set readTimestamp on primary replica instead of transaction coordinator and thus provide cluster-wide linearizability even for RO transactions (only for single-key implicit get operations). Within this ticket, such special RO transactions should be introduced along with their usage switch for single-get implicit reads.
Definition of Done
- Implicit single-get operations use special RO transactions that provide cluster-wide linearizability and thus do not throw "Failed to acquire a lock due to a conflict" exception.
- ItAbstractDataStreamerTest#testAutoFlushByTimer adjusted: catch block removed.
Implementation Notes
1. Basically, what we need to do here is to start RO transaction instead of RW one in case of single-key implicit get, thus we should add
if (tx == null) { tx = txManager.begin(true); }
right in front of
return enlistInTx(
Please pay attention, that we want to start special case RO transaction that should go to primary and only primary, so it's not valid to put aforementioned tx = txManager.begin(true); at the very beginning of the method, because in that case balancer may return non-primary through evaluateReadOnlyRecipientNode. Corresponging comment should be added.
2. Such specifal case RO transcation doesn't require readTimestamp calcualtion on tx.start for the evaluation point of view, however it still required it for lowWatermark managerment:
readOnlyTxFutureById.compute(new TxIdAndTimestamp(readTimestamp, txId), (txIdAndTimestamp, readOnlyTxFuture) -> { assert readOnlyTxFuture == null : "previous transaction has not completed yet: " + txIdAndTimestamp; if (lowWatermark != null && readTimestamp.compareTo(lowWatermark) <= 0) { throw new IgniteInternalException( TX_READ_ONLY_TOO_OLD_ERR, "Timestamp read-only transaction must be greater than the low watermark: [txTimestamp={}, lowWatermark={}]", readTimestamp, lowWatermark ); } return new CompletableFuture<>(); });
So, seems that it worth to leave readTimestamp generatoin at it's current place.
3. And again in order to have cluster-wide linearizability it's requried to use primaryReplica now as readTimestamp instead of the one proposed in readOnlyReplicaRequest. Basically that means substitution of
HybridTimestamp readTimestamp = request.readTimestamp();
with
HybridTimestamp readTimestamp; if (request.requestType() == RequestType.RO_GET && request.implicit()) { readTimestamp = hybridClock.now(); } else { readTimestamp = request.readTimestamp(); }
along with
CompletableFuture<Void> safeReadFuture = isPrimaryInTimestamp(isPrimary, readTimestamp) ? completedFuture(null) : safeTime.waitFor(readTimestamp);
in PartitionReplicaListnener. That on its part required adding implicit() to ReadOnlySingleRowReplicaRequest that should be properly set on the client side.
4. That specific operation type should also include a timestamp in the response (using TimestampAware). It is necessary to use the timestmp to adjust the clock on the transaction coordinator (despite the fact that we are talking about a single-get operation, it is a transaction, and the node that invoked the operation is called a transaction coordinator). Then we can use clock.now() to update the observation timestamp.
Attachments
Issue Links
- causes
-
IGNITE-19954 .NET: Thin 3.0: TestAutoFlushFrequency is flaky
- Resolved
- depends upon
-
IGNITE-19227 Wait for schema availability outside JRaft threads
- Resolved
- relates to
-
IGNITE-19889 Implement observable timestamp on server
- Resolved
- links to