[IGNITE-19824] Implicit RO should be used in implicit single gets - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.0
Component/s: None
Labels:
- ignite-3

Description

Motivation

Currently, all implicit read operations start RW transactions, thus it's possible to catch "Failed to acquire a lock due to a conflict" exception. Generally speaking, given issue should be resolved by substituting RW with RO for all implicit read transactions, however such approach will decrease linearizability so it's required to verify it with product management. It's still however possible to have special case RO for implicit single-key get operation that will set readTimestamp on primary replica instead of transaction coordinator and thus provide cluster-wide linearizability even for RO transactions (only for single-key implicit get operations). Within this ticket, such special RO transactions should be introduced along with their usage switch for single-get implicit reads.

Definition of Done

Implicit single-get operations use special RO transactions that provide cluster-wide linearizability and thus do not throw "Failed to acquire a lock due to a conflict" exception.
ItAbstractDataStreamerTest#testAutoFlushByTimer adjusted: catch block removed.

Implementation Notes

1. Basically, what we need to do here is to start RO transaction instead of RW one in case of single-key implicit get, thus we should add

if (tx == null) {
    tx = txManager.begin(true);
}

right in front of

return enlistInTx(

Please pay attention, that we want to start special case RO transaction that should go to primary and only primary, so it's not valid to put aforementioned tx = txManager.begin(true); at the very beginning of the method, because in that case balancer may return non-primary through evaluateReadOnlyRecipientNode. Corresponging comment should be added.

2. Such specifal case RO transcation doesn't require readTimestamp calcualtion on tx.start for the evaluation point of view, however it still required it for lowWatermark managerment:

readOnlyTxFutureById.compute(new TxIdAndTimestamp(readTimestamp, txId), (txIdAndTimestamp, readOnlyTxFuture) -> {
    assert readOnlyTxFuture == null : "previous transaction has not completed yet: " + txIdAndTimestamp;

    if (lowWatermark != null && readTimestamp.compareTo(lowWatermark) <= 0) {
        throw new IgniteInternalException(
                TX_READ_ONLY_TOO_OLD_ERR,
                "Timestamp read-only transaction must be greater than the low watermark: [txTimestamp={}, lowWatermark={}]",
                readTimestamp, lowWatermark
        );
    }

    return new CompletableFuture<>();
});

So, seems that it worth to leave readTimestamp generatoin at it's current place.

3. And again in order to have cluster-wide linearizability it's requried to use primaryReplica now as readTimestamp instead of the one proposed in readOnlyReplicaRequest. Basically that means substitution of

HybridTimestamp readTimestamp = request.readTimestamp();

with

HybridTimestamp readTimestamp;

if (request.requestType() == RequestType.RO_GET && request.implicit()) {
    readTimestamp = hybridClock.now();
} else {
    readTimestamp = request.readTimestamp();
}

along with

CompletableFuture<Void> safeReadFuture = isPrimaryInTimestamp(isPrimary, readTimestamp) ? completedFuture(null)
        : safeTime.waitFor(readTimestamp);

in PartitionReplicaListnener. That on its part required adding implicit() to ReadOnlySingleRowReplicaRequest that should be properly set on the client side.

4. That specific operation type should also include a timestamp in the response (using TimestampAware). It is necessary to use the timestmp to adjust the clock on the transaction coordinator (despite the fact that we are talking about a single-get operation, it is a transaction, and the node that invoked the operation is called a transaction coordinator). Then we can use clock.now() to update the observation timestamp.

Attachments

Issue Links

causes

IGNITE-19954 .NET: Thin 3.0: TestAutoFlushFrequency is flaky

Resolved

depends upon

IGNITE-19227 Wait for schema availability outside JRaft threads

Resolved

relates to

IGNITE-19889 Implement observable timestamp on server

Resolved

links to

GitHub Pull Request #2641

Activity

People

Assignee:: Vladislav Pyatkov

Reporter:: Alexander Lapin

Reviewer:: Alexander Lapin

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 23/Jun/23 12:15

Updated:: 04/Sep/24 16:10

Resolved:: 05/Oct/23 08:20

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

3.5h