Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- None
- Release Note: Fixed an issue with IgniteAtomicSequence that led to AssertionError.
- Release Notes Required
Description
Using IgniteAtomicSequence can lead to the following AssertionError:
java.lang.AssertionError: null
    at org.apache.ignite.internal.processors.datastructures.GridCacheAtomicSequenceImpl$2.call(GridCacheAtomicSequenceImpl.java:307)
    at org.apache.ignite.internal.processors.datastructures.GridCacheAtomicSequenceImpl$2.call(GridCacheAtomicSequenceImpl.java:298)
    at org.apache.ignite.internal.processors.cache.GridCacheUtils.retryTopologySafe(GridCacheUtils.java:1418)
    at org.apache.ignite.internal.processors.datastructures.GridCacheAtomicSequenceImpl.internalUpdate(GridCacheAtomicSequenceImpl.java:230)
    at org.apache.ignite.internal.processors.datastructures.GridCacheAtomicSequenceImpl.incrementAndGet(GridCacheAtomicSequenceImpl.java:135)
The following code produces the mentioned error:
private Callable<Long> internalUpdate(final long l, final boolean updated) {
    return new Callable<Long>() {
        @Override public Long call() throws Exception {
            assert distUpdateFreeTop.isHeldByCurrentThread() || distUpdateLockedTop.isHeldByCurrentThread();

            try (GridNearTxLocal tx = CU.txStartInternal(ctx, cacheView, PESSIMISTIC, REPEATABLE_READ)) {
                GridCacheAtomicSequenceValue seq = cacheView.get(key);

                checkRemoved();

                assert seq != null; // <-- This assert can trigger the error in case the partition loss policy is IGNORE and the corresponding partition has been lost.
The root cause is that the partition loss policy for in-memory caches is IGNORE. As a result, the following read can return null without throwing any exception, which triggers the AssertionError above:
try (GridNearTxLocal tx = CU.txStartInternal(ctx, cacheView, PESSIMISTIC, REPEATABLE_READ)) {
GridCacheAtomicSequenceValue seq = cacheView.get(key);
A possible workaround is to set a reasonable number of backups in AtomicConfiguration. Monitoring for lost partitions is also advisable.
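The workaround could be configured roughly as follows. This is a minimal sketch, assuming ignite-core is on the classpath and a node can be started locally; the backup count of 2, the class name AtomicBackupsExample, and the sequence name "orderIds" are illustrative choices, not values taken from this issue.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicSequence;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.AtomicConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class AtomicBackupsExample {
    public static void main(String[] args) {
        // Keep extra copies of atomic data structures so that losing a single
        // node does not lose the partition that holds the sequence value.
        AtomicConfiguration atomicCfg = new AtomicConfiguration();
        atomicCfg.setBackups(2); // illustrative value, tune for the cluster size

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setAtomicConfiguration(atomicCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // "orderIds" is a hypothetical sequence name; create it if absent.
            IgniteAtomicSequence seq = ignite.atomicSequence("orderIds", 0, true);

            System.out.println(seq.incrementAndGet());
        }
    }
}
```

Backups only reduce the chance of losing the partition. If the owning node and all backup nodes leave at the same time, the read can still return null.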
The proposed fix is straightforward: replace the assertion assert seq != null; with an explicit check that throws a suitable exception. This would let the user detect the problem and, for example, re-create the sequence.
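The idea of replacing the assertion with an explicit check can be illustrated outside of Ignite. The sketch below is not the actual patch: it uses a plain HashMap as a stand-in for cacheView, and the names SequenceNullCheckSketch and SequenceLostException are hypothetical. It only shows the pattern of failing with a descriptive exception, rather than an AssertionError, when the read returns null.

```java
import java.util.HashMap;
import java.util.Map;

class SequenceNullCheckSketch {
    /** Hypothetical exception type; the real fix may throw a different one. */
    static class SequenceLostException extends IllegalStateException {
        SequenceLostException(String msg) {
            super(msg);
        }
    }

    /** Stand-in for the cache view that stores the sequence value (single-threaded sketch). */
    static final Map<String, Long> cacheView = new HashMap<>();

    static long incrementAndGet(String key) {
        Long seq = cacheView.get(key);

        // Explicit check instead of "assert seq != null": it also works when
        // assertions are disabled and tells the caller what went wrong.
        if (seq == null)
            throw new SequenceLostException("Sequence value is missing (partition may have been lost): " + key);

        long next = seq + 1;

        cacheView.put(key, next);

        return next;
    }

    public static void main(String[] args) {
        cacheView.put("seq", 0L);
        System.out.println(incrementAndGet("seq"));

        // Simulate a lost partition: the value silently disappears.
        cacheView.remove("seq");

        try {
            incrementAndGet("seq");
        }
        catch (SequenceLostException e) {
            // The caller can detect the loss and re-create the sequence.
            cacheView.put("seq", 0L);
            System.out.println("Re-created sequence after: " + e.getMessage());
        }
    }
}
```

With a dedicated exception the caller can distinguish a lost sequence from other failures and decide whether to re-create it or propagate the error.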