IGNITE-10187

Partition data can be lost after recovery from WAL if no data was ever checkpointed.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: persistence
    • Labels:
      None

      Description

      Steps to reproduce:
      1. Start a node.

      2. Disable checkpoints.

      3. Put some data.

      4. Flush WAL.

      5. Restart node.

      6. The next put sporadically hangs forever, waiting for a next topology that never arrives.
      The hang is caused by a ClusterTopologyException thrown because the partition is in the MOVING state, although it is expected to be in the OWNING state.

      The root cause is that the partition does not restore its OWNING state after recovery from WAL, because it was never checkpointed or contained no data when the checkpoint occurred (the partition was in its initial state).

       

      It seems that forcing a checkpoint before disabling checkpoints resolves the issue. See CacheMvccTxFailoverTest.testSingleNodeTxMissedCommitNoCheckpoint().
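
      The root cause and the workaround described above can be illustrated with a small self-contained model. This is only a sketch and not Ignite code: the class PartitionRecoveryModel and the types Disk, Node and PartState are hypothetical, and it assumes the simplest reading of the report, namely that data is replayed from the WAL while the partition state is restored only from a checkpoint. The hang is replaced by an exception for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the reported behaviour. NOT Ignite code; every name here is hypothetical. */
class PartitionRecoveryModel {
    enum PartState { MOVING, OWNING }

    /** What survives a restart: flushed WAL records and the last checkpointed partition state. */
    static final class Disk {
        final List<String> wal = new ArrayList<>();
        PartState checkpointedState = null; // null means no checkpoint was ever taken
    }

    static final class Node {
        private final Disk disk;
        private final List<String> data = new ArrayList<>();
        private PartState state;
        private boolean checkpointsEnabled = true;

        private Node(Disk disk, PartState state) {
            this.disk = disk;
            this.state = state;
        }

        /** First start: assume the single node becomes the owner of the partition. */
        static Node startFresh(Disk disk) {
            return new Node(disk, PartState.OWNING);
        }

        /** Restart: data is replayed from WAL, but the state comes only from a checkpoint. */
        static Node restart(Disk disk) {
            PartState restored =
                disk.checkpointedState != null ? disk.checkpointedState : PartState.MOVING;
            Node node = new Node(disk, restored);
            node.data.addAll(disk.wal);
            return node;
        }

        void forceCheckpoint() { disk.checkpointedState = state; }

        void disableCheckpoints() { checkpointsEnabled = false; }

        /** Stand-in for the periodic checkpoint; does nothing once checkpoints are disabled. */
        void backgroundCheckpoint() {
            if (checkpointsEnabled)
                forceCheckpoint();
        }

        /** Writes go to memory and to the WAL (the model treats the WAL as flushed at once). */
        void put(String value) {
            if (state != PartState.OWNING)
                throw new IllegalStateException("partition is " + state); // the real put hangs
            data.add(value);
            disk.wal.add(value);
        }
    }

    /** Runs the reproduction steps and returns the partition state seen after the restart. */
    static PartState stateAfterRestart(boolean forceCheckpointFirst) {
        Disk disk = new Disk();
        Node node = Node.startFresh(disk);   // 1. start a node
        if (forceCheckpointFirst)
            node.forceCheckpoint();          // workaround: checkpoint before disabling
        node.disableCheckpoints();           // 2. disable checkpoints
        node.put("k1");                      // 3-4. put data, WAL is flushed
        node.backgroundCheckpoint();         // no-op, checkpoints are disabled
        return Node.restart(disk).state;     // 5. restart node
    }

    public static void main(String[] args) {
        System.out.println("no checkpoint:     " + stateAfterRestart(false));
        System.out.println("forced checkpoint: " + stateAfterRestart(true));
    }
}
```

      In this model the partition comes back as MOVING when no checkpoint was ever taken and as OWNING when a checkpoint is forced first, which matches the observation above. Whether the real recovery path behaves exactly this way is an assumption of the sketch.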

       

            People

            • Assignee:
              Unassigned
            • Reporter:
              Andrey Mashenkov (amashenkov)