Ignite / IGNITE-13877

Error restarting the node with switching from disabled WAL archiving to enabled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.11
    • Component/s: persistence
    • Labels:
      None
    • Release Note:
      Safe restart of node with switching from disabled to enabled WAL archiving.
    • Ignite Flags:
      Release Notes Required

      Description

      If a user starts a node with WAL archiving disabled, writes enough data that more than DataStorageConfiguration#walSegments segments are created, and then restarts the node with WAL archiving enabled, the node fails to start with the following error:

      SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=11, fileOff=15864934, len=21409], walPath=db/wal, walArchive=db/wal/archive]]]
      class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=11, fileOff=15864934, len=21409], walPath=db/wal, walArchive=db/wal/archive]
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.performBinaryMemoryRestore(GridCacheDatabaseSharedManager.java:2324)
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:799)
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:3523)
      	at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1206)
      	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2089)
      	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1758)
      	at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1147)
      	at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1065)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:951)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:850)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:720)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:689)
      	at org.apache.ignite.Ignition.start(Ignition.java:344)
      

      Until this is fixed, the user can be offered the following workaround:
      Move all WAL segments except the last one, unchanged, into the WAL archive directory (the path must include the consistentId directory). Rename the last segment so that its index becomes index % DataStorageConfiguration#walSegments.
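
The workaround above can be sketched with plain java.nio file operations. This is only an illustration, not the code of the actual fix: the class and method names are hypothetical, the segment file name format (16-digit zero-padded index plus ".wal") is an assumption, and both directories passed in are assumed to already include the consistentId directory. The node must be stopped while the files are moved.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper that applies the manual workaround; not part of Ignite. */
final class WalArchiveMigrationSketch {
    /** Assumed segment file name format: 16-digit zero-padded index plus ".wal". */
    static String segmentFileName(long idx) {
        return String.format("%016d.wal", idx);
    }

    /** Parses the segment index back out of a file name in the assumed format. */
    static long segmentIndex(Path file) {
        String name = file.getFileName().toString();
        return Long.parseLong(name.substring(0, name.length() - ".wal".length()));
    }

    /**
     * Moves every segment except the last one into archiveDir unchanged,
     * then renames the last segment to (index % walSegments) inside walDir.
     */
    static void migrate(Path walDir, Path archiveDir, int walSegments) throws IOException {
        List<Long> idxs = new ArrayList<>();

        // Collect the indexes of segment files only; subdirectories are skipped by the name filter.
        try (Stream<Path> files = Files.list(walDir)) {
            files.filter(p -> p.getFileName().toString().matches("\\d{16}\\.wal"))
                 .forEach(p -> idxs.add(segmentIndex(p)));
        }

        if (idxs.isEmpty())
            return;

        Collections.sort(idxs);
        Files.createDirectories(archiveDir);

        long last = idxs.get(idxs.size() - 1);

        // All segments except the last one go to the archive under their original names.
        for (long idx : idxs) {
            if (idx != last)
                Files.move(walDir.resolve(segmentFileName(idx)), archiveDir.resolve(segmentFileName(idx)));
        }

        // The last segment stays in the work directory under its relative index.
        long workIdx = last % walSegments;

        if (workIdx != last)
            Files.move(walDir.resolve(segmentFileName(last)), walDir.resolve(segmentFileName(workIdx)));
    }
}
```

With the values from the error above (last segment idx=11) and assuming walSegments is 10, segments 0 through 10 would be moved to the archive and segment 11 would be renamed to index 1.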

      The described workaround should be performed automatically, without user intervention.

        Attachments

        1. Ignite13877Test.java
          5 kB
          Kirill Tkalenko

              People

              • Assignee:
                ktkalenko@gridgain.com Kirill Tkalenko
                Reporter:
                ktkalenko@gridgain.com Kirill Tkalenko
                Reviewer:
                Sergey Chugunov
              • Votes:
                0
                Watchers:
                4

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  1h 10m