Ignite / IGNITE-13877

Error restarting the node with switching from disabled WAL archiving to enabled

Details

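    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.11
    • Component/s: persistence
    • Labels: None
    • Release Note: Safe restart of a node when switching WAL archiving from disabled to enabled.
    • Ignite Flags: Release Notes Required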

    Description

      If a user starts a node with WAL archiving disabled, loads enough data that more than DataStorageConfiguration#walSegments segments are written, and then restarts the node with WAL archiving enabled, the restart fails with the following error:

      SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=11, fileOff=15864934, len=21409], walPath=db/wal, walArchive=db/wal/archive]]]
      class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=FileWALPointer [idx=11, fileOff=15864934, len=21409], walPath=db/wal, walArchive=db/wal/archive]
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.performBinaryMemoryRestore(GridCacheDatabaseSharedManager.java:2324)
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:799)
      	at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:3523)
      	at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1206)
      	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2089)
      	at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1758)
      	at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1147)
      	at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1065)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:951)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:850)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:720)
      	at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:689)
      	at org.apache.ignite.Ignition.start(Ignition.java:344)
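
      For reference, the two configurations involved in the scenario might look like the sketch below. It assumes the documented convention that WAL archiving is disabled when the WAL archive path equals the WAL path; the paths mirror the ones in the error message, and the class name, helper method, and segment count are illustrative assumptions rather than part of the attached test.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

/** Hypothetical helper: builds a node configuration with WAL archiving on or off. */
public class WalArchiveSwitchSketch {
    static IgniteConfiguration config(boolean archivingEnabled) {
        DataStorageConfiguration dsCfg = new DataStorageConfiguration()
            .setWalSegments(10) // number of segments in the WAL work directory (assumed value)
            .setWalPath("db/wal")
            // Same path as the WAL path means archiving is disabled.
            .setWalArchivePath(archivingEnabled ? "db/wal/archive" : "db/wal");

        // The problem only concerns persistent data regions.
        dsCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        return new IgniteConfiguration().setDataStorageConfiguration(dsCfg);
    }
}
```

      Starting a node with config(false), writing more than walSegments segments, stopping it, and starting it again with config(true) is the sequence described above.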
      

      At this point, the user can be offered the following workaround:
      Move all segments except the last one, unchanged, to the WAL archive directory (including the consistentId directory). Rename the last segment so that its index becomes index % DataStorageConfiguration#walSegments.

      The described workaround should be applied automatically, without user intervention.
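
      The manual workaround can be sketched with plain java.nio as shown below. This is an illustration under stated assumptions, not the fix that was committed: it assumes segment files are named as a 16-digit zero-padded index followed by ".wal", that both directories already point at the node's consistentId subfolder, and that the node is stopped. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

/** Hypothetical sketch of the manual workaround; run only while the node is stopped. */
final class WalSegmentsWorkaroundSketch {
    /** Assumed file name format: 16-digit zero-padded segment index plus ".wal". */
    public static String segmentName(long idx) {
        return String.format(Locale.ROOT, "%016d.wal", idx);
    }

    /** Parses the segment index back out of a file name in the assumed format. */
    public static long segmentIndex(Path file) {
        String name = file.getFileName().toString();
        return Long.parseLong(name.substring(0, name.length() - ".wal".length()));
    }

    /**
     * Moves every segment except the last one into the archive directory unchanged,
     * then renames the last segment to (index % walSegments).
     *
     * @param walNodeDir WAL directory including the consistentId subfolder.
     * @param archiveNodeDir WAL archive directory including the consistentId subfolder.
     * @param walSegments Value of DataStorageConfiguration#walSegments.
     */
    public static void moveSegments(Path walNodeDir, Path archiveNodeDir, int walSegments)
        throws IOException {
        List<Path> segments = new ArrayList<>();

        try (Stream<Path> files = Files.list(walNodeDir)) {
            files.filter(p -> p.getFileName().toString().matches("\\d{16}\\.wal"))
                .forEach(segments::add);
        }

        if (segments.isEmpty())
            return;

        segments.sort(Comparator.comparingLong(WalSegmentsWorkaroundSketch::segmentIndex));

        Files.createDirectories(archiveNodeDir);

        // The segment with the highest index stays in the work directory.
        Path last = segments.remove(segments.size() - 1);

        for (Path seg : segments)
            Files.move(seg, archiveNodeDir.resolve(seg.getFileName()));

        // Earlier segments are already gone, so the new name cannot collide.
        Path renamed = walNodeDir.resolve(segmentName(segmentIndex(last) % walSegments));

        if (!renamed.equals(last))
            Files.move(last, renamed);
    }
}
```

      With the pointer from the error above (idx=11) and walSegments set to 10, segments 0 through 10 would end up in the archive and segment 11 would be renamed to index 1 in the work directory.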

      Attachments

        1. Ignite13877Test.java
          5 kB
          Kirill Tkalenko

            People

              ktkalenko@gridgain.com Kirill Tkalenko
              ktkalenko@gridgain.com Kirill Tkalenko
              Sergey Chugunov
                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m