Ignite / IGNITE-8066

Reset wal segment idx

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.5
    • Component/s: general
    • Labels: None

    Description

      1) On activation, the grid reads the checkpoint status with WAL segment idx=7742:

      2018-03-21 02:34:04.465[INFO ]exchange-worker-#152%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Successfully activated caches [nodeId=9c0c2e76-fb7f-46df-8b0b-3379d0c91db9, client=false, topVer=AffinityTopologyVersion [topVer=161, minorTopVer=1]]
      2018-03-21 02:34:04.479[INFO ]exchange-worker-#152%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture] Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=161, minorTopVer=1], waitTime=0ms, futInfo=NA]
      2018-03-21 02:34:04.487[INFO ]exchange-worker-#152%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Read checkpoint status [startMarker=/gridgain/ssd/data/10_126_1_172_47500/cp/1521587060132-aafbf88b-f783-40e8-8e3c-ef60cd383e21-START.bin, endMarker=/gridgain/ssd/data/10_126_1_172_47500/cp/1521587060132-aafbf88b-f783-40e8-8e3c-ef60cd383e21-END.bin]
      2018-03-21 02:34:04.488[INFO ]exchange-worker-#152%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=7742, fileOff=1041057120, len=1470746], lastCheckpointId=aafbf88b-f783-40e8-8e3c-ef60cd383e21]
      

      2) But right after that (with only two metrics messages in the log in between), it writes a checkpoint with WAL segment idx=0:

      2018-03-21 02:35:21.875[INFO ]exchange-worker-#152%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Finished applying WAL changes [updatesApplied=0, time=77388ms]
      2018-03-21 02:35:22.386[INFO ]db-checkpoint-thread-#243%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=8cf946e6-a718-4388-8bef-c76bf79d93cd, startPtr=FileWALPointer [idx=0, fileOff=77196029, len=450864], checkpointLockWait=0ms, checkpointLockHoldTime=422ms, pages=16379, reason='node started']
      2018-03-21 02:35:25.934[INFO ]db-checkpoint-thread-#243%DPL_GRID%DplGridNodeName%[o.a.i.i.p.c.p.GridCacheDatabaseSharedManager] Checkpoint finished [cpId=8cf946e6-a718-4388-8bef-c76bf79d93cd, pages=16379, markPos=FileWALPointer [idx=0, fileOff=77196029, len=450864], walSegmentsCleared=0, markDuration=508ms, pagesWrite=155ms, fsync=3391ms, total=4054ms]
      

      Then we get an AssertionError while trying to archive WAL segment 0 when lastArchivedIdx=7742.
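
      The symptom above (a WAL pointer index dropping from 7742 to 0 between consecutive log records) can be spotted mechanically in a node log. The following is a minimal sketch, not Ignite code: the class WalIdxRegressionCheck and its methods are hypothetical names introduced for illustration, and the only assumption about the log format is the "FileWALPointer [idx=N" fragment visible in the excerpts above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ignite): flags WAL segment indexes that go backwards in a log. */
public class WalIdxRegressionCheck {
    /** Matches the "FileWALPointer [idx=N" fragment as it appears in the log excerpts. */
    private static final Pattern WAL_PTR = Pattern.compile("FileWALPointer \\[idx=(\\d+)");

    /** Returns the segment index of the first WAL pointer in the line, or -1 if the line has none. */
    public static long extractIdx(String logLine) {
        Matcher m = WAL_PTR.matcher(logLine);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Returns one message for every WAL pointer whose index is lower than the highest index seen so far. */
    public static List<String> findRegressions(List<String> logLines) {
        List<String> problems = new ArrayList<>();
        long maxSeen = -1L;

        for (String line : logLines) {
            long idx = extractIdx(line);

            if (idx < 0)
                continue; // Line carries no WAL pointer.

            if (idx < maxSeen)
                problems.add("WAL segment idx went back from " + maxSeen + " to " + idx);

            maxSeen = Math.max(maxSeen, idx);
        }

        return problems;
    }

    public static void main(String[] args) {
        // Shortened versions of the two log records quoted in the description.
        List<String> lines = Arrays.asList(
            "Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=7742, fileOff=1041057120, len=1470746]]",
            "Checkpoint started [startPtr=FileWALPointer [idx=0, fileOff=77196029, len=450864]]");

        // Prints: WAL segment idx went back from 7742 to 0
        findRegressions(lines).forEach(System.out::println);
    }
}
```

      The check only compares indexes in the order the records appear, so it says nothing about why the index was reset; it merely locates the first record where the regression shows up.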

      Attachments

        1. tc.png
          118 kB
          Evgeny Stanilovsky

            People

              Assignee: zstan Evgeny Stanilovsky
              Reporter: sbberkov Alexander Belyak
              Votes: 0
              Watchers: 5