  1. Jackrabbit Oak
  2. OAK-4826

Auto removal of orphaned checkpoints


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.8, 1.5.11, 1.6.0
    • Component/s: core
    • Labels: None

    Description

      Currently, if a running system has orphaned checkpoints, they prevent revision GC (compaction, in the segment store) from being effective.

      So far the practice has been to clean them up manually with the oak-run checkpoints rm-unreferenced command. Cleanup was kept manual because it was not possible to determine whether a given checkpoint is still in use. rm-unreferenced assumes that checkpoints are only created by AsyncIndexUpdate, so it can tell whether a checkpoint is in use by cross-checking it against the :async state. Running it automatically is risky because the checkpoint API can be used by any module.

      With OAK-2314 we also record some metadata with each checkpoint, such as creator and name. This can be used for automatic cleanup. For example, a running system lists the following checkpoints:

      Mon Sep 19 18:02:09 EDT 2016	Sun Jun 16 18:02:09 EDT 2019	r15744787d0a-1-1	
       
      creator=AsyncIndexUpdate
      name=fulltext-async
      thread=sling-default-4070-Registered Service.653
       
      Mon Sep 19 18:02:09 EDT 2016	Sun Jun 16 18:02:09 EDT 2019	r15744787d0a-0-1	
       
      creator=AsyncIndexUpdate
      name=async
      thread=sling-default-4072-Registered Service.656
       
      Fri Aug 19 18:57:33 EDT 2016	Thu May 16 18:57:33 EDT 2019	r156a50612e1-1-1	
       
      creator=AsyncIndexUpdate
      name=async
      thread=sling-default-10-Registered Service.654
       
      Wed Aug 10 12:13:20 EDT 2016	Tue May 07 12:25:52 EDT 2019	r156753ac38d-0-1	
       
      creator=AsyncIndexUpdate
      name=async
      thread=sling-default-6041-Registered Service.1966
      

      As can be seen, the last two checkpoints are orphaned: both are older entries for creator=AsyncIndexUpdate, name=async, and they would prevent revision GC. For auto mode we can use the following heuristic:

      1. List all current checkpoints.
      2. Keep only the latest checkpoint for each creator and name combination. Older entries (by creation time) for the same pair can be considered orphaned and deleted.

      This logic can be implemented in org.apache.jackrabbit.oak.checkpoint.Checkpoints and invoked by the revision GC logic (in both DocumentNodeStore and SegmentNodeStore) to determine the base revision to keep.
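
      The heuristic described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the class OrphanedCheckpoints, the value type CheckpointInfo and the method findOrphans are hypothetical names and are not part of the Oak API. Checkpoints without creator or name metadata are never reported, since their owner cannot be determined, and checkpoints with the same creation time as the newest one are retained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Oak: illustrates the "keep only the
// latest checkpoint per creator/name pair" heuristic from the description.
final class OrphanedCheckpoints {

    // Hypothetical value type holding the metadata shown in the listing.
    static final class CheckpointInfo {
        final String id;
        final long created;   // creation time in milliseconds since the epoch
        final String creator; // null if the checkpoint carries no metadata
        final String name;    // null if the checkpoint carries no metadata

        CheckpointInfo(String id, long created, String creator, String name) {
            this.id = id;
            this.created = created;
            this.creator = creator;
            this.name = name;
        }
    }

    private static String key(CheckpointInfo cp) {
        // The newline separator avoids collisions such as ("a", "b-c") vs ("a-b", "c").
        return cp.creator + "\n" + cp.name;
    }

    // Returns the ids of checkpoints that are older than the latest
    // checkpoint with the same creator and name.
    static List<String> findOrphans(List<CheckpointInfo> checkpoints) {
        // Step 1: find the latest checkpoint for each creator/name pair.
        Map<String, CheckpointInfo> latest = new HashMap<>();
        for (CheckpointInfo cp : checkpoints) {
            if (cp.creator == null || cp.name == null) {
                continue; // unknown owner: never treated as orphaned
            }
            CheckpointInfo seen = latest.get(key(cp));
            if (seen == null || cp.created > seen.created) {
                latest.put(key(cp), cp);
            }
        }
        // Step 2: everything strictly older than the latest entry is an orphan.
        List<String> orphans = new ArrayList<>();
        for (CheckpointInfo cp : checkpoints) {
            if (cp.creator == null || cp.name == null) {
                continue;
            }
            if (cp.created < latest.get(key(cp)).created) {
                orphans.add(cp.id);
            }
        }
        return orphans;
    }
}
```

      Applied to the four checkpoints listed above, this sketch would report r156a50612e1-1-1 and r156753ac38d-0-1 as orphaned and retain the two checkpoints created on Sep 19. Skipping entries without metadata is a deliberately conservative choice, because the checkpoint API can be used by any module.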

      Attachments

        1. OAK-4826.patch
          9 kB
          Marcel Reutegger
        2. OAK-4826.patch
          7 kB
          Marcel Reutegger
        3. OAK-4826.patch
          6 kB
          Marcel Reutegger


          People

            Assignee: Marcel Reutegger (mreutegg)
            Reporter: Chetan Mehrotra (chetanm)
            Votes: 0
            Watchers: 5
