Uploaded image for project: 'Jackrabbit Oak'
  1. Jackrabbit Oak
  2. OAK-1639

MarkSweepGarbageCollector improvements

Details

    • Task
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 0.20
    • core
    • None

    Description

      While reviewing the patch for OAK-1582 I stumbled over a few issues with MarkSweepGarbageCollector that need improving. First an foremost MarkSweepGarbageCollector needs better documentation. The current javadoc as for many methods and arguments their semantics and invariants are unclear.

      Furthermore:

      • MarkSweepGarbageCollector#init(): why an init method, and not pass the respective arguments directly to the constructor? Also at when are clients allowed to call init? Can I call it while a a GC cycle is currently taking place?
      • Is there (do we need) a protection for multiple GCs being initiated in parallel?
      • MarkSweepGarbageCollector.Sweeper#run and MarkSweepGarbageCollector.BlobIdRetriever#retrieve catch Exception and e.printStackTrace(). This needs improving.
      • MarkSweepGarbageCollector#sweep catches InterruptedExceptionInterruptedException and {{e.printStackTrace()}. This is wrong as at least the threads interrupted status need to be set.
      • DocumentNodeStore#getReferencedBlobsIterator is passed into MarkSweepGarbageCollector#init) in DocumentNodeStoreService. Won't this iterator be consumed after the first gc run such that any further run won't do anything?

      amitj_76, could you have a look at these points and create sub tasks as required?

      Attachments

        Activity

          DocumentNodeStore#getReferencedBlobsIterator is passed into MarkSweepGarbageCollector#init) in DocumentNodeStoreService. Won't this iterator be consumed after the first gc run such that any further run won't do anything?

          Fixed this with http://svn.apache.org/r1583017. Currently a single MarkSweepGarbageCollector is created an used. However I am not sure if MarkSweepGarbageCollector is thread safe as it has instance state. Probably better way would be to create a new instance for every invocation

          chetanm Chetan Mehrotra added a comment - DocumentNodeStore#getReferencedBlobsIterator is passed into MarkSweepGarbageCollector#init) in DocumentNodeStoreService. Won't this iterator be consumed after the first gc run such that any further run won't do anything? Fixed this with http://svn.apache.org/r1583017 . Currently a single MarkSweepGarbageCollector is created an used. However I am not sure if MarkSweepGarbageCollector is thread safe as it has instance state. Probably better way would be to create a new instance for every invocation
          amitjain Amit Jain added a comment -

          Have added a patch for point 1,3, 4 above.
          Will take the documentation improvement patch separately so, that its easier to review.

          amitjain Amit Jain added a comment - Have added a patch for point 1,3, 4 above. Will take the documentation improvement patch separately so, that its easier to review.
          amitjain Amit Jain added a comment -

          Thanks mduerig and chetanm for the review.

          >> Is there (do we need) a protection for multiple GCs being initiated in parallel?
          MarkSweepGarbageCollector is not thread-safe so multiple invocations in parallel from the same instance would cause problems. Though if multiple instances are used then it shouldn't be a problem.

          amitjain Amit Jain added a comment - Thanks mduerig and chetanm for the review. >> Is there (do we need) a protection for multiple GCs being initiated in parallel? MarkSweepGarbageCollector is not thread-safe so multiple invocations in parallel from the same instance would cause problems. Though if multiple instances are used then it shouldn't be a problem.
          mduerig Michael Dürig added a comment -

          MarkSweepGarbageCollector is not thread-safe

          I think this should go into the class Javadoc.

          mduerig Michael Dürig added a comment - MarkSweepGarbageCollector is not thread-safe I think this should go into the class Javadoc.

          All the issues raised have been fixed. MarkSweepGC now uses a constructor and a new instance is created for every GC call. The exceptions are properly logged and it now uses an external Executor for executing jobs.

          Closing the issue

          chetanm Chetan Mehrotra added a comment - All the issues raised have been fixed. MarkSweepGC now uses a constructor and a new instance is created for every GC call. The exceptions are properly logged and it now uses an external Executor for executing jobs. Closing the issue
          stillalex Alex Deparvu added a comment -

          bulk close for the 0.20.0 release

          stillalex Alex Deparvu added a comment - bulk close for the 0.20.0 release

          People

            chetanm Chetan Mehrotra
            mduerig Michael Dürig
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: