  1. Jackrabbit Oak
  2. OAK-2808

Active deletion of 'deleted' Lucene index files from DataStore without relying on full scale Blob GC

Details

    Description

      Now that Lucene index files are stored in the DataStore, the usage
      pattern of the DataStore has changed between JR2 and Oak.

      With JR2, writes were mostly application driven, i.e. if the
      application stored a PDF or image file, that file would be stored in
      the DataStore. JR2 itself would not, by default, write anything to
      the DataStore. Further, in deployments with a large amount of binary
      content, systems tend to share the DataStore to avoid duplicating
      storage. In such cases running Blob GC is a non-trivial task, as it
      involves a manual step and coordination across multiple deployments.
      Because of this, systems tend to run GC less frequently.

      Now with Oak, apart from the application, the Oak system itself
      actively uses the DataStore to store the Lucene index files, and
      there the churn might be much higher, i.e. index files are created
      and deleted far more frequently. This would accelerate the rate of
      garbage generation and thus put a lot more pressure on the DataStore
      storage requirements.

      Discussion thread http://markmail.org/thread/iybd3eq2bh372zrl

      Attachments

        1. OAK-2808-1.patch
          10 kB
          Thomas Mueller
        2. copyonread-stats.png
          27 kB
          Chetan Mehrotra


            People

              catholicon Vikas Saurabh
              chetanm Chetan Mehrotra
