Jackrabbit Oak / OAK-5192

Reduce Lucene related growth of repository size


    Details

      Description

      I observed Lucene indexing contributing up to 99% of repository growth. The size of the index itself is well within reasonable bounds, but the overall turnover of data that is written and then removed again can be as much as 99%.

      In the case of the TarMK, this negatively impacts overall system performance because of a fast-growing number of tar files and segments, poor locality of reference, cache misses and thrashing when looking up segments, and vastly prolonged garbage collection cycles.
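
      To illustrate the difference between index size and write turnover, here is a minimal sketch of how the share of added bytes attributable to the index could be computed. It is not the analysis behind the attached files. The input format (a mapping from path to bytes added), the function name `turnover_share`, the `/oak:index` prefix used as a filter, and the sample numbers are all assumptions made up for illustration.

```python
def turnover_share(added_bytes_by_path, index_prefix="/oak:index"):
    """Return the fraction of all added bytes written under index_prefix.

    added_bytes_by_path is a hypothetical mapping of repository path to the
    number of bytes added under that path during the observation window.
    """
    total = sum(added_bytes_by_path.values())
    if total == 0:
        # Nothing was written, so no share can be attributed to the index.
        return 0.0
    index_bytes = sum(
        size
        for path, size in added_bytes_by_path.items()
        if path == index_prefix or path.startswith(index_prefix + "/")
    )
    return index_bytes / total


# Made-up numbers, chosen only to illustrate the calculation.
sample = {
    "/oak:index/lucene": 990,
    "/content/site": 10,
}
share = turnover_share(sample)
```

      The point of the sketch is that this ratio measures bytes written over time, not the size of the index at rest, so a small index that is rewritten frequently can still dominate the ratio.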

        Attachments

        1. added-bytes-zoom.png
          184 kB
          Michael Dürig
        2. diff.txt.zip
          2.42 MB
          Michael Dürig
        3. binSize100.txt
          6 kB
          Michael Dürig
        4. binSize16384.txt
          7 kB
          Michael Dürig
        5. binSizeTotal.txt
          8 kB
          Michael Dürig
        6. nonBinSizeTotal.txt
          6 kB
          Michael Dürig
        7. OAK-5192.0.patch
          10 kB
          Tommaso Teofili
        8. Screen Shot 2017-07-03 at 16.50.00.png
          107 kB
          Tommaso Teofili


            People

            • Assignee: Tommaso Teofili (teofili)
            • Reporter: Michael Dürig (mduerig)
