Jackrabbit Oak / OAK-9754

Increase default dump threshold for multithreaded download

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: indexing
    • Labels: None

    Description

      Looking at the detailed log output of an indexing job that uses Oak's multi-threaded download strategy, lots of small files are being created because the dump threshold is low: 1 MB per file. See https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/FlatFileNodeStoreBuilder.java#L91

      We should increase the threshold, if possible, to 16 MB. That gives 16 MB per thread; with 8 threads that is 128 MB in total. This would (hopefully) reduce the number of files from 22'972 to about 1'435, which is more reasonable. Also, I don't think it would bring any risk of running out of memory.
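
      The arithmetic behind these numbers can be checked with a small sketch. It is not Oak code: the function name `estimate` is hypothetical, and it assumes that each of the 22'972 files holds roughly 1 MB and that every thread buffers at most one threshold's worth of data before dumping. Rounding up gives 1'436 files, one more than the rounded-down figure quoted above.

```python
MB = 1024 * 1024

def estimate(total_bytes, threshold_bytes, threads):
    """Return (file_count, peak_buffered_bytes) for a given dump threshold.

    Assumption: a file is written each time a thread has buffered
    `threshold_bytes`, so the file count is the total size divided by
    the threshold, rounded up.
    """
    files = -(-total_bytes // threshold_bytes)  # integer ceiling division
    peak = threshold_bytes * threads            # all threads full at once
    return files, peak

# Assumption: 22'972 files of about 1 MB each were dumped in total.
total = 22_972 * MB

files_1mb, peak_1mb = estimate(total, 1 * MB, threads=8)
files_16mb, peak_16mb = estimate(total, 16 * MB, threads=8)

print(f"1 MB threshold:  {files_1mb} files, {peak_1mb // MB} MB buffered")
print(f"16 MB threshold: {files_16mb} files, {peak_16mb // MB} MB buffered")
```

      Under these assumptions the worst-case buffered data grows from 8 MB to 128 MB, which is where the 128 MB figure comes from.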

            People

              Assignee: Unassigned
              Reporter: Yu-An Lin (chockerlin@gmail.com)
              Votes: 0
              Watchers: 1
