  1. Jackrabbit Oak
  2. OAK-7285

Reindexing using --doc-traversal-mode can OOM during aggregation in some cases

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.0
    • Fix Version/s: 1.9.0, 1.10.0, 1.8.3
    • Component/s: lucene, mongomk
    • Labels:
      None

      Description

      --doc-traversal-mode works on the notion of preferred children, which are computed from the path fragments that form aggregate rules.
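
As a rough illustration of that computation, the sketch below collects the individual path elements of a few relative aggregate paths into a set of preferred child names. The class name, the method name, and the assumption that a "path fragment" is simply one element of a relative path are all hypothetical; this is not Oak's actual code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Oak.
final class PreferredChildren {
    // Collects every non-empty path element found in the given relative
    // aggregate paths (assumption: each element is one "path fragment").
    static Set<String> fromAggregatePaths(List<String> relativePaths) {
        Set<String> preferred = new LinkedHashSet<>();
        for (String path : relativePaths) {
            for (String element : path.split("/")) {
                if (!element.isEmpty()) {
                    preferred.add(element);
                }
            }
        }
        return preferred;
    }
}
```

Under these assumptions, the aggregate paths jcr:content and jcr:content/metadata yield the preferred list jcr:content, metadata.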

      The idea is that reading through aggregated paths should avoid keeping in memory nodes that are not useful for the path currently being indexed.

      Currently, however, when there are multiple preferred children (say jcr:content and metadata), an index definition indexing the parent of a very deep tree root would try to read in the whole tree before concluding that the parent doesn't have the preferred children.

      E.g., with the preferred list jcr:content and metadata, and an index looking for jcr:content while indexing the following structure:

      + /path/being/indexed
         + very
            + very
            + very
                 + deep
                 + tree
      + /some-sibling
      

      Currently, while looking for jcr:content, the code concludes that it doesn't exist only after reaching /some-sibling (or once the number of children read of /path/being/indexed is >= num_preferred_children).
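
The behaviour described above can be mimicked with a minimal sketch. It assumes the traversal yields paths in depth-first order and only counts how many entries are read before the lookup gives up; the real code would also be holding those entries in memory. The class, method, and parameter names are hypothetical, and this does not reproduce Oak's implementation or the eventual fix.

```java
import java.util.List;

// Hypothetical simulation of the lookup described in this issue, not Oak code.
final class PreferredChildLookup {
    // Returns how many entries following `parent` are read before the search
    // for `childName` ends. `entries` is assumed to be in depth-first order
    // and to contain `parent`.
    static int entriesReadLookingFor(List<String> entries, String parent,
                                     String childName, int numPreferredChildren) {
        int start = entries.indexOf(parent) + 1;
        String prefix = parent + "/";
        int read = 0;
        int directChildrenSeen = 0;
        for (int i = start; i < entries.size(); i++) {
            String path = entries.get(i);
            read++;
            if (!path.startsWith(prefix)) {
                break; // left the subtree (e.g. reached /some-sibling)
            }
            String relative = path.substring(prefix.length());
            if (relative.indexOf('/') < 0) { // direct child of parent
                if (relative.equals(childName)) {
                    break; // preferred child found
                }
                directChildrenSeen++;
                if (directChildrenSeen >= numPreferredChildren) {
                    break; // enough direct children read to give up
                }
            }
        }
        return read;
    }
}
```

For a tree shaped like the one above (one direct child with four descendants, followed by /some-sibling) and two preferred children, this sketch reads all five remaining entries before giving up; with a single preferred child it stops after the first one.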

              People

              • Assignee: catholicon Vikas Saurabh
              • Reporter: catholicon Vikas Saurabh