Jackrabbit Oak / OAK-6081

Indexing tooling via oak-run



    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.4, 1.8.0
    • Component/s: indexing, run
    • Labels:


      To make indexing operations easier to manage, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

      The tool would support:

      1. For a DocumentNodeStore setup, oak-run could connect to a live cluster and handle the whole cycle: indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would keep disruption and extra load on the live setup to a minimum.
      2. For a SegmentNodeStore setup, indexing could be done on a cloned setup, with a way to copy the index back afterwards.
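
The index -> store on disk -> merge -> import cycle from item 1 can be illustrated with a toy model. This is only a sketch under loud assumptions: the repository is a plain dict, the "index" maps a property value to the paths carrying it, and every function name here is hypothetical rather than part of oak-run or Oak. The issue does not spell out what "merging" covers; the sketch assumes it means folding in content that changed while the index was being built.

```python
import json
import tempfile
from pathlib import Path


def build_index(snapshot):
    """Index step: map each (string) property value to the paths that carry it."""
    index = {}
    for path, value in snapshot.items():
        index.setdefault(value, []).append(path)
    return {value: sorted(paths) for value, paths in index.items()}


def store_on_disk(index, directory):
    """Store step: persist the index outside the repository (hypothetical layout)."""
    target = Path(directory) / "index.json"
    target.write_text(json.dumps(index))
    return target


def merge_updates(index_file, later_changes):
    """Merge step (assumed meaning): apply changes made after the snapshot."""
    index = json.loads(index_file.read_text())
    for path, value in later_changes.items():
        # Drop the stale entry for this path, if any, then add the new one.
        for paths in index.values():
            if path in paths:
                paths.remove(path)
        index.setdefault(value, []).append(path)
    return {value: sorted(paths) for value, paths in index.items() if paths}


def reindex_out_of_band(snapshot, later_changes):
    """Run index -> store -> merge and return the index ready for import."""
    with tempfile.TemporaryDirectory() as workdir:
        index_file = store_on_disk(build_index(snapshot), workdir)
        return merge_updates(index_file, later_changes)


# Import step: the live repository only sees the finished index.
repository = {"content": {"/a": "x", "/b": "y"}}
repository["index"] = reindex_out_of_band(
    repository["content"], later_changes={"/b": "x", "/c": "z"}
)
```

The point of the shape is that the expensive traversal runs against a snapshot and a local directory, so the live setup is touched only by the final import.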

      Future Enhancements

      1. Resumable traversal - It should be possible to reindex a large repository with a resumable traversal, so that if indexing breaks due to some issue it can resume from the last saved state (OAK-5833).
      2. Multithreaded traversal - Indexing is currently single threaded and can therefore take a long time on a large repository. The plan is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the resulting indexes are merged at the end.
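
Both enhancements can be sketched together on a toy model. The following is a minimal illustration, not Oak code: the repository is a dict of path -> value, the subtree roots are chosen by the caller, and `index_subtree`, `merge_indexes`, `parallel_reindex` and the `done` mapping are all hypothetical names. Each worker thread indexes one subtree, finished subtrees are recorded in `done` so an interrupted run can skip them, and the partial indexes are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def index_subtree(root, nodes):
    """Index every node at or below one subtree root: value -> list of paths."""
    partial = {}
    for path, value in nodes.items():
        if path == root or path.startswith(root + "/"):
            partial.setdefault(value, []).append(path)
    return partial


def merge_indexes(partials):
    """Combine the per-subtree indexes into a single index."""
    merged = {}
    for partial in partials:
        for value, paths in partial.items():
            merged.setdefault(value, []).extend(paths)
    return {value: sorted(paths) for value, paths in merged.items()}


def parallel_reindex(nodes, subtree_roots, done=None, workers=4):
    """Index subtrees in parallel, skipping those already present in `done`.

    `done` maps a subtree root to its partial index from an earlier,
    interrupted run; passing it back in resumes instead of restarting.
    """
    done = {} if done is None else done
    pending = [root for root in subtree_roots if root not in done]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda root: index_subtree(root, nodes), pending)
        for root, partial in zip(pending, results):
            done[root] = partial
    return merge_indexes(done[root] for root in subtree_roots)


nodes = {"/content/a": "x", "/content/b": "y", "/apps/c": "x"}
full = parallel_reindex(nodes, ["/content", "/apps"])

# Resume: "/content" was finished before the interruption, only "/apps" is redone.
saved = {"/content": {"x": ["/content/a"], "y": ["/content/b"]}}
resumed = parallel_reindex(nodes, ["/content", "/apps"], done=saved)
```

In a real setup the `done` state would have to be persisted to survive a crash; here it is kept in memory to keep the sketch short.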

              • Assignee:
                chetanm Chetan Mehrotra
              • Votes:
                1
              • Watchers:
                3


                • Created: