Jackrabbit Oak / OAK-6081

Indexing tooling via oak-run

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.4, 1.8.0
    • Component/s: indexing, run
    • Labels: None

    Description

      To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

      The tool would support:

      1. For a DocumentNodeStore setup, oak-run could connect to a live cluster and take care of the whole cycle: indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would keep disruption and additional load on the live setup to a minimum.
      2. For a SegmentNodeStore setup, indexing could be performed on a cloned setup, with a way provided to copy the index back.
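
To make the proposed flow concrete, here is a minimal toy sketch of the index -> store on disk -> import cycle. It is not oak-run code: the repository is modelled as a plain dict of path -> properties, and the names `build_index` and `reindex_out_of_band` are hypothetical, chosen only for illustration. The merge step is left out here.

```python
import json
import tempfile
from pathlib import Path


def build_index(snapshot, prop):
    """Map each value of `prop` to the sorted list of paths that carry it."""
    index = {}
    for path in sorted(snapshot):
        value = snapshot[path].get(prop)
        if value is not None:
            index.setdefault(value, []).append(path)
    return index


def reindex_out_of_band(live_repo, prop, work_dir):
    """Hypothetical stand-in for the tool: index a copy, stage on disk, import."""
    # 1. Index against a frozen copy so the live repository is only read once.
    snapshot = {path: dict(props) for path, props in live_repo.items()}
    index = build_index(snapshot, prop)

    # 2. Store the finished index on local disk, outside the repository.
    index_file = Path(work_dir) / f"{prop}-index.json"
    index_file.write_text(json.dumps(index))

    # 3. Import it back in one step at the end (here: just load and return it).
    return json.loads(index_file.read_text())
```

The point of the structure is that the expensive work happens against the copy and the local file, so the live side only sees one read pass and one import.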

      Future Enhancements

      1. Resumable traversal - Reindexing a large repository should use a resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833)
      2. Multithreaded traversal - Indexing is currently single-threaded and can therefore take a long time on a large repository. The plan is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the resulting indexes are merged at the end
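
Both enhancements can be sketched on a toy model, again with a dict standing in for the repository. All names below (`traverse`, `index_subtree`, `merge_indexes`, `parallel_index`) are hypothetical and not part of Oak. The `state` dict plays the role of the persisted "last state"; a real tool would write it to disk. Python threads are used only to show the partition-and-merge shape, not to claim a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def traverse(repo, root="/", after=None):
    """Yield paths under `root` in sorted order, skipping those up to `after`."""
    prefix = root.rstrip("/") + "/"
    for path in sorted(repo):
        in_subtree = path == root or path.startswith(prefix)
        if in_subtree and (after is None or path > after):
            yield path


def index_subtree(repo, root, prop, state=None):
    """Index one subtree; passing a previous `state` resumes after its last path."""
    state = state if state is not None else {"last": None, "index": {}}
    for path in traverse(repo, root, after=state["last"]):
        value = repo[path].get(prop)
        if value is not None:
            state["index"].setdefault(value, []).append(path)
        state["last"] = path  # record progress after every node
    return state


def merge_indexes(partials):
    """Combine per-subtree indexes into one, keeping paths sorted."""
    merged = {}
    for partial in partials:
        for value, paths in partial.items():
            merged.setdefault(value, []).extend(paths)
    return {value: sorted(paths) for value, paths in merged.items()}


def parallel_index(repo, roots, prop, workers=4):
    """Index each root on its own worker thread, then merge the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        states = list(pool.map(lambda root: index_subtree(repo, root, prop), roots))
    return merge_indexes(state["index"] for state in states)
```

Resuming works because the traversal order is deterministic: given the last path recorded, the next run can skip everything up to and including it. The roots handed to `parallel_index` are assumed to be non-overlapping subtrees, otherwise paths would be indexed twice.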

            People

              Assignee: Chetan Mehrotra (chetanm)
              Reporter: Chetan Mehrotra (chetanm)
              Votes: 1
              Watchers: 3
