  1. Maven Indexer
  2. MINDEXER-151

Speed up Index update from remote

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0.0
    • Component/s: None

    Description

      Currently, if you run BasicUsageExample from the examples, it performs a "full" update, and the full update (going from an "empty" index to an "up to date" index) takes 15 minutes or more. Yes, the Central index is huge, but there is room for improvement.

      Steps happening during update(s):

      • properties file downloaded
      • GZ file(s) downloaded (depending on whether the update is incremental or full)
      • the GZ files are processed into a temporary Lucene index
      • the target indexing context's index (the one being updated) is "replaced" (or merged, depending on the case) with the temporary Lucene index

      Downloading the files takes several seconds; it is the processing of the GZIP raw records into a Lucene index that takes a long time. This can be improved.

      IndexUpdateRequest gets a new field, int threads, with a default value of 1 (behavior is the same as before). When it is set to a value greater than 1 (accepted values are positive numbers), IndexDataReader behaves slightly differently than with threads=1: it creates N (threads) "silo" indexes, spawns N threads, and processes the input file on N threads into N silos that are merged at the end. This should improve the long update times (as the index is huge as well), ideally halving them, as experiments show (the ideal on my HW is 4 threads, which halves the full index update time).
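To illustrate the silo idea in isolation, here is a minimal plain-Java sketch. It is not the actual IndexDataReader code: the class SiloSketch, the method processInSilos, and the use of in-memory lists in place of Lucene indexes are all assumptions made for illustration. N workers drain a shared queue of raw records into N private silos, and the silos are merged once every worker has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: lists stand in for the per-thread "silo" Lucene indexes.
public class SiloSketch {

    public static List<String> processInSilos(List<String> records, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be a positive number");
        }
        // Shared input: every worker pulls the next unprocessed record from here.
        Queue<String> input = new ConcurrentLinkedQueue<>(records);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> silos = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                silos.add(pool.submit(() -> {
                    // Each worker writes only to its own silo, so no locking is needed.
                    List<String> silo = new ArrayList<>();
                    String line;
                    while ((line = input.poll()) != null) {
                        silo.add(line.trim()); // placeholder for "turn raw record into a document"
                    }
                    return silo;
                }));
            }
            // Merge step: runs after all workers are done, on the calling thread.
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> silo : silos) {
                merged.addAll(silo.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing silos", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a silo worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> merged = processInSilos(List.of(" a ", "b", " c", "d "), 4);
        System.out.println(merged.size() + " records merged");
    }
}
```

With threads=1 the sketch degenerates to a single silo, which mirrors the "same as before" default. The order of the merged records depends on thread scheduling, so callers of a scheme like this should not rely on it.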

      Using very large numbers may make things worse, as time may be spent on managing/merging silos, so the "sweet spot" is probably HW dependent.
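Since the best value depends on the hardware, one way to find it is simply to time a few candidates. The following is a rough sketch only: SweetSpotSketch and fastest are hypothetical names, and the workload is whatever the caller passes in (for example, a full index update run with the given thread count). A single timed run per candidate is noisy, so treat the result as a hint.

```java
import java.util.function.IntConsumer;

// Hypothetical helper: times one run of a workload per candidate thread count.
public class SweetSpotSketch {

    public static int fastest(int[] candidates, IntConsumer workload) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        int best = candidates[0];
        long bestNanos = Long.MAX_VALUE;
        for (int threads : candidates) {
            long start = System.nanoTime();
            workload.accept(threads); // e.g. run the update with this thread count
            long elapsed = System.nanoTime() - start;
            if (elapsed < bestNanos) {
                bestNanos = elapsed;
                best = threads;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Dummy workload that does nothing; replace it with a real update run.
        int best = fastest(new int[] {1, 2, 4, 8}, threads -> { });
        System.out.println("fastest candidate in this run: " + best);
    }
}
```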

          People

            cstamas Tamas Cservenak