Solr / SOLR-12854

Document steps to improve delta import via DataImportHandler



    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 7.5
    • Fix Version/s: None
    • Labels:


      Delta imports in DataImportHandler are sometimes slower than full imports, because a delta import issues multiple queries where a full import issues one, which increases the import time. This is described in: https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport

      In the mailing list thread http://lucene.472066.n3.nabble.com/Number-of-requests-spike-up-when-i-do-the-delta-Import-td4338162.html one Solr user noted a workaround that reportedly works well and improves delta import performance: specify ${dataimporter.last_index_time} in deltaImportQuery rather than in deltaQuery. The user wrote:

      I found a hacky way to limit the number of times deltaImportQuery was executed.
      As designed, Solr executes deltaQuery to get a list of ids that need to be indexed. For each of those, it executes deltaImportQuery, which is typically very similar to the full query.
      I constructed a deltaQuery to purposely only return 1 row. E.g.
      deltaQuery = "SELECT id FROM table WHERE rownum=1" // written for Oracle, likely requires a different syntax for other DBs. Also, it occurred to me you could probably include the date >= '${dataimporter.last_index_time}' filter here so this returns 0 rows if no data has changed.
      Since deltaImportQuery now only gets called once, I needed to add the filter logic to deltaImportQuery to only select the changed rows (that logic is normally in deltaQuery). E.g.
      deltaImportQuery = [normal import query] WHERE date >= 

      A number of other users have adopted this strategy and seen DIH delta import performance improve, so documenting it as a TIP would help other users too.
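
To illustrate why the workaround reduces database load, here is a minimal Python sketch that imitates the two query patterns against an in-memory SQLite table and counts the SQL statements issued. It is not DIH code: the `item` table, its columns, the sample dates, and the function names are all hypothetical, `LAST_INDEX_TIME` stands in for `${dataimporter.last_index_time}`, and `LIMIT 1` is the SQLite counterpart of the Oracle `rownum=1` trick.

```python
import sqlite3

# Stands in for ${dataimporter.last_index_time}; the value is made up.
LAST_INDEX_TIME = "2018-10-01 00:00:00"


def make_db(n_rows=100, n_changed=10):
    """Build an in-memory table where n_changed rows are newer than LAST_INDEX_TIME."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, modified TEXT)")
    rows = [
        (i, f"item-{i}",
         "2018-10-02 00:00:00" if i < n_changed else "2018-09-01 00:00:00")
        for i in range(n_rows)
    ]
    conn.executemany("INSERT INTO item VALUES (?, ?, ?)", rows)
    return conn


def standard_delta(conn):
    """deltaQuery lists the changed ids; deltaImportQuery then runs once per id."""
    queries = 1  # the deltaQuery itself
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM item WHERE modified >= ?", (LAST_INDEX_TIME,))]
    docs = []
    for id_ in ids:
        queries += 1  # one deltaImportQuery per changed row
        docs.extend(conn.execute(
            "SELECT id, name FROM item WHERE id = ?", (id_,)).fetchall())
    return docs, queries


def single_row_delta(conn):
    """deltaQuery returns at most one row; the date filter moves into deltaImportQuery."""
    queries = 1  # the deltaQuery itself
    trigger = conn.execute(
        "SELECT id FROM item WHERE modified >= ? LIMIT 1",
        (LAST_INDEX_TIME,)).fetchall()
    docs = []
    for _ in trigger:  # zero iterations if nothing changed, otherwise exactly one
        queries += 1
        docs.extend(conn.execute(
            "SELECT id, name FROM item WHERE modified >= ?",
            (LAST_INDEX_TIME,)).fetchall())
    return docs, queries


if __name__ == "__main__":
    conn = make_db()
    docs_a, q_a = standard_delta(conn)
    docs_b, q_b = single_row_delta(conn)
    print(q_a, q_b)  # 11 2
```

Both functions return the same ten changed rows, but the standard pattern issues one statement per changed row plus one, while the single-row pattern issues two statements regardless of how many rows changed.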


              • Assignee:
                sarkaramrit2@gmail.com Amrit Sarkar
              • Votes:
                0
              • Watchers:
                1


                • Created: