SOLR-1004

Optimizing the abort command in delta import

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Labels:
      None
    • Environment:

      Java - Lucene - Solr - DataImportHandler

      Description

      When the abort command is issued during a delta import, the doDelta function in DocBuilder.java checks for the abort in only three places: at the beginning of collectDelta, at the end of collectDelta, and after that function returns.
      The problem is that if there is a large number of documents to modify and abort is called in the middle of delta collection, it does not take effect until all the data has been collected.
      The same happens once we start deleting or updating documents. In the update case there is an abort check inside buildDocument but, because it is called inside a "while" loop over all the docs to update, the loop keeps iterating through every remaining doc and skipping each one.
      I propose adding an abort check inside every data-collection loop and after the call to buildDocument in the doDelta function.
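      To illustrate the difference, here is a minimal, self-contained sketch. It is not the real DocBuilder: the class AbortLoopSketch, its counters, and the abortAtPk parameter are hypothetical names made up for this example. It only assumes that the abort command sets a shared AtomicBoolean flag, like the stop flag used in the snippets in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the update loop in doDelta; not the real DocBuilder.
class AbortLoopSketch {
    // Shared flag that the abort command would set.
    final AtomicBoolean stop = new AtomicBoolean(false);
    int iterations = 0; // how many times the loop body ran
    int built = 0;      // how many documents were actually built

    // Stand-in for buildDocument: it returns early once stop is set.
    void buildDocument(String pk, String abortAtPk) {
        if (stop.get()) {
            return; // the check that already exists inside buildDocument
        }
        built++;
        if (pk.equals(abortAtPk)) {
            stop.set(true); // simulate an abort arriving in the middle of the import
        }
    }

    // checkInLoop == false: only buildDocument looks at the flag.
    // checkInLoop == true: the loop also checks the flag after each call.
    void run(List<String> pks, String abortAtPk, boolean checkInLoop) {
        Iterator<String> pkIter = new ArrayList<>(pks).iterator();
        while (pkIter.hasNext()) {
            iterations++;
            buildDocument(pkIter.next(), abortAtPk);
            pkIter.remove();
            if (checkInLoop && stop.get()) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        List<String> pks = Arrays.asList("1", "2", "3", "4", "5");

        AbortLoopSketch current = new AbortLoopSketch();
        current.run(pks, "2", false);

        AbortLoopSketch proposed = new AbortLoopSketch();
        proposed.run(pks, "2", true);

        // Both build 2 documents, but only the proposed loop stops iterating.
        System.out.println(current.iterations + " vs " + proposed.iterations); // prints "5 vs 2"
    }
}
```

      In both runs only two documents are built, but without the check in the loop every remaining key is still visited, which is the cost that grows with the size of the delta.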

      In the case of modifying documents, the code in DocBuilder.java would look like:

      while (pkIter.hasNext()) {
        Map<String, Object> map = pkIter.next();
        vri.addNamespace(DataConfig.IMPORTER_NS + ".delta", map);
        buildDocument(vri, null, map, root, true, null);
        pkIter.remove();
        // check whether an abort was requested
        if (stop.get()) {
          allPks = null;
          pkIter = null;
          return;
        }
      }

      In the case of document deletion (the deleteAll function in DocBuilder), just add

      if (stop.get()) {
        break;
      }

      at the end of every loop iteration, and add this just after deleteAll is called (in doDelta):

      if (stop.get()) {
        allPks = null;
        deletedKeys = null;
        return;
      }

      Finally, in collectDelta:

      while (true) {
        // check whether an abort was requested
        if (stop.get()) {
          return myModifiedPks;
        }

        Map<String, Object> row = entityProcessor.nextModifiedRowKey();

        if (row == null)
          break;
        ...

      The same applies to the delete-query collection and the parent-delta-query collection.
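      Since the three collection loops share the same shape, the check can be sketched once in a generic form. This is only an illustration under stated assumptions: CollectWithAbortSketch, collect, and rows are hypothetical names, and a Supplier stands in for whichever entity-processor call feeds the loop. It is not how DocBuilder is actually structured.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch of a collection loop that honours an abort flag.
class CollectWithAbortSketch {

    // Drains rowSource until it yields null or the stop flag is set,
    // and returns whatever keys were collected up to that point.
    static Set<Map<String, Object>> collect(Supplier<Map<String, Object>> rowSource,
                                            AtomicBoolean stop) {
        Set<Map<String, Object>> keys = new LinkedHashSet<>();
        while (true) {
            // check for an abort before fetching the next row
            if (stop.get()) {
                return keys;
            }
            Map<String, Object> row = rowSource.get();
            if (row == null) {
                break;
            }
            keys.add(row);
        }
        return keys;
    }

    // Hypothetical row source: hands out `total` rows and flips `stop`
    // once `abortAfter` rows have been served (never, if abortAfter < 1).
    static Supplier<Map<String, Object>> rows(int total, int abortAfter, AtomicBoolean stop) {
        int[] served = {0};
        return () -> {
            if (served[0] >= total) {
                return null;
            }
            Map<String, Object> row = new HashMap<>();
            row.put("id", served[0]);
            served[0]++;
            if (served[0] == abortAfter) {
                stop.set(true);
            }
            return row;
        };
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean(false);
        // The abort arrives after 3 of 1000 rows, so collection ends early.
        System.out.println(collect(rows(1000, 3, stop), stop).size()); // prints 3
    }
}
```

      The partial set is returned to the caller, which is expected to check the flag again before using it, as in the doDelta checks described above.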

      I didn't attach the patch because this is the first time I have opened an issue and I don't know whether you would want to code it the way I did. I just wanted to explain the idea and how I solved it; I think it can be useful for other users.

      1. SOLR-1004.patch
        2 kB
        Shalin Shekhar Mangar

        Activity

        Shalin Shekhar Mangar added a comment -

        Changes

        1. Check for abort in nextModifiedRow detection
        2. Check for abort in nextDeletedRow
        3. Check in doDelta
        4. Check getModifiedParentRowKey

        Marc, can you see the patch to ensure all your changes got in?

        Shalin Shekhar Mangar added a comment -

        Committed revision 745742.

        Thanks Marc!

        Grant Ingersoll added a comment -

        Bulk close for Solr 1.4


          People

          • Assignee:
            Shalin Shekhar Mangar
            Reporter:
            Marc Sturlese
          • Votes:
            0
            Watchers:
            0

              Time Tracking

              Original Estimate: 0.5h
              Remaining Estimate: 0.5h
              Time Spent: Not Specified
