Solr / SOLR-974

DataImportHandler should not commit if no data has been updated

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Labels:
      None

      Description

      The DataImportHandler always finishes an import with a commit, even if it retrieved no data from its data source. Add a short circuit that skips the commit when no data was imported.

      Related discussion:
      http://www.nabble.com/Performance-Hit-for-Zero-Record-Dataimport-td21572935.html

      1. SOLR-974.patch
        4 kB
        Shalin Shekhar Mangar
      2. SOLR-974.patch
        4 kB
        Shalin Shekhar Mangar

        Issue Links

          Activity

          Shalin Shekhar Mangar added a comment -

          Changes

          1. If the command is delta-import and the 'clean' parameter is false or not specified, commit is not called when no documents were created and none were identified for deletion.
          Wojtek Piaseczny added a comment -

          Why only if the command is delta-import? I'm managing my updates within my DB, so I'm always using the full-import command.

          Shalin Shekhar Mangar added a comment -

          Fair enough. We can extend this to full import if the user specified clean=false. I'll update the patch.

          Shalin Shekhar Mangar added a comment -

          Changed to skip commit if no documents were created.

          Note – the onImportEnd event listener is still invoked even if no documents were created and commit was skipped. I think that is alright.
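
The behavior described in this comment can be illustrated with a minimal sketch. This is not the committed patch: `ImportStats`, `CommitSkipSketch`, and `finish` are hypothetical names made up for illustration, and only the "documents created" counter mentioned here is modeled. It shows the commit being skipped when nothing was created while the end-of-import event still fires.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the importer's counters; not the real DIH class.
final class ImportStats {
    long docsCreated;
}

final class CommitSkipSketch {
    /** Returns true when at least one document was created, so a commit is warranted. */
    static boolean shouldCommit(ImportStats stats) {
        return stats.docsCreated > 0;
    }

    /** Records what happens at the end of an import, in order. */
    static List<String> finish(ImportStats stats) {
        List<String> events = new ArrayList<>();
        if (shouldCommit(stats)) {
            events.add("commit");
        }
        // The end-of-import event is fired whether or not a commit happened.
        events.add("onImportEnd");
        return events;
    }

    public static void main(String[] args) {
        ImportStats empty = new ImportStats();
        System.out.println(finish(empty)); // [onImportEnd]

        ImportStats busy = new ImportStats();
        busy.docsCreated = 3;
        System.out.println(finish(busy));  // [commit, onImportEnd]
    }
}
```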

          Karthik K added a comment -
          "Note - the onImportEnd event listener is still invoked even if no documents were created and commit was skipped. I think that is alright."

          Is there anything in the Context object that says that no documents were created and the commit was skipped? Otherwise, the onImportEnd listener would continue to execute even when no documents were actually imported, which is not so useful.

          Shalin Shekhar Mangar added a comment -

          No, nothing right now. The XML response would say that no documents were created. However, one can add a postCommit or newSearcher listener if a commit is all you are interested in.

          Noble Paul added a comment -

          I guess the best thing is to expose the 'stats' as a variable in DIH. This can also be exposed through the Context#getStats()
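
As a rough illustration of this proposal, a listener could read the import statistics from its context and return early when nothing was imported. Everything below is an assumption: `SketchContext`, its `getStats()` method, and the `"docsCreated"` key are hypothetical and do not describe the actual DIH `Context` API, which this thread says exposes nothing of the kind yet.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical context; per this thread, the real Context exposes no stats yet.
final class SketchContext {
    private final Map<String, Long> stats;

    SketchContext(Map<String, Long> stats) {
        // Defensive copy so listeners see a read-only snapshot.
        this.stats = Collections.unmodifiableMap(new HashMap<>(stats));
    }

    Map<String, Long> getStats() {
        return stats;
    }
}

final class ImportEndListenerSketch {
    /** Returns true if the listener did its work, false if it bailed out early. */
    static boolean onImportEnd(SketchContext ctx) {
        long created = ctx.getStats().getOrDefault("docsCreated", 0L);
        if (created == 0) {
            return false; // nothing was imported, so there is nothing to react to
        }
        // A real listener would react to the import here.
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> stats = new HashMap<>();
        stats.put("docsCreated", 0L);
        System.out.println(onImportEnd(new SketchContext(stats))); // false

        stats.put("docsCreated", 5L);
        System.out.println(onImportEnd(new SketchContext(stats))); // true
    }
}
```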

          Shalin Shekhar Mangar added a comment -

          "I guess the best thing is to expose the 'stats' as a variable in DIH. This can also be exposed through the Context#getStats()"

          I like this idea. I'll give a patch.

          Shalin Shekhar Mangar added a comment -

          Committed revision 738020.

          Thanks Wojtek!

          Kay, we can work on exposing the statistics through context with SOLR-989. With this change, one can easily detect if any documents were created or not.

          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4


            People

            • Assignee:
              Shalin Shekhar Mangar
            • Reporter:
              Wojtek Piaseczny
            • Votes:
              0
            • Watchers:
              0
