Solr
  1. Solr
  2. SOLR-2551

Check dataimport.properties for write access before starting import

    Details

      Description

      A common mistake is that the /conf (respectively the dataimport.properties) file is not writable for solr. It would be great if that were detected on starting a dataimport job.

      Currently and import might grind away for days and fail if it can't write its timestamp to the dataimport.properties file.

      1. SOLR-2551.patch
        6 kB
        Shalin Shekhar Mangar

        Issue Links

          Activity

          Hide
          Shalin Shekhar Mangar added a comment -

          If DIH is unable to write dataimport.properties, it logs a message saying so. We don't want the import to fail in this case because a lot of people use only full-imports which does not need the dataimport.properties at all. What do you suggest?

          Show
          Shalin Shekhar Mangar added a comment - If DIH is unable to write dataimport.properties, it logs a message saying so. We don't want the import to fail in this case because a lot of people use only full-imports which does not need the dataimport.properties at all. What do you suggest?
          Hide
          C S added a comment -

          I might be wrong but isn't the dataimport.properties written even if no delta-query is configured? So even if you'd just be interested in full imports, you'll run into an exception at the end of your full import when solr attempts to write the timestamp into dataimport.properties.

          However, if that's not the case (i.e. a non-writable dataimport.properties does not break a full import), i'd suggest that the check if dataimport.properties is writable should only be done if a delta-query is defined, and in this case it should refuse to start the import.

          Show
          C S added a comment - I might be wrong but isn't the dataimport.properties written even if no delta-query is configured? So even if you'd just be interested in full imports, you'll run into an exception at the end of your full import when solr attempts to write the timestamp into dataimport.properties. However, if that's not the case (i.e. a non-writable dataimport.properties does not break a full import), i'd suggest that the check if dataimport.properties is writable should only be done if a delta-query is defined, and in this case it should refuse to start the import.
          Hide
          Shalin Shekhar Mangar added a comment -

          Patch to fail imports if the data config supports delta import but the dataimport.properties file is not writable.

          Added a test to verify.

          Show
          Shalin Shekhar Mangar added a comment - Patch to fail imports if the data config supports delta import but the dataimport.properties file is not writable. Added a test to verify.
          Hide
          Shalin Shekhar Mangar added a comment -

          Committed revision 1135954 on trunk and 1135956 on branch_3x.

          Show
          Shalin Shekhar Mangar added a comment - Committed revision 1135954 on trunk and 1135956 on branch_3x.
          Hide
          Robert Muir added a comment -

          Bulk close for 3.3

          Show
          Robert Muir added a comment - Bulk close for 3.3
          Hide
          Chris Male added a comment -

          We're seeing some test failures on Jenkins due to exceptions being thrown by DataImporter.checkWritablePersistFile.

          Any chance you can shed some light on the issue Shalin?

          Show
          Chris Male added a comment - We're seeing some test failures on Jenkins due to exceptions being thrown by DataImporter.checkWritablePersistFile. Any chance you can shed some light on the issue Shalin?
          Hide
          Shalin Shekhar Mangar added a comment -

          I'm taking a look. Thanks Chris.

          Show
          Shalin Shekhar Mangar added a comment - I'm taking a look. Thanks Chris.
          Hide
          Shalin Shekhar Mangar added a comment -

          The tests run in parallel so for the brief time that the dataimport.properties is set to read-only, other tests can fail. There is no way to change the location of the properties file so I don't see a way to fix this. Should we ignore this particular test?

          Show
          Shalin Shekhar Mangar added a comment - The tests run in parallel so for the brief time that the dataimport.properties is set to read-only, other tests can fail. There is no way to change the location of the properties file so I don't see a way to fix this. Should we ignore this particular test?
          Hide
          Chris Male added a comment -

          Amazingly, I came to the exact same conclusion at the same time. Doesn't every test of delta functionality write to the dataimport.properties file?

          Show
          Chris Male added a comment - Amazingly, I came to the exact same conclusion at the same time. Doesn't every test of delta functionality write to the dataimport.properties file?
          Hide
          Shalin Shekhar Mangar added a comment -

          Doesn't every test of delta functionality write to the dataimport.properties file?

          Yes, it does but I don't think any of our tests rely on the contents of the properties file.

          Ironically, the fact that the tests failed is proof that this feature works

          Show
          Shalin Shekhar Mangar added a comment - Doesn't every test of delta functionality write to the dataimport.properties file? Yes, it does but I don't think any of our tests rely on the contents of the properties file. Ironically, the fact that the tests failed is proof that this feature works
          Hide
          Chris Male added a comment -

          Okay, so, given that basically every test writes to this file, what are our options?

          To me it seems since the file is getting written too (whether we rely on the contents or not), this could get in the way of another test. So perhaps we need to pull the checkWritablePersistFile method out for awhile and re-assess how to achieve the same functionality in a way the tests can handle?

          Show
          Chris Male added a comment - Okay, so, given that basically every test writes to this file, what are our options? To me it seems since the file is getting written too (whether we rely on the contents or not), this could get in the way of another test. So perhaps we need to pull the checkWritablePersistFile method out for awhile and re-assess how to achieve the same functionality in a way the tests can handle?
          Hide
          Chris Male added a comment -

          or alternatively, we could make the DIH tests run sequentially, so we don't hit this problem.

          Show
          Chris Male added a comment - or alternatively, we could make the DIH tests run sequentially, so we don't hit this problem.
          Hide
          Shalin Shekhar Mangar added a comment -

          Yes, lets disable this test for now. I don't think it is even worth testing. I guess I just had too much time that day

          Another option could be to run the DIH tests sequentially.

          Show
          Shalin Shekhar Mangar added a comment - Yes, lets disable this test for now. I don't think it is even worth testing. I guess I just had too much time that day Another option could be to run the DIH tests sequentially.
          Hide
          Steve Rowe added a comment - - edited

          I'll switch the DIH tests to run sequentially. The benchmark module does this by setting the tests.threadspercpu property to zero.

          Here's the patch:

          Index: solr/contrib/dataimporthandler/build.xml
          ===================================================================
          --- solr/contrib/dataimporthandler/build.xml    (revision 1145189)
          +++ solr/contrib/dataimporthandler/build.xml    (working copy)
          @@ -23,6 +23,9 @@
               Data Import Handler
             </description>
           
          +  <!-- the tests have some parallel problems: writability to single copy of dataimport.properties -->
          +  <property name="tests.threadspercpu" value="0"/>
          +
             <import file="../contrib-build.xml"/>
           
           </project>
          

          Committing shortly.

          Show
          Steve Rowe added a comment - - edited I'll switch the DIH tests to run sequentially. The benchmark module does this by setting the tests.threadspercpu property to zero. Here's the patch: Index: solr/contrib/dataimporthandler/build.xml =================================================================== --- solr/contrib/dataimporthandler/build.xml (revision 1145189) +++ solr/contrib/dataimporthandler/build.xml (working copy) @@ -23,6 +23,9 @@ Data Import Handler </description> + <!-- the tests have some parallel problems: writability to single copy of dataimport.properties --> + <property name="tests.threadspercpu" value="0"/> + <import file="../contrib-build.xml"/> </project> Committing shortly.
          Hide
          Steve Rowe added a comment -

          Committed the patch to run DIH tests sequentially:

          • r1145194: trunk
          • r1145196: branch_3x
          Show
          Steve Rowe added a comment - Committed the patch to run DIH tests sequentially: r1145194: trunk r1145196: branch_3x
          Hide
          Shalin Shekhar Mangar added a comment -

          Thanks Steven!

          Show
          Shalin Shekhar Mangar added a comment - Thanks Steven!
          Hide
          Steve Rowe added a comment -

          The Lucene-Solr-tests-only-trunk Jenkins job has run only once since the DIH tests were made to run sequentially (https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9500/), so I'll delay closing this issue until it's successfully run 15 or 20 more times, which should take less than one day.

          Show
          Steve Rowe added a comment - The Lucene-Solr-tests-only-trunk Jenkins job has run only once since the DIH tests were made to run sequentially ( https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9500/ ), so I'll delay closing this issue until it's successfully run 15 or 20 more times, which should take less than one day.
          Hide
          Chris Male added a comment -

          Seems running the tests sequentially has solved the error.

          Show
          Chris Male added a comment - Seems running the tests sequentially has solved the error.
          Hide
          Uwe Schindler added a comment -

          Bulk close after 3.4 is released

          Show
          Uwe Schindler added a comment - Bulk close after 3.4 is released

            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              C S
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development