  1. Solr
  2. SOLR-4788

Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2, 4.3
    • Fix Version/s: 4.4, 6.0
    • None

    Description

      conf/dataimport.properties
      entity1.last_index_time=2013-05-06 03\:02\:06
      last_index_time=2013-05-06 03\:05\:22
      entity2.last_index_time=2013-05-06 03\:03\:14
      entity3.last_index_time=2013-05-06 03\:05\:22
      
      conf/solrconfig.xml
      <?xml version="1.0" encoding="UTF-8" ?>
      ...
      
      
          <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
              <lst name="defaults">
                  <str name="config">dihconfig.xml</str>
              </lst>
          </requestHandler>
      ...
      
      conf/dihconfig.xml
      <?xml version="1.0" encoding="UTF-8" ?>
      <dataConfig>
          <dataSource name="source1"
                      type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
                      url="jdbc:mysql://*:*/*"
                      user="*" password="*"/>
      
          <document name="strings">
              <entity name="entity1" pk="id" dataSource="source1"
                      query="SELECT * FROM table_a"
                      deltaQuery="SELECT table_a_id FROM table_b WHERE last_modified > '${dataimporter.entity1.last_index_time}'"
                      deltaImportQuery="SELECT * FROM table_a WHERE id = '${dataimporter.entity1.id}'"
                      transformer="TemplateTransformer">
                  <field> ...
                    ... 
                  ... </field>
              </entity>
              <entity name="entity2">
                    ... 
                    ...
              </entity>
              <entity name="entity3">
                    ... 
                    ...
              </entity>
          </document>
      </dataConfig>
      

       

      In the above setup, dataimporter.entity1.last_index_time resolves to an empty string, which causes the SQL query to fail.
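
      The symptom can be illustrated outside of Solr with a minimal sketch. This is not DIH's actual variable resolver: the class LastIndexTimeSketch and its load and resolve methods are hypothetical names, and the choice to substitute an empty string for an unknown variable is an assumption made to mirror the behavior reported here. It shows two things: java.util.Properties unescapes the "\:" sequences seen in dataimport.properties, and a missing entity-scoped value leaves the deltaQuery comparing against ''.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastIndexTimeSketch {
    // Matches ${some.variable.name} placeholders.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");
    private static final String PREFIX = "dataimporter.";

    /** Parses dataimport.properties-style text; Properties.load turns "\:" into ":". */
    static Properties load(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    /**
     * Hypothetical resolver: strips the "dataimporter." prefix and looks the rest
     * up in props. Unknown variables become "" (assumed, to mimic the reported bug).
     */
    static String resolve(String template, Properties props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String key = name.startsWith(PREFIX) ? name.substring(PREFIX.length()) : name;
            m.appendReplacement(out, Matcher.quoteReplacement(props.getProperty(key, "")));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String deltaQuery = "SELECT table_a_id FROM table_b WHERE last_modified > "
                + "'${dataimporter.entity1.last_index_time}'";

        // Expected case: the entity-scoped value is available to the resolver.
        Properties props = load("entity1.last_index_time=2013-05-06 03\\:02\\:06\n");
        System.out.println(resolve(deltaQuery, props));

        // Reported case: the entity-scoped value is not visible, so the query gets ''.
        System.out.println(resolve(deltaQuery, new Properties()));
    }
}
```

      In the second call the WHERE clause ends in last_modified > '', which is the kind of malformed comparison that the database rejects in this report.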

      Attachments

        1. SOLR-4788.patch
          3 kB
          James Dyer
        2. entitytest.patch
          7 kB
          Shawn Heisey
        3. entitytest.patch
          8 kB
          Shawn Heisey
        4. entitytest.patch
          4 kB
          Shawn Heisey
        5. entitytest.patch
          4 kB
          Shawn Heisey
        6. entitytest.patch
          5 kB
          Shawn Heisey

          Activity

            badllama77 Eric added a comment -

            Same issue

            solr-spec: 4.1.0.2013.01.16.17.21.36
            solr-impl: 4.1.0 1434440 - sarowe - 2013-01-16 17:21:36
            lucene-spec: 4.1.0

            billnbell Bill Bell added a comment -

            Has someone fixed this? I am not seeing a patch.

            elyograg Shawn Heisey added a comment -

            billnbell I started looking into the code for a user on IRC. I got incredibly lost and never did find the problem. I'm willing to take another look, but I'll need guidance, and whoever could guide me would probably be able to find the problem faster than I can.

            elyograg Shawn Heisey added a comment -

            A review of all Solr issues that mention last_index_time turns up SOLR-4051 (via SOLR-1970) as a possible candidate for the commit that broke this functionality. This assumes of course that it worked after the feature was added by SOLR-783, which is probably a safe assumption.

            SOLR-4051 says that it patches functionality that was introduced to 3.6. I think that was added by SOLR-2382, so it might have been SOLR-2382 that broke things.

            If I get some time in the near future I will attempt to write a test that illustrates the bug, and see if I can run that test on 3.6 as well. If anyone out there can try a manual test on 3.6, that would save some time.

            Side note: the code uses two constants for "last_index_time" - LAST_INDEX_TIME and LAST_INDEX_KEY. Those should probably be combined.

            elyograg Shawn Heisey added a comment -

            From what I can tell, there have never been any tests for the [entityName].last_index_time properties, so I have no idea when this problem started happening.

            I tried to create a test for this in trunk by duplicating TestSqlEntityProcessorDelta to a new class called TestSqlEntityProcessorDeltaEntity, and then changing all the dih.last_index_time values so they have the proper entity name. The test fails, showing twice as many database calls as it expected. Hopefully someone can tell me what I did wrong.

            Attaching entitytest.patch.

            elyograg Shawn Heisey added a comment -

            After I wrote that last comment, it occurred to me that the test might in fact be doing what it's supposed to be doing - failing because of the bug. I can't actually tell, because the logging config isn't right for the test that I copied, so it's not right for my test. I'm going to see whether I can figure out how to get log4j configured.

            elyograg Shawn Heisey added a comment -

            I figured out how to set up log4j.properties for the dataimport tests. I think the test is actually working correctly and showing the bug. New entitytest.patch attached.

            elyograg Shawn Heisey added a comment -

            Better test. Rather than completely duplicate the existing test, this version extends it instead and overrides two methods. It probably needs a better name.

            I have no idea how to fix the problem, but at least we have a way to detect when it's fixed.

            elyograg Shawn Heisey added a comment -

            Another update to the patch. Import cleanup, javadoc update. Gave the test class a slightly better (but not imaginative) name.

            elyograg Shawn Heisey added a comment -

            Tiny additional patch update - fixed imports on parent class to eliminate warnings in Eclipse.

            elyograg Shawn Heisey added a comment -

            Arun Rangarajan posted to solr-user that he was running into this problem on an upgrade from 3.6.2 to 4.2.1, so now we know that it worked properly in the 3.x versions.

            billnbell Bill Bell added a comment -

            We are also running into this issue. Not sure how it happens yet though.

            jdyer James Dyer added a comment -

            Here is a patch with test coverage & a fix. I can commit this after the weekend.

            billnbell Bill Bell added a comment -

            Can we get this into 4.4?


            shalin Shalin Shekhar Mangar added a comment -

            Thanks James for the patch.

            I'm going to commit this to make sure that it gets into 4.4.

            jira-bot ASF subversion and git services added a comment -

            Commit 1500652 from shalin@apache.org
            [ https://svn.apache.org/r1500652 ]

            SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

            jira-bot ASF subversion and git services added a comment -

            Commit 1500662 from shalin@apache.org
            [ https://svn.apache.org/r1500662 ]

            SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

            shalin Shalin Shekhar Mangar added a comment -

            elyograg - Can you please open a separate issue for the logging configuration?
            sarowe Steven Rowe added a comment -

            Bulk close resolved 4.4 issues


            People

              Assignee: shalin Shalin Shekhar Mangar
              Reporter: chakming chakming wong
              Votes: 3
              Watchers: 9
