SOLR-4788

Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2, 4.3
    • Fix Version/s: 4.4, 6.0
    • Labels:
      None
    • Environment:

      Description

      conf/dataimport.properties
      entity1.last_index_time=2013-05-06 03\:02\:06
      last_index_time=2013-05-06 03\:05\:22
      entity2.last_index_time=2013-05-06 03\:03\:14
      entity3.last_index_time=2013-05-06 03\:05\:22
      
      conf/solrconfig.xml
      <?xml version="1.0" encoding="UTF-8" ?>
      ...
      
      
          <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
              <lst name="defaults">
                  <str name="config">dihconfig.xml</str>
              </lst>
          </requestHandler>
      ...
      
      conf/dihconfig.xml
      <?xml version="1.0" encoding="UTF-8" ?>
      <dataConfig>
          <dataSource name="source1"
                      type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
                      url="jdbc:mysql://*:*/*"
                      user="*" password="*"/>
      
          <document name="strings">
              <entity name="entity1" pk="id" dataSource="source1"
                      query="SELECT * FROM table_a"
                      deltaQuery="SELECT table_a_id FROM table_b WHERE last_modified > '${dataimporter.entity1.last_index_time}'"
                      deltaImportQuery="SELECT * FROM table_a WHERE id = '${dataimporter.entity1.id}'"
                      transformer="TemplateTransformer">
                  <field> ...
                    ... 
                  ... </field>
              </entity>
              <entity name="entity2">
                    ... 
                    ...
              </entity>
              <entity name="entity3">
                    ... 
                    ...
              </entity>
          </document>
      </dataConfig>
      

       

      With the above setup, ${dataimporter.entity1.last_index_time} resolves to an empty string, which causes the SQL delta query to fail with an error.
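
A minimal sketch of the failure mode described above, in Java. It assumes a simplified `${...}` resolver that replaces unknown names with an empty string; `LastIndexTimeDemo`, `resolve`, and `loadProps` are hypothetical names for illustration and are not DIH's actual implementation. `java.util.Properties` is the standard JDK class, and it unescapes `\:` to `:` when loading.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastIndexTimeDemo {
    // Matches ${name} placeholders such as ${dataimporter.entity1.last_index_time}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Hypothetical stand-in for a variable resolver (assumption, not DIH code):
    // any name that is not in the map is replaced with an empty string.
    static String resolve(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Parses text in the dataimport.properties format shown in the report.
    static Properties loadProps(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = loadProps(
                "entity1.last_index_time=2013-05-06 03\\:02\\:06\n"
              + "last_index_time=2013-05-06 03\\:05\\:22\n");
        String deltaQuery = "SELECT table_a_id FROM table_b WHERE last_modified > "
                + "'${dataimporter.entity1.last_index_time}'";

        // Symptom from the report: only the global value is exposed,
        // so the entity-scoped placeholder collapses to ''.
        Map<String, String> onlyGlobal = new HashMap<>();
        onlyGlobal.put("dataimporter.last_index_time", p.getProperty("last_index_time"));
        System.out.println(resolve(deltaQuery, onlyGlobal));

        // Expected behaviour: the entity-scoped value is exposed as well.
        Map<String, String> withEntity = new HashMap<>(onlyGlobal);
        withEntity.put("dataimporter.entity1.last_index_time",
                p.getProperty("entity1.last_index_time"));
        System.out.println(resolve(deltaQuery, withEntity));
    }
}
```

The first printed query ends in `last_modified > ''`, which is the shape of the broken statement; the second carries the timestamp stored for entity1.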

      1. entitytest.patch
        5 kB
        Shawn Heisey
      2. entitytest.patch
        4 kB
        Shawn Heisey
      3. entitytest.patch
        4 kB
        Shawn Heisey
      4. entitytest.patch
        8 kB
        Shawn Heisey
      5. entitytest.patch
        7 kB
        Shawn Heisey
      6. SOLR-4788.patch
        3 kB
        James Dyer

          Activity

          Eric added a comment -

          Same issue

          solr-spec
          4.1.0.2013.01.16.17.21.36

          solr-impl
          4.1.0 1434440 - sarowe - 2013-01-16 17:21:36

          lucene-spec
          4.1.0

          Bill Bell added a comment -

          Has someone fixed this? I am not seeing a patch.

          Shawn Heisey added a comment -

          Bill Bell: I started looking into the code for a user on IRC. I got incredibly lost and never did find the problem. I'm willing to take another look, but I'll need guidance, and whoever could guide me would probably find the problem faster than I can.

          Shawn Heisey added a comment -

          A review of all Solr issues that mention last_index_time turns up SOLR-4051 (via SOLR-1970) as a possible candidate for the commit that broke this functionality. This assumes of course that it worked after the feature was added by SOLR-783, which is probably a safe assumption.

          SOLR-4051 says that it patches functionality that was introduced to 3.6. I think that was added by SOLR-2382, so it might have been SOLR-2382 that broke things.

          If I get some time in the near future I will attempt to write a test that illustrates the bug, and see if I can run that test on 3.6 as well. If anyone out there can try a manual test on 3.6, that would save some time.

          Side note: the code uses two constants for "last_index_time" - LAST_INDEX_TIME and LAST_INDEX_KEY. Those should probably be combined.
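
A minimal sketch of what combining the two constants could look like. `DihPropertyKeys` and `entityKey` are hypothetical names chosen for illustration and are not taken from the DIH code base; the key format follows the entries shown in conf/dataimport.properties.

```java
public final class DihPropertyKeys {
    // Single source of truth for the property name (replaces the idea of
    // keeping two separate constants with the same value).
    public static final String LAST_INDEX_TIME = "last_index_time";

    private DihPropertyKeys() {
        // Constants holder, not meant to be instantiated.
    }

    // Builds the entity-scoped key, e.g. "entity1.last_index_time".
    public static String entityKey(String entityName) {
        return entityName + "." + LAST_INDEX_TIME;
    }
}
```

With one constant, the global key and every entity-scoped key are derived from the same string, so they cannot drift apart.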

          Shawn Heisey added a comment -

          From what I can tell, there have never been any tests for the [entityName].last_index_time properties, so I have no idea when this problem started happening.

          I tried to create a test for this in trunk by duplicating TestSqlEntityProcessorDelta to a new class called TestSqlEntityProcessorDeltaEntity and then changing all the dih.last_index_time values so they have the proper entity name, but the test fails, showing twice as many database calls as it expected. Hopefully someone can tell me what I did wrong.

          Attaching entitytest.patch.

          Shawn Heisey added a comment -

          After I wrote that last comment, it occurred to me that the test might in fact be doing what it's supposed to be doing - failing because of the bug. I can't actually tell, because the logging config isn't right for the test that I copied, so it's not right for my test. I'm going to see whether I can figure out how to get log4j configured.

          Shawn Heisey added a comment -

          I figured out how to set up log4j.properties for the dataimport tests. I think the test is actually working correctly and showing the bug. New entitytest.patch attached.

          Shawn Heisey added a comment -

          Better test. Rather than completely duplicate the existing test, this version extends it instead and overrides two methods. It probably needs a better name.

          I have no idea how to fix the problem, but at least we have a way to detect when it's fixed.

          Shawn Heisey added a comment -

          Another update to the patch. Import cleanup, javadoc update. Gave the test class a slightly better (but not imaginative) name.

          Shawn Heisey added a comment -

          Tiny additional patch update - fixed imports on parent class to eliminate warnings in eclipse.

          Shawn Heisey added a comment -

          Arun Rangarajan posted to solr-user that he was running into this problem on an upgrade from 3.6.2 to 4.2.1, so now we know that it worked properly in the 3.x versions.

          Bill Bell added a comment -

          We are also running into this issue. Not sure how it happens yet though.

          James Dyer added a comment -

          Here is a patch with test coverage & a fix. I can commit this after the weekend.

          Bill Bell added a comment -

          Can we get this into 4.4?

          Shalin Shekhar Mangar added a comment -

          Thanks James for the patch.

          I'm going to commit this to make sure that it gets into 4.4

          ASF subversion and git services added a comment -

          Commit 1500652 from shalin@apache.org
          [ https://svn.apache.org/r1500652 ]

          SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

          ASF subversion and git services added a comment -

          Commit 1500662 from shalin@apache.org
          [ https://svn.apache.org/r1500662 ]

          SOLR-4788: Multiple Entities DIH delta import: dataimporter.[entityName].last_index_time is empty

          Shalin Shekhar Mangar added a comment -

          Shawn Heisey - Can you please open a separate issue for the logging configuration?

          Steve Rowe added a comment -

          Bulk close resolved 4.4 issues


            People

            • Assignee: Shalin Shekhar Mangar
            • Reporter: chakming wong
            • Votes: 3
            • Watchers: 9
