Solr
  1. Solr
  2. SOLR-1520

QueryElevationComponent does not work when unique key is of type 'sint'

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: Rules, search
    • Labels:
      None
    • Environment:

      Gentoo Linux, Apache Tomcat 6.0.20, Java 1.6.0_15-b03

      Description

      QueryElevationComponent does not work when unique key of the documents is of type 'sint'. I did not try any other combination, but when I looking at the source in QueryElevationComponent.java I doubt it will work with any other type than string.

      I propose to either make it work with other data types or at least to post a warning.

      1. SOLR-1520.patch
        2 kB
        Grant Ingersoll
      2. SOLR-1520.patch
        25 kB
        Grant Ingersoll

        Issue Links

          Activity

          Uwe Schindler made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Grant Ingersoll made changes -
          Status Reopened [ 4 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Grant Ingersoll made changes -
          Comment [ Not sure it is going to be doable, given how many things use that solrconfig (for instance, MinimalSchemaTest which has no unique key, it appears).

          ]
          Hide
          Yonik Seeley added a comment -

          I want to leverage the existing Solr distributed tests for SOLR-2949 (instead of writing another distributed test that requires spin up/tear down of shards and thus slows down the tests significantly)

          +1 to that part.

          The "test if a trie field works" doesn't need to be distributed and can go somewhere else.

          Show
          Yonik Seeley added a comment - I want to leverage the existing Solr distributed tests for SOLR-2949 (instead of writing another distributed test that requires spin up/tear down of shards and thus slows down the tests significantly) +1 to that part. The "test if a trie field works" doesn't need to be distributed and can go somewhere else.
          Hide
          Grant Ingersoll added a comment -

          Thanks, Hoss. Forgot to run the full suite of tests. The main reason I put it in the main config is that I wanted a trie field unique key and also b/c I want to leverage the existing Solr distributed tests for SOLR-2949 (instead of writing another distributed test that requires spin up/tear down of shards and thus slows down the tests significantly)

          Show
          Grant Ingersoll added a comment - Thanks, Hoss. Forgot to run the full suite of tests. The main reason I put it in the main config is that I wanted a trie field unique key and also b/c I want to leverage the existing Solr distributed tests for SOLR-2949 (instead of writing another distributed test that requires spin up/tear down of shards and thus slows down the tests significantly)
          Hide
          Hoss Man added a comment -

          Grant: i had to revert r1214937 as it was breaking several tests. example...

          java.lang.RuntimeException: org.apache.solr.common.SolrException
          	at org.apache.solr.util.TestHarness.<init>(TestHarness.java:152)
          	at org.apache.solr.util.TestHarness.<init>(TestHarness.java:135)
          	at org.apache.solr.util.TestHarness.<init>(TestHarness.java:125)
          	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:245)
          	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:103)
          	at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:92)
          	at org.apache.solr.MinimalSchemaTest.beforeClass(MinimalSchemaTest.java:40)
          Caused by: org.apache.solr.common.SolrException
          	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:619)
          	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:504)
          	at org.apache.solr.util.TestHarness$Initializer.initialize(TestHarness.java:216)
          	at org.apache.solr.util.TestHarness.<init>(TestHarness.java:140)
          Caused by: org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField.
          	at org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:161)
          	at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:542)
          	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:614)
          

          As i mentioned in email...

          Grant: at quick skim suggests that your change to 
          solr/core/src/test-files/solr/conf/solrconfig.xml in r1214937 broke 
          several tests that use that config w/o a uniqueKey field.
          
          At first glance, it's not even clear why you added that to the (heavily 
          overused) solrconfig.xml anyway, since there is already a 
          solrconfig-elevate.xml that is also used already.  (if the problem is 
          wanting to use different based types in diff test methods, you could just 
          instantiate multiple instances of QueryElevationComponent in 
          solrconfig-elevate.xml and refer to them by name.
          
          Show
          Hoss Man added a comment - Grant: i had to revert r1214937 as it was breaking several tests. example... java.lang.RuntimeException: org.apache.solr.common.SolrException at org.apache.solr.util.TestHarness.<init>(TestHarness.java:152) at org.apache.solr.util.TestHarness.<init>(TestHarness.java:135) at org.apache.solr.util.TestHarness.<init>(TestHarness.java:125) at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:245) at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:103) at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:92) at org.apache.solr.MinimalSchemaTest.beforeClass(MinimalSchemaTest.java:40) Caused by: org.apache.solr.common.SolrException at org.apache.solr.core.SolrCore.<init>(SolrCore.java:619) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:504) at org.apache.solr.util.TestHarness$Initializer.initialize(TestHarness.java:216) at org.apache.solr.util.TestHarness.<init>(TestHarness.java:140) Caused by: org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField. at org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:161) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:542) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:614) As i mentioned in email... Grant: at quick skim suggests that your change to solr/core/src/test-files/solr/conf/solrconfig.xml in r1214937 broke several tests that use that config w/o a uniqueKey field. At first glance, it's not even clear why you added that to the (heavily overused) solrconfig.xml anyway, since there is already a solrconfig-elevate.xml that is also used already. (if the problem is wanting to use different based types in diff test methods, you could just instantiate multiple instances of QueryElevationComponent in solrconfig-elevate.xml and refer to them by name.
          Hide
          Grant Ingersoll added a comment -

          The only issue is TrieField returns true for isTokenized() which causes the current approach to fail, but I guess we can just drop the isTokenized() requirement since it's somewhat undefined anyway what an multi-token id field means anyway. So, I'm just going to drop it and commit it w/ the only requirement being that the unique key exists.

          Show
          Grant Ingersoll added a comment - The only issue is TrieField returns true for isTokenized() which causes the current approach to fail, but I guess we can just drop the isTokenized() requirement since it's somewhat undefined anyway what an multi-token id field means anyway. So, I'm just going to drop it and commit it w/ the only requirement being that the unique key exists.
          Grant Ingersoll made changes -
          Resolution Fixed [ 1 ]
          Status Resolved [ 5 ] Reopened [ 4 ]
          Hide
          Grant Ingersoll added a comment -

          This really should work with TrieFields too.

          Show
          Grant Ingersoll added a comment - This really should work with TrieFields too.
          Grant Ingersoll made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 3.6 [ 12319065 ]
          Fix Version/s 4.0 [ 12314992 ]
          Resolution Fixed [ 1 ]
          Hide
          Grant Ingersoll added a comment -

          Committed to trunk and branch

          Show
          Grant Ingersoll added a comment - Committed to trunk and branch
          Grant Ingersoll committed 1213826 (1 file)
          Grant Ingersoll made changes -
          Component/s Rules [ 12317000 ]
          Grant Ingersoll made changes -
          Attachment SOLR-1520.patch [ 12506538 ]
          Hide
          Grant Ingersoll added a comment -

          Adds in tests for this against schema11.xml. Should be close to ready to go. For now, it requires a non-tokenized field for the id.

          Show
          Grant Ingersoll added a comment - Adds in tests for this against schema11.xml. Should be close to ready to go. For now, it requires a non-tokenized field for the id.
          Grant Ingersoll made changes -
          Link This issue relates to SOLR-2580 [ SOLR-2580 ]
          Hide
          Grant Ingersoll added a comment -

          IRC log:

          gsingers: I need to test solr w/ multiple unique keys
          [16:48] gsingers: of different field types
          [16:48] gsingers: https://issues.apache.org/jira/browse/SOLR-1520
          [16:48] gsingers: any suggestions?
          [16:49] gsingers: seems like the Solr test case is pretty much one schema per
          [16:49] gsingers: test class
          [17:18] ThetaPh1: gsingers: is getStringIndex otherwise used in this, as Hoss mentioned?
          [17:18] ThetaPh1: because lese it should work also with numeric fields and also sint, tint,...
          [17:21] gsingers: ThetaPh1: I don't think getStringIndex is used anymore
          [17:22] hossman: grant: there's a new JUnit4-esque base class that makes it easier to use differnet configs in each test method ... you just call initCore(...) as needed in each test method
          [17:22] ThetaPh1: or in trunk the replacement...
          [17:22] yonik__: getStringIndex is gone in trunk
          [17:22] gsingers: it does still do FieldCache stuff
          [17:22] ThetaPh1: yonik__: but the bytesref one?
          [17:22] gsingers: but it is just BytesRef
          [17:22] ThetaPh1: that is what i meant
          [17:22] yonik__: gsingers: for schemas, just use schema11.xml - it has a float (actually sfloat) unique key
          [17:23] gsingers: yonik: OK, I can do that. This should work for any field that is not tokenized, right?
          [17:23] gsingers: as long as it is indexed
          [17:23] yonik__: other code paths test the transformation from indexed to readable and back... it's enough to test a single key that has a readable that is not identical to indexed
          [17:24] yonik__: strictly speaking, tokenized should be fine too
          [17:24] yonik__: or can be fine I should say
          [17:24] gsingers: but what if you get multiple docs back?
          [17:24] markrmiller left the chat room. (Quit: Commit this!)
          [17:24] gsingers: I was debating that
          [17:24] hossman: (FWIW: test like this are also possible with AbstractSolrTestCase – you just change the value getSchemaFile() returns before calling super.setUp(), but it's more explicit in the new base class)
          [17:25] gsingers: maybe it simply means elevate all those files
          [17:25] yonik__: it's just like sorting
          [17:25] gsingers: but, it is the unique key
          [17:25] yonik__: if you're using the field cache, you won't get more than one doc per value
          [17:25] gsingers: so, there shouldn't be more than one doc anyway
          [17:25] yonik__: right
          [17:25] yonik__: it actually could be tokenized, but should only result in one token
          [17:26] gsingers: true, but that's a bit harder to enforce
          [17:26] ThetaPh1: even when using getstringindex, trunk and 3.x should no longer throw exceptrion
          [17:26] ThetaPh1: 3.0 does, but this was incorrectly fixed
          [17:26] gsingers: and seems like it could cause weird errors for people
          [17:26] yonik__: gsingers: we don't enforce it for the unique key
          [17:26] gsingers: OK
          [17:26] yonik__: I don't believe
          [17:26] ThetaPh1: and if you use tint as unique key, it should also work with stringindex ahm termsindex
          [17:26] yonik__: anyway, if it's enforced anywhere, that would be the place
          [17:26] gsingers: I'm working on a diff. issue, anyone object to me c&p this onto the issue?
          [17:26] gsingers: so that I don't forget it
          [17:27] yonik__: ?

          Show
          Grant Ingersoll added a comment - IRC log: gsingers: I need to test solr w/ multiple unique keys [16:48] gsingers: of different field types [16:48] gsingers: https://issues.apache.org/jira/browse/SOLR-1520 [16:48] gsingers: any suggestions? [16:49] gsingers: seems like the Solr test case is pretty much one schema per [16:49] gsingers: test class [17:18] ThetaPh1: gsingers: is getStringIndex otherwise used in this, as Hoss mentioned? [17:18] ThetaPh1: because lese it should work also with numeric fields and also sint, tint,... [17:21] gsingers: ThetaPh1: I don't think getStringIndex is used anymore [17:22] hossman: grant: there's a new JUnit4-esque base class that makes it easier to use differnet configs in each test method ... you just call initCore(...) as needed in each test method [17:22] ThetaPh1: or in trunk the replacement... [17:22] yonik__: getStringIndex is gone in trunk [17:22] gsingers: it does still do FieldCache stuff [17:22] ThetaPh1: yonik__: but the bytesref one? [17:22] gsingers: but it is just BytesRef [17:22] ThetaPh1: that is what i meant [17:22] yonik__: gsingers: for schemas, just use schema11.xml - it has a float (actually sfloat) unique key [17:23] gsingers: yonik: OK, I can do that. This should work for any field that is not tokenized, right? [17:23] gsingers: as long as it is indexed [17:23] yonik__: other code paths test the transformation from indexed to readable and back... it's enough to test a single key that has a readable that is not identical to indexed [17:24] yonik__: strictly speaking, tokenized should be fine too [17:24] yonik__: or can be fine I should say [17:24] gsingers: but what if you get multiple docs back? [17:24] markrmiller left the chat room. (Quit: Commit this!) [17:24] gsingers: I was debating that [17:24] hossman: (FWIW: test like this are also possible with AbstractSolrTestCase – you just change the value getSchemaFile() returns before calling super.setUp(), but it's more explicit in the new base class) [17:25] gsingers: maybe it simply means elevate all those files [17:25] yonik__: it's just like sorting [17:25] gsingers: but, it is the unique key [17:25] yonik__: if you're using the field cache, you won't get more than one doc per value [17:25] gsingers: so, there shouldn't be more than one doc anyway [17:25] yonik__: right [17:25] yonik__: it actually could be tokenized, but should only result in one token [17:26] gsingers: true, but that's a bit harder to enforce [17:26] ThetaPh1: even when using getstringindex, trunk and 3.x should no longer throw exceptrion [17:26] ThetaPh1: 3.0 does, but this was incorrectly fixed [17:26] gsingers: and seems like it could cause weird errors for people [17:26] yonik__: gsingers: we don't enforce it for the unique key [17:26] gsingers: OK [17:26] yonik__: I don't believe [17:26] ThetaPh1: and if you use tint as unique key, it should also work with stringindex ahm termsindex [17:26] yonik__: anyway, if it's enforced anywhere, that would be the place [17:26] gsingers: I'm working on a diff. issue, anyone object to me c&p this onto the issue? [17:26] gsingers: so that I don't forget it [17:27] yonik__: ?
          Grant Ingersoll made changes -
          Attachment SOLR-1520.patch [ 12448453 ]
          Hide
          Grant Ingersoll added a comment -

          Here's a start on this, but I'm at a loss as to the best way to test it works b/c I'm not up on how to deal with needing multiple schema files (b/c I need to change unique keys)

          Show
          Grant Ingersoll added a comment - Here's a start on this, but I'm at a loss as to the best way to test it works b/c I'm not up on how to deal with needing multiple schema files (b/c I need to change unique keys)
          Hide
          Grant Ingersoll added a comment -

          Seems like this should work for any non-analyzed field that is the id field, right?

          Show
          Grant Ingersoll added a comment - Seems like this should work for any non-analyzed field that is the id field, right?
          Grant Ingersoll made changes -
          Assignee Grant Ingersoll [ gsingers ]
          Hoss Man made changes -
          Field Original Value New Value
          Link This issue is related to SOLR-1558 [ SOLR-1558 ]
          Hide
          Hoss Man added a comment -

          At first glance, this seems like it could be fixed by changing the two "new TermQuery" lines to use FIeldType.readableToIndexed() ... but there's also some funky stuff going on in ElevationComparatorSource that makes me think it's probably not that easy ... particularly the use of getStringIndex which suggests that the Trie Fields definitely won't work.

          Anyone have any suggestions on a true fix for this, or at least a way to make it fail cleanly with a good error message about the uniqueKey field type expectations that it has?

          Show
          Hoss Man added a comment - At first glance, this seems like it could be fixed by changing the two "new TermQuery" lines to use FIeldType.readableToIndexed() ... but there's also some funky stuff going on in ElevationComparatorSource that makes me think it's probably not that easy ... particularly the use of getStringIndex which suggests that the Trie Fields definitely won't work. Anyone have any suggestions on a true fix for this, or at least a way to make it fail cleanly with a good error message about the uniqueKey field type expectations that it has?
          Simon Lachinger created issue -

            People

            • Assignee:
              Grant Ingersoll
              Reporter:
              Simon Lachinger
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 1h
                1h
                Remaining:
                Remaining Estimate - 1h
                1h
                Logged:
                Time Spent - Not Specified
                Not Specified

                  Development