SOLR-8395

query-time join (with scoring) for single value numeric fields::6.x ONLY

    Details

      Description

      Since LUCENE-5868, we have an opportunity to improve SOLR-6234 so that it can join on int and long fields. I suppose it's worth adding a "simple" test to the Solr NoScore suite.

      • Alongside that, we can set the multipleValues parameter according to the fromField cardinality declared in the schema.
      1. SOLR-8395-6x.patch (16 kB, Mikhail Khludnev)
      2. SOLR-8395.patch (2 kB, Cao Manh Dat)
      3. SOLR-8395.patch (1 kB, Mikhail Khludnev)
      4. SOLR-8395.patch (6 kB, Cao Manh Dat)
      5. SOLR-8395.patch (15 kB, Cao Manh Dat)
      6. SOLR-8395.patch (21 kB, Mikhail Khludnev)

          Activity

          caomanhdat Cao Manh Dat added a comment -

          Trivial patch for this issue.

          mkhludnev Mikhail Khludnev added a comment -

          This nearly shocked me. The first patch, with multivalued fields ("uid_ls_dv", "rel_from_ls_dv"), works out of the box even without LUCENE-5868!
          The answer is in TrieField.createFields(): for multivalued docValues numerics, Solr creates SortedSetDVs encoded as numbers, and that works fine as-is. See also SOLR-7878.
          Thus, the only way to break the test is to use single-valued docValues fields. That's what I did in SOLR-8395.patch. Now it fails with:

          java.lang.IllegalStateException: unexpected docvalues type NUMERIC for field 'rel_to_l_dv' (expected one of [SORTED, SORTED_SET]). Use UninvertingReader or index with docvalues.
          ..
          	at org.apache.lucene.index.DocValues.checkField(DocValues.java:208)
          	at org.apache.lucene.index.DocValues.getSortedSet(DocValues.java:306)
          	at org.apache.lucene.search.join.DocValuesTermsCollector.lambda$1(DocValuesTermsCollector.java:59)
          	at ..
          	at org.apache.lucene.search.join.JoinUtil.createJoinQuery(JoinUtil.java:146)
          ..
          org.apache.solr.search.join.TestScoreJoinQPNoScore.testJoinNumeric(TestScoreJoinQPNoScore.java:71)
          

          If you are going to work on it, please make sure both ints and longs are covered. I see one more trick in TrieField.createFields().
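To make the failure above concrete, here is a minimal sketch of the kind of type check that rejects the single-valued field. It is a simplified stand-in written for this thread, not Lucene's actual DocValues.checkField code: the class name, the DvType enum, and the FIELD_TYPES map are all assumptions, and only the field names and the message text come from the stack trace above.

```java
import java.util.EnumSet;
import java.util.Map;

// Simplified stand-in for a docValues type check; NOT Lucene's real implementation.
class DocValuesTypeCheckSketch {
    // Hypothetical subset of docValues types, enough to show the mismatch.
    enum DvType { NUMERIC, SORTED, SORTED_SET }

    // Assumed mapping, following the comment above: multivalued numerics are
    // written as SORTED_SET, single-valued numerics as NUMERIC.
    static final Map<String, DvType> FIELD_TYPES = Map.of(
        "rel_from_ls_dv", DvType.SORTED_SET,
        "rel_to_l_dv", DvType.NUMERIC);

    // Returns the field's type, or throws if it is not one of the expected types.
    static DvType checkField(String field, EnumSet<DvType> expected) {
        DvType actual = FIELD_TYPES.get(field);
        if (actual == null || !expected.contains(actual)) {
            throw new IllegalStateException("unexpected docvalues type " + actual
                + " for field '" + field + "' (expected one of " + expected + ")");
        }
        return actual;
    }

    public static void main(String[] args) {
        // A terms-based join collector only accepts term-like docValues.
        EnumSet<DvType> termLike = EnumSet.of(DvType.SORTED, DvType.SORTED_SET);
        System.out.println(checkField("rel_from_ls_dv", termLike));
        try {
            checkField("rel_to_l_dv", termLike);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The multivalued field passes because its docValues are already term-like, while the single-valued one needs a numeric-aware join path.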

          caomanhdat Cao Manh Dat added a comment - - edited

          I think it's ready.
          Mikhail Khludnev, did I miss or misunderstand something?

          caomanhdat Cao Manh Dat added a comment -

          Thanks for pointing it out to me.

          mkhludnev Mikhail Khludnev added a comment -

          I skimmed through SOLR-8395.patch

          • ScoreJoinQParserPlugin.OtherCoreJoinQuery.rewrite(IndexReader)
            ignores numericType, so please extract the code that calls one of the JoinUtil.createJoinQuery() methods into a method in SameCoreJoinQuery. Or even introduce a strategy in ScoreJoinQParserPlugin that dispatches between these two factory methods.
          • Once that is done, would you mind adding test coverage to TestCrossCoreJoin? I'm asking because joining across cores on numeric fields is in frequent demand.
          • Also, you added a perfect assert for matching numeric types; can you cover it with negative assertions using assertQEx()?
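The strategy idea in the first bullet, together with the numeric-type assert from the last one, could look roughly like the sketch below. Everything here is hypothetical and is not the patch's code: strings stand in for Lucene Query objects, and the NumericType enum, the JoinQueryFactory interface, and the method and field names are invented for illustration.

```java
// Illustrative sketch only: strings stand in for Lucene Query objects,
// and every name below is hypothetical rather than taken from the patch.
class JoinStrategySketch {
    // Stand-in for a field's numeric type; null means a non-numeric (terms) field.
    enum NumericType { INT, LONG }

    // Strategy interface: one implementation per factory method being dispatched to.
    interface JoinQueryFactory {
        String create(String fromField, String toField);
    }

    // Chooses the factory in one place so that no caller can ignore the numeric type.
    static JoinQueryFactory strategyFor(NumericType fromType, NumericType toType) {
        if (fromType != toType) {
            throw new IllegalArgumentException(
                "from and to fields must have the same numeric type, got "
                    + fromType + " and " + toType);
        }
        if (fromType == null) {
            return (from, to) -> "termsJoin(" + from + " -> " + to + ")";
        }
        return (from, to) -> "numericJoin[" + fromType + "](" + from + " -> " + to + ")";
    }

    // The same-core and cross-core paths share the dispatch instead of duplicating it.
    static String sameCoreJoin(String from, String to,
                               NumericType fromType, NumericType toType) {
        return strategyFor(fromType, toType).create(from, to);
    }

    static String otherCoreJoin(String fromCore, String from, String to,
                                NumericType fromType, NumericType toType) {
        return fromCore + ":" + strategyFor(fromType, toType).create(from, to);
    }

    public static void main(String[] args) {
        System.out.println(sameCoreJoin("from_l_dv", "to_l_dv",
            NumericType.LONG, NumericType.LONG));
        System.out.println(otherCoreJoin("fromCore", "from_l_dv", "to_l_dv",
            NumericType.LONG, NumericType.LONG));
    }
}
```

A negative test in the spirit of assertQEx() would then pass mismatched types (for example INT against LONG) and expect the exception.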
          caomanhdat Cao Manh Dat added a comment -

          Mikhail Khludnev, thank you for showing me this important point. I updated the OtherCoreJoinQuery class.

          mkhludnev Mikhail Khludnev added a comment -

          Back to work. Meanwhile, LUCENE-7418 nuked legacy numerics from JoinUtil. This patch pulls them back into ScoreJoinQParserPlugin; however, it requires exposing some join internals as public.
          I understand that having points in Solr is better, but is there anything that prevents us from allowing such an approach?

          mkhludnev Mikhail Khludnev added a comment - - edited

          Vadim Ivanov, since you are experimenting with the patches in SOLR-4787, can you check this one too? I wonder whether it can help in your case; let me know if you need it for a specific version.

          mkhludnev Mikhail Khludnev added a comment -

          I'm going to commit SOLR-8395-6x.patch to branch_6x only. It won't go to master (7.0); it's too much to migrate unless we have a justification. Opinions?

          jira-bot ASF subversion and git services added a comment -

          Commit eb10b2c2668819d1f803ee358595487a6989a640 in lucene-solr's branch refs/heads/branch_6x from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=eb10b2c ]

          SOLR-8395: join single value numerics.

          jira-bot ASF subversion and git services added a comment -

          Commit 46301f2fa2b67e9411de19b19453928c1dc4baf8 in lucene-solr's branch refs/heads/master from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=46301f2 ]

          SOLR-8395: add disclaimer into 7.0 migration - it won't work there.

          shalinmangar Shalin Shekhar Mangar added a comment -

          Closing after 6.3.0 release.


            People

            • Assignee: Mikhail Khludnev
            • Reporter: Mikhail Khludnev