  1. Solr
  2. SOLR-9217

{!join score=..}.. should delay join to createWeight

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 6.1
    • Fix Version/s: 6.6, 7.0
    • Component/s: query parsers
    • Labels:

      Description

      ScoreJoinQParserPlugin.XxxCoreJoinQuery executes JoinUtil.createJoinQuery in rewrite(), which is inefficient with the filter(...) syntax or fq. I suppose it's a filter()-only problem, not an fq one. It's better to do that in createWeight(), as is done in the classic Solr JoinQuery in JoinQParserPlugin.
      The existing tests are enough; we just need to assert the rewrite behavior: it should rewrite when enclosing a range query or similar, and should not rewrite when enclosing a plain term query.

      1. SOLR_9217.patch
        3 kB
        Andrey Kudryavtsev
      2. SOLR-9217.patch
        4 kB
        Mikhail Khludnev
      3. SOLR-9217.patch
        3 kB
        gopikannan venugopalsamy

          Activity

          shanky_ty Shashank Tyagi added a comment - - edited

          Is this fixed? Where is a good place to start on this?

          mkhludnev Mikhail Khludnev added a comment -

          Hello,
          Shashank Tyagi, you are welcome, just look at https://wiki.apache.org/solr/HowToContribute .
          I've checked that the description is still valid and describes where to start well.

          sachinimalindi Sachini Malindi added a comment -

          Can I look at this issue?

          shanky_ty Shashank Tyagi added a comment -

          Sure, go ahead.

          gopikannan gopikannan venugopalsamy added a comment -

          Can I work on this?

          mkhludnev Mikhail Khludnev added a comment -

          absolutely

          gopikannan gopikannan venugopalsamy added a comment -

          Great

          Mikhail,
          I am unable to understand from the code what exactly is inefficient here. rewrite and createWeight are called one after another from createNormalizedWeight (IndexSearcher.java). Could you please shed some more light on the inefficiency?

          I debugged the queries below. The inefficiency arises when the join is part of a filter query (fq), right?

          http://localhost:8983/solr/techproducts/select?q=*:*&fq={!join%20from=id%20to=id%20score=max}:&fl=score,price

          http://localhost:8983/solr/techproducts/select?q={!join%20from=id%20to=id%20score=max}:&fl=score,price

          gopikannan gopikannan venugopalsamy added a comment -

          Mikhail Khludnev

          fromSearcher.search(fromQuery, collector) gets executed inside rewrite. Is this the inefficiency?

          [JoinUtil.java]
          private static Query createJoinQuery(boolean multipleValuesPerDocument, String toField, Query fromQuery,
              IndexSearcher fromSearcher, ScoreMode scoreMode, final GenericTermsCollector collector)
              throws IOException {

            fromSearcher.search(fromQuery, collector);

          mkhludnev Mikhail Khludnev added a comment - - edited

          gopikannan venugopalsamy, to get the idea, you can set showItems=100 on the filterCache (FastLRUCache or LFUCache) and then execute a score join and a non-score join under q=filter(). You will then see in the cache stats that score-join entries use lists of 'to' terms as cache entry keys. You can also check this with a debugger.
          Also, have a look at org.apache.solr.query.FilterQuery.

          gopikannan gopikannan venugopalsamy added a comment -

          Mikhail Khludnev, thanks for the explanation. I was trying to assert the behavior of a join with a range query in filter(), but it fails during parsing. The same join query works without filter(). Is this known?

          http://localhost:8983/solr/techproducts/select?q=filter({!join%20from=id%20to=id}id:[1%20TO%205])

          org.apache.solr.search.SyntaxError: Cannot parse 'id:[1': Encountered "<EOF>" at line 1, column 5. Was expecting one of: "TO" ... <RANGE_QUOTED> ... <RANGE_GOOP> ...

          This works:
          http://localhost:8983/solr/techproducts/select?q={!join%20from=id%20to=id}id:[1%20TO%205]

          mkhludnev Mikhail Khludnev added a comment - - edited

          gopikannan venugopalsamy, you can reproduce it with q=filter({!join from=id to=id score=none}id:G*). That particular range query with a space can be fixed with v=$nested. Yep, the query syntax lacks predictability.
          If you set <filterCache showItems="100"... and execute that filter({!join score=...}), you'll see an item_TermsQuery{field=id}:org.apache.solr.search.SortedIntDocSet@75fee8b0 entry in filterCache. This is a problem since TermsQuery enumerates id values and might consume a lot of heap. You can see that this is not the case if you run the classic join query without the score local param.
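
The effect described above can be sketched with a toy model. This is not Lucene or Solr code: ToyQuery, ToyTermsQuery, EagerJoinQuery, LazyJoinQuery and keyFootprint are hypothetical names made up for illustration, and the only assumption taken from this thread is that the cache ends up keyed on the rewritten query. The eager variant runs the join in rewrite(), so the rewritten query carries every joined term; the lazy variant returns itself from rewrite() and runs the join only in createWeight().

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types exist in Lucene or Solr. */
public class JoinRewriteSketch {

    /** Hypothetical query: rewrite() yields the object a filter cache would key on. */
    interface ToyQuery {
        ToyQuery rewrite(List<String> fromIds);
        List<String> createWeight(List<String> fromIds); // terms used for matching
        int keyFootprint();                              // strings held by this query object
    }

    /** Expanded form that holds every joined term, like the TermsQuery entry seen in filterCache. */
    static final class ToyTermsQuery implements ToyQuery {
        final List<String> terms;
        ToyTermsQuery(List<String> terms) { this.terms = terms; }
        public ToyQuery rewrite(List<String> fromIds) { return this; }
        public List<String> createWeight(List<String> fromIds) { return terms; }
        public int keyFootprint() { return terms.size(); }
    }

    /** Eager variant: the join runs inside rewrite(), so the rewritten query carries all terms. */
    static final class EagerJoinQuery implements ToyQuery {
        final String prefix;
        EagerJoinQuery(String prefix) { this.prefix = prefix; }
        public ToyQuery rewrite(List<String> fromIds) { return new ToyTermsQuery(collect(prefix, fromIds)); }
        public List<String> createWeight(List<String> fromIds) { return rewrite(fromIds).createWeight(fromIds); }
        public int keyFootprint() { return 1; }
    }

    /** Lazy variant: rewrite() returns this; the join runs only in createWeight(). */
    static final class LazyJoinQuery implements ToyQuery {
        final String prefix;
        LazyJoinQuery(String prefix) { this.prefix = prefix; }
        public ToyQuery rewrite(List<String> fromIds) { return this; }
        public List<String> createWeight(List<String> fromIds) { return collect(prefix, fromIds); }
        public int keyFootprint() { return 1; }
    }

    /** Stand-in for the from-side search: collects ids that start with the prefix. */
    static List<String> collect(String prefix, List<String> fromIds) {
        List<String> out = new ArrayList<>();
        for (String id : fromIds) {
            if (id.startsWith(prefix)) out.add(id);
        }
        return out;
    }

    /** Size of the object a cache would hold as its key after rewrite. */
    static int cacheKeyFootprint(ToyQuery q, List<String> fromIds) {
        return q.rewrite(fromIds).keyFootprint();
    }

    /** Builds ids G0, G1, ... for the demo. */
    static List<String> sampleIds(int n) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) ids.add("G" + i);
        return ids;
    }

    public static void main(String[] args) {
        List<String> ids = sampleIds(1000);
        System.out.println("eager key footprint: " + cacheKeyFootprint(new EagerJoinQuery("G"), ids));
        System.out.println("lazy key footprint:  " + cacheKeyFootprint(new LazyJoinQuery("G"), ids));
    }
}
```

Both variants produce the same terms from createWeight(); only the size of the object left behind as a cache key differs, which is the point of moving the join out of rewrite().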

          werder Andrey Kudryavtsev added a comment - - edited

          gopikannan venugopalsamy, I think that the idea behind this Jira is to do something like SOLR_9217.patch

          Unfortunately, I'm not good at software development, so you will have to check this patch by yourself -(

          gopikannan gopikannan venugopalsamy added a comment - - edited

          Mikhail Khludnev Thanks again, I understood the inefficiency.

          Andrey Kudryavtsev your patch works. Thanks.

          Attached patch [SOLR-9217.patch]

          mkhludnev Mikhail Khludnev added a comment -

          Patch looks good. Let it go shortly.

          jira-bot ASF subversion and git services added a comment -

          Commit a07ac63357c3ecd817e85a5f392a558709998d05 in lucene-solr's branch refs/heads/master from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a07ac63 ]

          SOLR-9217: delay JoinUtil call to createWeight for score join

          jira-bot ASF subversion and git services added a comment -

          Commit c215c780cd0462a9f5e4ea0a5dc80d44234e8149 in lucene-solr's branch refs/heads/branch_6x from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c215c78 ]

          SOLR-9217: delay JoinUtil call to createWeight for score join


            People

            • Assignee:
              Unassigned
              Reporter:
              mkhludnev Mikhail Khludnev
            • Votes:
              1
              Watchers:
              7
