Solr
  1. Solr
  2. SOLR-2533

Improve API of ValueSource & FunctionQuery SortField weighting

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: search
    • Labels:
      None

      Description

      Started from LUCENE-2883: Support for sorting by ValueSource and FunctionQueries is done through ValueSource#getSort and the ValueSourceSortField. In order to support VSs containing other Queries, its necessary to allow the Querys to be weighted by an IndexSearcher. Currently this is handled by having ValueSourceSortField implement SolrSortField. In Solr's SolrIndexSearcher, SortFields implementing SolrSortField are then weighted before the Sort is used.

      Sorting by FunctionQuery and ValueSource are invaluable and will become available to all Lucene users in LUCENE-2883. But in order to do so, we need to remove the coupling of this functionality to Solr, and make it more standard.

      Any and all thoughts about how to do this are appreciated.

      1. SOLR-2533.patch
        10 kB
        Chris Male
      2. SOLR-2533.patch
        10 kB
        Chris Male
      3. SOLR-2533.patch
        9 kB
        Chris Male
      4. SOLR-2533.patch
        8 kB
        Chris Male
      5. SOLR-2533.patch
        8 kB
        Chris Male

        Issue Links

          Activity

          Hide
          Michael McCandless added a comment -

          Can we add weighting to Lucene's SortField?

          Show
          Michael McCandless added a comment - Can we add weighting to Lucene's SortField?
          Hide
          Chris Male added a comment -

          How would we expose that, through Sort and just directly on SortField?

          Currently Solr iterates over the SortFields in a Sort and weights each of them if necessary. Weighting then creates new SortFields which are stored in a new Sort. So essentially the initial Sort is being used as a factory to create this new weighted Sort of weighted SortFields.

          So we could go down a line similar to that of Query/Weight/Scorer and have something like SortField which produce WeightedSortField which then create FieldComparators.

          Show
          Chris Male added a comment - How would we expose that, through Sort and just directly on SortField? Currently Solr iterates over the SortFields in a Sort and weights each of them if necessary. Weighting then creates new SortFields which are stored in a new Sort. So essentially the initial Sort is being used as a factory to create this new weighted Sort of weighted SortFields. So we could go down a line similar to that of Query/Weight/Scorer and have something like SortField which produce WeightedSortField which then create FieldComparators.
          Hide
          Chris Male added a comment -

          Or we could just go for something simpler I guess, and basically borrow the same idea and add weight(IndexSearcher) to Sort & SortField which then returns a new Sort if any SortField changes.

          Show
          Chris Male added a comment - Or we could just go for something simpler I guess, and basically borrow the same idea and add weight(IndexSearcher) to Sort & SortField which then returns a new Sort if any SortField changes.
          Hide
          Michael McCandless added a comment -

          I guess add Sort.weight? And we should strongly type it, I think? Ie it returns a WeightedSort? (So we don't have to do instanceof checks). But maybe then the SortField within the Sort don't need a new class...? Not sure.

          And IndexSearcher, before using the Sort, would weight it.

          Show
          Michael McCandless added a comment - I guess add Sort.weight? And we should strongly type it, I think? Ie it returns a WeightedSort? (So we don't have to do instanceof checks). But maybe then the SortField within the Sort don't need a new class...? Not sure. And IndexSearcher, before using the Sort, would weight it.
          Hide
          Yonik Seeley added a comment -

          The query.rewrite model isn't strongly typed and seems to work fine. The advantaged is that 99% of the time the SortField will remain unchanged (and the base-class implementation of "return this" will be used). That's much less of an API change too (rather than moving a whole bunch of Sort methods to WeightedSort).

          Show
          Yonik Seeley added a comment - The query.rewrite model isn't strongly typed and seems to work fine. The advantaged is that 99% of the time the SortField will remain unchanged (and the base-class implementation of "return this" will be used). That's much less of an API change too (rather than moving a whole bunch of Sort methods to WeightedSort).
          Hide
          Chris Male added a comment -

          I agree with Yonik's thoughts here, we really want to limit the effect on the overall API since currently its just for FunctionQuerys. I will create a patch along those lines.

          Show
          Chris Male added a comment - I agree with Yonik's thoughts here, we really want to limit the effect on the overall API since currently its just for FunctionQuerys. I will create a patch along those lines.
          Hide
          Chris Male added a comment -

          Patch adds weight(IndexSearcher) to both Sort & SortField.

          Sort.weight emulates the same algorithm as used to exist in SolrIndexSearcher#getSort, only returning a new Sort if a SortField changes.

          SortField.WEIGHTABLE type is added to SortField to prevent SortFields that need to be weighted being used before they have been.

          Reduces SolrIndexSearch#getSort down. It can probably be removed. Cleans out ValueSource of the SolrSortField idea and dummy comparator.

          Show
          Chris Male added a comment - Patch adds weight(IndexSearcher) to both Sort & SortField. Sort.weight emulates the same algorithm as used to exist in SolrIndexSearcher#getSort, only returning a new Sort if a SortField changes. SortField.WEIGHTABLE type is added to SortField to prevent SortFields that need to be weighted being used before they have been. Reduces SolrIndexSearch#getSort down. It can probably be removed. Cleans out ValueSource of the SolrSortField idea and dummy comparator.
          Hide
          Michael McCandless added a comment -

          OK, I agree: let's not make this strongly typed. So, this 'weighting' is more like Query.rewrite than Query.weight, right?

          Ie, if it needs to, it returns a new Query that "compiled in" something about the top searcher it was passed. If it doesn't need to, it returns itself?

          Patch looks great Chris! Except I wonder if we should rename "weight" to "rewrite", or something else... (compile? compileSearcher?)?

          Show
          Michael McCandless added a comment - OK, I agree: let's not make this strongly typed. So, this 'weighting' is more like Query.rewrite than Query.weight, right? Ie, if it needs to, it returns a new Query that "compiled in" something about the top searcher it was passed. If it doesn't need to, it returns itself? Patch looks great Chris! Except I wonder if we should rename "weight" to "rewrite", or something else... (compile? compileSearcher?)?
          Hide
          Chris Male added a comment -

          So, this 'weighting' is more like Query.rewrite than Query.weight, right?

          Except I wonder if we should rename "weight" to "rewrite"

          Yeah I totally agree. The fact that ValueSources use to weight inner Queries is actually an implementation detail and even then it does a rewrite too. I think for consistency with Query, we should go with rewrite. I'll regenerate the patch.

          Show
          Chris Male added a comment - So, this 'weighting' is more like Query.rewrite than Query.weight, right? Except I wonder if we should rename "weight" to "rewrite" Yeah I totally agree. The fact that ValueSources use to weight inner Queries is actually an implementation detail and even then it does a rewrite too. I think for consistency with Query, we should go with rewrite. I'll regenerate the patch.
          Hide
          Chris Male added a comment -

          Updated patch with the change to rewrite.

          Show
          Chris Male added a comment - Updated patch with the change to rewrite.
          Hide
          Michael McCandless added a comment -

          Looks great – I think it's ready to commit. Can you add a CHANGES?

          Show
          Michael McCandless added a comment - Looks great – I think it's ready to commit. Can you add a CHANGES?
          Hide
          Michael McCandless added a comment -

          I plan to commit this soon.

          Show
          Michael McCandless added a comment - I plan to commit this soon.
          Hide
          Chris Male added a comment -

          New Patch including CHANGES.txt changes.

          Show
          Chris Male added a comment - New Patch including CHANGES.txt changes.
          Hide
          Michael McCandless added a comment -

          Chris now that you are committer you should assign this to yourself and commit it

          Then we can keep going on LUCENE-2883!

          Show
          Michael McCandless added a comment - Chris now that you are committer you should assign this to yourself and commit it Then we can keep going on LUCENE-2883 !
          Hide
          Chris Male added a comment -

          Updated patch to trunk. I plan to commit shortly.

          Show
          Chris Male added a comment - Updated patch to trunk. I plan to commit shortly.
          Hide
          Simon Willnauer added a comment -

          there seems to be a problem with the CHANGES.txt changes in your patch. Intuitively I would say this needs to go into solr/CHANGES.TXT and it should not remove an existing entry no?

          Show
          Simon Willnauer added a comment - there seems to be a problem with the CHANGES.txt changes in your patch. Intuitively I would say this needs to go into solr/CHANGES.TXT and it should not remove an existing entry no?
          Hide
          Chris Male added a comment -

          Fixed bad merge of CHANGES.txt as pointed out by Simon.

          I'm going to leave the changes entry in both CHANGES.txt since overall its a different change to both Solr and Lucene, despite this being a Solr issue.

          Show
          Chris Male added a comment - Fixed bad merge of CHANGES.txt as pointed out by Simon. I'm going to leave the changes entry in both CHANGES.txt since overall its a different change to both Solr and Lucene, despite this being a Solr issue.
          Hide
          Chris Male added a comment -

          Committed revision 1137612

          Show
          Chris Male added a comment - Committed revision 1137612

            People

            • Assignee:
              Chris Male
              Reporter:
              Chris Male
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development