Lucene - Core
  1. Lucene - Core
  2. LUCENE-6702

[suggest] Make Context Query and Field extensible

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 5.3, 6.0
    • Component/s: core/other
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      ContextSuggestField indexes context information along with
      suggestions, which can be used to filter and/or boost suggestions using
      ContextQuery.
      It would be useful to make ContextSuggestField extensible such that subclasses
      can inject context values at index-time, without having to specify the
      contexts in its ctor.
      ContextQuery can be made extensible by allowing sub-classes to override
      how context automaton is created from provided query contexts.
      Currently, ContextQuery uses a context value of "*" to consider all context values,
      It makes sense to have a dedicated addAllContexts() instead.

      1. LUCENE-6702.patch
        20 kB
        Areek Zillur
      2. LUCENE-6702.patch
        20 kB
        Areek Zillur

        Activity

        Hide
        Areek Zillur added a comment -

        Initial patch:

        • add extension points to ContextQuery and ContextSuggestField
        • add dedicated addAllContexts() method to ContextQuery to account for all context values
        • simplify ContextQuery logic for extracting contexts
        Show
        Areek Zillur added a comment - Initial patch: add extension points to ContextQuery and ContextSuggestField add dedicated addAllContexts() method to ContextQuery to account for all context values simplify ContextQuery logic for extracting contexts
        Hide
        Michael McCandless added a comment -

        Thanks Areek Zillur, patch looks good ... can you give some example use cases
        to motivate opening things up via subclassing?

        I saw some minor things:

        +import java.util.*;

        Could you fix your IDE to do keep each import separate? Thanks.

        I think toString is now not-so-friendly vs before, since we cutover to
        IntsRef contexts key? (IntsRef.toString vs CharSequence.toString).

        Show
        Michael McCandless added a comment - Thanks Areek Zillur , patch looks good ... can you give some example use cases to motivate opening things up via subclassing? I saw some minor things: +import java.util.*; Could you fix your IDE to do keep each import separate? Thanks. I think toString is now not-so-friendly vs before, since we cutover to IntsRef contexts key? (IntsRef.toString vs CharSequence.toString).
        Hide
        Areek Zillur added a comment - - edited

        Thanks Michael McCandless for the review!

        can you give some example use cases
        to motivate opening things up via subclassing?

        An use-case for opening up ContextSuggestField#contexts() is to
        get context values from another field in a document. The
        field in question might not be present at context field creation time
        (e.g. due to the field not being added yet) but can be accessed from the
        document when it is being indexed. Opening up contexts() will allow
        sub-classes to steal the context values from the document at index-time.

        After more thought, I have reverted opening up ContextQuery#convertAutomaton
        via sub-classing, it has proven to be difficult to use. Originally, it was meant
        such that sub-classes can modify the context automaton without having to
        worry about all the implementation details.

        saw some minor things:
        +import java.util.*;
        Could you fix your IDE to do keep each import separate? Thanks.

        Fixed the imports.

        I think toString is now not-so-friendly vs before, since we cutover to
        IntsRef contexts key? (IntsRef.toString vs CharSequence.toString)

        Thanks for catching this, now it uses BytesRef.utf8ToString

        Updated Patch:

        • make toString friendly, fix up imports
        • revert opening up ContextQuery#convertAutomaton
        Show
        Areek Zillur added a comment - - edited Thanks Michael McCandless for the review! can you give some example use cases to motivate opening things up via subclassing? An use-case for opening up ContextSuggestField#contexts() is to get context values from another field in a document. The field in question might not be present at context field creation time (e.g. due to the field not being added yet) but can be accessed from the document when it is being indexed. Opening up contexts() will allow sub-classes to steal the context values from the document at index-time. After more thought, I have reverted opening up ContextQuery#convertAutomaton via sub-classing, it has proven to be difficult to use. Originally, it was meant such that sub-classes can modify the context automaton without having to worry about all the implementation details. saw some minor things: +import java.util.*; Could you fix your IDE to do keep each import separate? Thanks. Fixed the imports. I think toString is now not-so-friendly vs before, since we cutover to IntsRef contexts key? (IntsRef.toString vs CharSequence.toString) Thanks for catching this, now it uses BytesRef.utf8ToString Updated Patch: make toString friendly, fix up imports revert opening up ContextQuery#convertAutomaton
        Hide
        Michael McCandless added a comment -

        Thanks for the explanation Areek Zillur, it seems very expert, and thanks for the new patch ... +1 to commit.

        Show
        Michael McCandless added a comment - Thanks for the explanation Areek Zillur , it seems very expert, and thanks for the new patch ... +1 to commit.
        Hide
        ASF subversion and git services added a comment -

        Commit 1693493 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1693493 ]

        LUCENE-6702: Add a method to inject context values at index time in ContextSuggestField and simplify ContextQuery and add dedicated method to consider all contexts at query time

        Show
        ASF subversion and git services added a comment - Commit 1693493 from Areek Zillur in branch 'dev/trunk' [ https://svn.apache.org/r1693493 ] LUCENE-6702 : Add a method to inject context values at index time in ContextSuggestField and simplify ContextQuery and add dedicated method to consider all contexts at query time
        Hide
        ASF subversion and git services added a comment -

        Commit 1693494 from Areek Zillur in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1693494 ]

        LUCENE-6702: Add a method to inject context values at index time in ContextSuggestField and simplify ContextQuery and add dedicated method to consider all contexts at query time

        Show
        ASF subversion and git services added a comment - Commit 1693494 from Areek Zillur in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1693494 ] LUCENE-6702 : Add a method to inject context values at index time in ContextSuggestField and simplify ContextQuery and add dedicated method to consider all contexts at query time
        Hide
        Areek Zillur added a comment -

        Thanks Michael McCandless for the review!

        Show
        Areek Zillur added a comment - Thanks Michael McCandless for the review!
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

          People

          • Assignee:
            Areek Zillur
            Reporter:
            Areek Zillur
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development