Lucene - Core / LUCENE-6268

Replace doc values filters with queries having approximations

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      We should use approximations in order to deal with queries/filters that have slow iterators such as doc-values based queries/filters.

      1. LUCENE-6268.patch
        135 kB
        Adrien Grand

        Activity

        Adrien Grand added a comment -

        Here is a patch. It replaces:

        • FieldValueFilter with FieldValueQuery
        • DocValuesRangeFilter and DocTermsOrdRangeFilter with DocValuesRangeQuery

        These new queries support two-phase iterators with an approximation which matches all documents between 0 and maxDoc-1.
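
To make the two-phase idea concrete, here is a minimal stand-alone sketch. It is not the code from the patch and uses no Lucene classes: `TwoPhaseSketch`, `Approximation`, `Confirmation`, and `rangeMatches` are hypothetical names, and the doc values are simulated with plain arrays. The approximation proposes every document from 0 to maxDoc-1, and a separate per-document check decides whether the candidate really matches.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: none of these types are Lucene classes. */
public class TwoPhaseSketch {

  /** First phase: cheaply proposes candidate doc IDs in increasing order. */
  public interface Approximation {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();
  }

  /** Second phase: the (potentially slow) per-document verification. */
  public interface Confirmation {
    boolean matches(int doc);
  }

  /** Approximation that matches all documents between 0 and maxDoc-1. */
  public static Approximation all(int maxDoc) {
    return new Approximation() {
      private int doc = -1;

      @Override
      public int nextDoc() {
        if (doc != NO_MORE_DOCS) {
          doc = (doc + 1 < maxDoc) ? doc + 1 : NO_MORE_DOCS;
        }
        return doc;
      }
    };
  }

  /**
   * Returns the docs whose simulated doc value lies in [min, max].
   * hasValue[doc] == false means the document has no value for the field.
   */
  public static List<Integer> rangeMatches(long[] values, boolean[] hasValue, long min, long max) {
    Approximation approximation = all(values.length);
    Confirmation confirmation =
        doc -> hasValue[doc] && values[doc] >= min && values[doc] <= max;

    List<Integer> hits = new ArrayList<>();
    for (int doc = approximation.nextDoc();
        doc != Approximation.NO_MORE_DOCS;
        doc = approximation.nextDoc()) {
      // Only candidates proposed by the approximation are verified.
      if (confirmation.matches(doc)) {
        hits.add(doc);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    long[] values = {5, 12, 7, 30};
    boolean[] hasValue = {true, true, false, true};
    System.out.println(rangeMatches(values, hasValue, 6, 20)); // prints [1]
  }
}
```

On its own, a match-all approximation saves nothing. The benefit would come when such a query is combined with a more selective clause: that clause can drive iteration, so the per-document check runs only on documents the other clause already accepts.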

        The new queries do not have the "docsWithField instanceof BitSet" optimization anymore since the 5.0 doc-values format does not use bit sets for any of its docsWithField implementations.

        On 5.x we could just deprecate these filters.

        Adrien Grand added a comment -

        Same file with the conventional name.

        Robert Muir added a comment -

        +1

        For the deprecations, can we still remove the code and implement the deprecated filters with QueryWrapperFilter(Query)?
        E.g., is it possible to do it like TermFilter, where we just do:

        @Deprecated
        public class FieldValueFilter extends QueryWrapperFilter
        

        This way we don't really have to maintain the code for these old ones.
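
The delegation pattern suggested here can be sketched roughly as follows. This is an illustration under stated assumptions, not the Lucene source: every type below is a simplified, hypothetical stand-in that only shares its name with the real class, and "documents with a value" is simulated with a boolean array.

```java
/** Illustration only: simplified stand-ins, not the real Lucene classes. */
public class DeprecatedWrapperSketch {

  /** Stand-in for a query: decides per document whether it matches. */
  public interface Query {
    boolean matches(int doc);
  }

  /** Stand-in for the new query: matches documents that have a value for the field. */
  public static final class FieldValueQuery implements Query {
    private final boolean[] docsWithField;

    public FieldValueQuery(boolean[] docsWithField) {
      this.docsWithField = docsWithField;
    }

    @Override
    public boolean matches(int doc) {
      return docsWithField[doc];
    }
  }

  /** Stand-in for a wrapper filter: all of its behavior is delegated to a query. */
  public static class QueryWrapperFilter {
    private final Query query;

    public QueryWrapperFilter(Query query) {
      this.query = query;
    }

    public Query getQuery() {
      return query;
    }

    public boolean accept(int doc) {
      return query.matches(doc);
    }
  }

  /** The deprecated class keeps its name but carries no matching logic of its own. */
  @Deprecated
  public static final class FieldValueFilter extends QueryWrapperFilter {
    public FieldValueFilter(boolean[] docsWithField) {
      super(new FieldValueQuery(docsWithField));
    }
  }
}
```

With this shape, the only code left to maintain in the deprecated class is its constructor. The trade-off is that the wrapper can expose only what the wrapped query exposes.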

        Adrien Grand added a comment -

        One thing I'm concerned about if we do that is that these filters would no longer expose random access, which could break some applications.

        Robert Muir added a comment -

        OK, I agree, let's just deprecate them as-is for now.

        On a follow-up issue, maybe we can allow a similar API to be exposed on Query/Weight/Scorer, so that BooleanQuery can do the optimizations FilteredQuery and BooleanFilter are doing (any optimizations that really help and don't hurt). If we did this, then I think we could remove the duplicate impls.

        ASF subversion and git services added a comment -

        Commit 1661156 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1661156 ]

        LUCENE-6268: Replace FieldValueFilter and DocValuesRangeFilter with equivalent queries that support approximations.

        ASF subversion and git services added a comment -

        Commit 1661167 from Adrien Grand in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1661167 ]

        LUCENE-6268: Replace FieldValueFilter and DocValuesRangeFilter with equivalent queries that support approximations.

        Timothy Potter added a comment -

        Bulk close after 5.1 release


          People

          • Assignee: Adrien Grand
          • Reporter: Adrien Grand
          • Votes: 0
          • Watchers: 3
