Lucene - Core
  1. Lucene - Core
  2. LUCENE-5345

range facets don't work with float/double fields

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 5.0
    • Component/s: modules/facet
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      With LUCENE-5297 we generalized range faceting to accept a ValueSource.

      But, when I tried to use this to facet by distance (< 1 km, < 2 km, etc.), it's not working ... the problem is that the RangeAccumulator always uses .longVal and assumes this was a double encoded as a long (via DoubleField).

      1. LUCENE-5345.patch
        5 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Patch w/ simple test showing the bug ... I'm not sure how to fix yet.

        Show
        Michael McCandless added a comment - Patch w/ simple test showing the bug ... I'm not sure how to fix yet.
        Hide
        Shai Erera added a comment -

        I'm not sure either if the problem is in ValueSource which implements longVal() to return (long) doubleVal(doc), or that we need to check the actual range type in RangeAccumulator and somehow pull the right value and pass it to the right method (e.g. maybe adding Range.accept variant for int/float/double)... we could have these variants throw UOE (or return false) at the base Range class and have each Range override that one that it cares about. But then it means we need to duplicate the code which traverses the ranges to work with the right value...

        I'll think about it more.

        At any rate, I think the test should move to demo/ so that we don't introduce the dependency on expressions in the facet module.

        Show
        Shai Erera added a comment - I'm not sure either if the problem is in ValueSource which implements longVal() to return (long) doubleVal(doc) , or that we need to check the actual range type in RangeAccumulator and somehow pull the right value and pass it to the right method (e.g. maybe adding Range.accept variant for int/float/double)... we could have these variants throw UOE (or return false) at the base Range class and have each Range override that one that it cares about. But then it means we need to duplicate the code which traverses the ranges to work with the right value... I'll think about it more. At any rate, I think the test should move to demo/ so that we don't introduce the dependency on expressions in the facet module.
        Hide
        Shai Erera added a comment -

        Or, we could modify the test to use a custom ValueSource which mimics expression's behavior (not encoding the double as long, but just casting).

        Show
        Shai Erera added a comment - Or, we could modify the test to use a custom ValueSource which mimics expression's behavior (not encoding the double as long, but just casting).
        Hide
        Robert Muir added a comment -

        You dont even need a custom valuesource: they all behave this way (e.g. same as java.lang.Number): the bug is as mike describes.

        RangeFacetRequest always uses LongFieldSource, regardless of the type of Range the user passes in.

        Show
        Robert Muir added a comment - You dont even need a custom valuesource: they all behave this way (e.g. same as java.lang.Number): the bug is as mike describes. RangeFacetRequest always uses LongFieldSource, regardless of the type of Range the user passes in.
        Hide
        Shai Erera added a comment -

        I don't think that's the case (but I'm not near the code to double-check). LongFieldSource is only used in the ctor which takes a field name, and as documented - reads the values from the respective NDV field. Mike's test uses Expression and its ValueSource.

        I think a solution could be to separate RangeAccumulator into a NumericRangeAccumulator and ValueSourceRangeAccumulator. The former always reads from an NDV and takes whatever range. The latter could restrict to only taking DoubleRange, and always call vs.doubleVal, since that seems like the only API a ValueSource has to impl. Or, it takes any Range, but always calls vs.doubleVal and converts it to Long, and the Range can convert it back?

        Show
        Shai Erera added a comment - I don't think that's the case (but I'm not near the code to double-check). LongFieldSource is only used in the ctor which takes a field name, and as documented - reads the values from the respective NDV field. Mike's test uses Expression and its ValueSource. I think a solution could be to separate RangeAccumulator into a NumericRangeAccumulator and ValueSourceRangeAccumulator. The former always reads from an NDV and takes whatever range. The latter could restrict to only taking DoubleRange, and always call vs.doubleVal, since that seems like the only API a ValueSource has to impl. Or, it takes any Range, but always calls vs.doubleVal and converts it to Long, and the Range can convert it back?
        Hide
        ASF subversion and git services added a comment -

        Commit 1546809 from Michael McCandless in branch 'dev/branches/lucene5339'
        [ https://svn.apache.org/r1546809 ]

        LUCENE-5345: split RangeFacetCounts into Long/DoubleRangeFacetCounts

        Show
        ASF subversion and git services added a comment - Commit 1546809 from Michael McCandless in branch 'dev/branches/lucene5339' [ https://svn.apache.org/r1546809 ] LUCENE-5345 : split RangeFacetCounts into Long/DoubleRangeFacetCounts
        Hide
        Michael McCandless added a comment -

        On the LUCENE-5339 branch, I just made a change to split RangeFacetCounts into Long/DoubleRangeFacetCounts, and I removed FloatRange (leaving only Long/Double). When the app needs to count ranges against a previously indexed FloatDocValuesField, it can use FloatFieldSource. It seems to work well... the new test passes.

        Show
        Michael McCandless added a comment - On the LUCENE-5339 branch, I just made a change to split RangeFacetCounts into Long/DoubleRangeFacetCounts, and I removed FloatRange (leaving only Long/Double). When the app needs to count ranges against a previously indexed FloatDocValuesField, it can use FloatFieldSource. It seems to work well... the new test passes.
        Hide
        Shai Erera added a comment -

        Is it mandatory to have facet/ depend on expressions? Can't we do it under demo/ only, and use mock ValueSources for testing facet/?

        Show
        Shai Erera added a comment - Is it mandatory to have facet/ depend on expressions? Can't we do it under demo/ only, and use mock ValueSources for testing facet/?
        Hide
        ASF subversion and git services added a comment -

        Commit 1546812 from Michael McCandless in branch 'dev/branches/lucene5339'
        [ https://svn.apache.org/r1546812 ]

        LUCENE-5345: move the distance example to demo

        Show
        ASF subversion and git services added a comment - Commit 1546812 from Michael McCandless in branch 'dev/branches/lucene5339' [ https://svn.apache.org/r1546812 ] LUCENE-5345 : move the distance example to demo
        Hide
        Michael McCandless added a comment -

        Is it mandatory to have facet/ depend on expressions? Can't we do it under demo/ only, and use mock ValueSources for testing facet/?

        You're right, I just moved it to demo/ and left a "custom doubles values source" test in facet/

        Show
        Michael McCandless added a comment - Is it mandatory to have facet/ depend on expressions? Can't we do it under demo/ only, and use mock ValueSources for testing facet/? You're right, I just moved it to demo/ and left a "custom doubles values source" test in facet/
        Hide
        Michael McCandless added a comment -

        This was fixed with LUCENE-5339.

        Show
        Michael McCandless added a comment - This was fixed with LUCENE-5339 .
        Hide
        ASF subversion and git services added a comment -

        Commit 1556952 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1556952 ]

        LUCENE-5345: add new BlendedInfixSuggester

        Show
        ASF subversion and git services added a comment - Commit 1556952 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1556952 ] LUCENE-5345 : add new BlendedInfixSuggester
        Hide
        ASF subversion and git services added a comment -

        Commit 1556954 from Michael McCandless in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1556954 ]

        LUCENE-5345: add new BlendedInfixSuggester

        Show
        ASF subversion and git services added a comment - Commit 1556954 from Michael McCandless in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1556954 ] LUCENE-5345 : add new BlendedInfixSuggester
        Hide
        Michael McCandless added a comment -

        Woops, above commit was for LUCENE-5354 instead.

        Show
        Michael McCandless added a comment - Woops, above commit was for LUCENE-5354 instead.
        Hide
        Shai Erera added a comment -

        You can try fixing the log message, I wonder if it will associate it then with the right JIRA issue. I think I've done it once while working on NDV updates. See the accepted answer here: http://stackoverflow.com/questions/304383/how-do-i-edit-a-log-message-that-i-already-committed-in-subversion (with a snippet from the SVN documentation).

        Show
        Shai Erera added a comment - You can try fixing the log message, I wonder if it will associate it then with the right JIRA issue. I think I've done it once while working on NDV updates. See the accepted answer here: http://stackoverflow.com/questions/304383/how-do-i-edit-a-log-message-that-i-already-committed-in-subversion (with a snippet from the SVN documentation).
        Hide
        Michael McCandless added a comment -

        OK I just fixed it! Thanks for the tip Shai.

        Show
        Michael McCandless added a comment - OK I just fixed it! Thanks for the tip Shai.

          People

          • Assignee:
            Unassigned
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development