Solr
  1. Solr
  2. SOLR-1871

Function Query "map" variant that allows "target" to be an arbitrary ValueSource

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 4.7, 6.0
    • Component/s: search
    • Labels:
      None

      Description

      Currently, as documented at http://wiki.apache.org/solr/FunctionQuery#map, the "target" of a map must be a floating point constant. I propose that you should have at least the option of doing a map where the target is an arbitrary ValueSource.

      The particular use case that inspired this is that I want to be able to control how missing date fields affected boosting. In particular, I want to be able to use something like this in my function queries:

      map(mydatefield,0,0,ms(NOW))
      

      But this might have other uses.

      I'll attach an initial implementation.

      1. ASF.LICENSE.NOT.GRANTED--SOLR-1871.patch
        5 kB
        Chris Harris
      2. SOLR-1871.patch
        8 kB
        Shalin Shekhar Mangar
      3. SOLR-1871.patch
        7 kB
        Shalin Shekhar Mangar
      4. SOLR-1871.patch
        6 kB
        Chris Harris
      5. SOLR-1871.patch
        6 kB
        Chris Harris

        Activity

        Hide
        Chris Harris added a comment -

        A first stab at this. In this implementation, we use copy and paste and create a wholly separate function, "mapf" ("f" because this map is more function-oriented).

        mapf is not as efficient as normal map when the target is a constant. If the target is a constant, you now have something like a LongConstValueSource where you used to have just a float. I haven't tried to measure the performance difference.

        RangeMapfFloatFunction.hashCode() may be messed up. It's based on RangeMapFloatFunction.hashCode(), but I threw this patch together without stopping to properly understand that. method first.

        Show
        Chris Harris added a comment - A first stab at this. In this implementation, we use copy and paste and create a wholly separate function, "mapf" ("f" because this map is more function-oriented). mapf is not as efficient as normal map when the target is a constant. If the target is a constant, you now have something like a LongConstValueSource where you used to have just a float. I haven't tried to measure the performance difference. RangeMapfFloatFunction.hashCode() may be messed up. It's based on RangeMapFloatFunction.hashCode(), but I threw this patch together without stopping to properly understand that. method first.
        Hide
        Yonik Seeley added a comment -

        Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?

        Show
        Yonik Seeley added a comment - Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?
        Hide
        Chris Harris added a comment -

        Clean up some sloppiness in RangeMapfFloatFunction, where target was still being treated as if it were a value type.

        Show
        Chris Harris added a comment - Clean up some sloppiness in RangeMapfFloatFunction, where target was still being treated as if it were a value type.
        Hide
        Chris Harris added a comment -

        Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?

        Yes, having a single map sounds less confusing from a user perspective.

        The reason I started by forking map into map and mapf was that I was worried about performance; I assumed (apparently incorrectly?) that hard-coding certain things as float constants had been done as a performance optimization. Are you implying that the approach taken by this fmap code is probably fast enough to be the standard implementation of map?

        And what if it were taken to the extreme of making all arguments for all function queries ValueSource compatible? (That is, what if all functions used fp.parseValueSource() for all their arguments, and parsed none of them with fp.parseFloat()?) Does that start to sound inefficient?

        Show
        Chris Harris added a comment - Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions? Yes, having a single map sounds less confusing from a user perspective. The reason I started by forking map into map and mapf was that I was worried about performance; I assumed (apparently incorrectly?) that hard-coding certain things as float constants had been done as a performance optimization. Are you implying that the approach taken by this fmap code is probably fast enough to be the standard implementation of map? And what if it were taken to the extreme of making all arguments for all function queries ValueSource compatible? (That is, what if all functions used fp.parseValueSource() for all their arguments, and parsed none of them with fp.parseFloat()?) Does that start to sound inefficient?
        Hide
        Chris Harris added a comment -

        Revised patch, with two changes:

        • Now we modify the behavior of "map", rather than introducing a new "mapf" function.
        • Now not only the "target" but also the "defaultValue" can be a ValueSource.
        Show
        Chris Harris added a comment - Revised patch, with two changes: Now we modify the behavior of "map", rather than introducing a new "mapf" function. Now not only the "target" but also the "defaultValue" can be a ValueSource.
        Hide
        Erick Erickson added a comment -

        2013 Old JIRA cleanup

        Show
        Erick Erickson added a comment - 2013 Old JIRA cleanup
        Hide
        Shalin Shekhar Mangar added a comment -

        Patch updated to trunk.

        I found a bug in TestValueSources.assertHits which was using search(Query, int, sort) method and didn't actually calculate scores. The whole test therefore wasn't actually testing anything. I changed it to use the search(Query, Filter, int, Sort, boolean, boolean) method.

        The javadoc for RangeMapFloatFunction also was a copy/paste from LinearFloatFunction. This patch fixes that as well.

        Show
        Shalin Shekhar Mangar added a comment - Patch updated to trunk. I found a bug in TestValueSources.assertHits which was using search(Query, int, sort) method and didn't actually calculate scores. The whole test therefore wasn't actually testing anything. I changed it to use the search(Query, Filter, int, Sort, boolean, boolean) method. The javadoc for RangeMapFloatFunction also was a copy/paste from LinearFloatFunction. This patch fixes that as well.
        Hide
        Shalin Shekhar Mangar added a comment -

        This patch preserves backward compatibility in RangeMapFloatFunction constructor by wrapping a float target or default in a ConstValueSource.

        Show
        Shalin Shekhar Mangar added a comment - This patch preserves backward compatibility in RangeMapFloatFunction constructor by wrapping a float target or default in a ConstValueSource.
        Hide
        ASF subversion and git services added a comment -

        Commit 1546926 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1546926 ]

        SOLR-1871: The 'map' function query accepts a ValueSource as target and default value

        Show
        ASF subversion and git services added a comment - Commit 1546926 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1546926 ] SOLR-1871 : The 'map' function query accepts a ValueSource as target and default value
        Hide
        ASF subversion and git services added a comment -

        Commit 1546927 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1546927 ]

        SOLR-1871: The 'map' function query accepts a ValueSource as target and default value

        Show
        ASF subversion and git services added a comment - Commit 1546927 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1546927 ] SOLR-1871 : The 'map' function query accepts a ValueSource as target and default value
        Hide
        Shalin Shekhar Mangar added a comment -

        Thanks Chris!

        Show
        Shalin Shekhar Mangar added a comment - Thanks Chris!

          People

          • Assignee:
            Shalin Shekhar Mangar
            Reporter:
            Chris Harris
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development