Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-1871

Function Query "map" variant that allows "target" to be an arbitrary ValueSource

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 4.7, 6.0
    • Component/s: search
    • Labels:
      None

      Description

      Currently, as documented at http://wiki.apache.org/solr/FunctionQuery#map, the "target" of a map must be a floating point constant. I propose that you should have at least the option of doing a map where the target is an arbitrary ValueSource.

      The particular use case that inspired this is that I want to be able to control how missing date fields affected boosting. In particular, I want to be able to use something like this in my function queries:

      map(mydatefield,0,0,ms(NOW))
      

      But this might have other uses.

      I'll attach an initial implementation.

      1. ASF.LICENSE.NOT.GRANTED--SOLR-1871.patch
        5 kB
        Chris Harris
      2. SOLR-1871.patch
        8 kB
        Shalin Shekhar Mangar
      3. SOLR-1871.patch
        7 kB
        Shalin Shekhar Mangar
      4. SOLR-1871.patch
        6 kB
        Chris Harris
      5. SOLR-1871.patch
        6 kB
        Chris Harris

        Activity

        Hide
        ryguasu Chris Harris added a comment -

        A first stab at this. In this implementation, we use copy and paste and create a wholly separate function, "mapf" ("f" because this map is more function-oriented).

        mapf is not as efficient as normal map when the target is a constant. If the target is a constant, you now have something like a LongConstValueSource where you used to have just a float. I haven't tried to measure the performance difference.

        RangeMapfFloatFunction.hashCode() may be messed up. It's based on RangeMapFloatFunction.hashCode(), but I threw this patch together without stopping to properly understand that. method first.

        Show
        ryguasu Chris Harris added a comment - A first stab at this. In this implementation, we use copy and paste and create a wholly separate function, "mapf" ("f" because this map is more function-oriented). mapf is not as efficient as normal map when the target is a constant. If the target is a constant, you now have something like a LongConstValueSource where you used to have just a float. I haven't tried to measure the performance difference. RangeMapfFloatFunction.hashCode() may be messed up. It's based on RangeMapFloatFunction.hashCode(), but I threw this patch together without stopping to properly understand that. method first.
        Hide
        yseeley@gmail.com Yonik Seeley added a comment -

        Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?

        Show
        yseeley@gmail.com Yonik Seeley added a comment - Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?
        Hide
        ryguasu Chris Harris added a comment -

        Clean up some sloppiness in RangeMapfFloatFunction, where target was still being treated as if it were a value type.

        Show
        ryguasu Chris Harris added a comment - Clean up some sloppiness in RangeMapfFloatFunction, where target was still being treated as if it were a value type.
        Hide
        ryguasu Chris Harris added a comment -

        Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions?

        Yes, having a single map sounds less confusing from a user perspective.

        The reason I started by forking map into map and mapf was that I was worried about performance; I assumed (apparently incorrectly?) that hard-coding certain things as float constants had been done as a performance optimization. Are you implying that the approach taken by this fmap code is probably fast enough to be the standard implementation of map?

        And what if it were taken to the extreme of making all arguments for all function queries ValueSource compatible? (That is, what if all functions used fp.parseValueSource() for all their arguments, and parsed none of them with fp.parseFloat()?) Does that start to sound inefficient?

        Show
        ryguasu Chris Harris added a comment - Yep, makes sense. Some of the older FunctionQueries predate the ability of the parser to generalize more (one had to know if one was parsing a constant or a function). It might make sense to generalize the map function rather than have two functions? Yes, having a single map sounds less confusing from a user perspective. The reason I started by forking map into map and mapf was that I was worried about performance; I assumed (apparently incorrectly?) that hard-coding certain things as float constants had been done as a performance optimization. Are you implying that the approach taken by this fmap code is probably fast enough to be the standard implementation of map? And what if it were taken to the extreme of making all arguments for all function queries ValueSource compatible? (That is, what if all functions used fp.parseValueSource() for all their arguments, and parsed none of them with fp.parseFloat()?) Does that start to sound inefficient?
        Hide
        ryguasu Chris Harris added a comment -

        Revised patch, with two changes:

        • Now we modify the behavior of "map", rather than introducing a new "mapf" function.
        • Now not only the "target" but also the "defaultValue" can be a ValueSource.
        Show
        ryguasu Chris Harris added a comment - Revised patch, with two changes: Now we modify the behavior of "map", rather than introducing a new "mapf" function. Now not only the "target" but also the "defaultValue" can be a ValueSource.
        Hide
        erickerickson Erick Erickson added a comment -

        2013 Old JIRA cleanup

        Show
        erickerickson Erick Erickson added a comment - 2013 Old JIRA cleanup
        Hide
        shalinmangar Shalin Shekhar Mangar added a comment -

        Patch updated to trunk.

        I found a bug in TestValueSources.assertHits which was using search(Query, int, sort) method and didn't actually calculate scores. The whole test therefore wasn't actually testing anything. I changed it to use the search(Query, Filter, int, Sort, boolean, boolean) method.

        The javadoc for RangeMapFloatFunction also was a copy/paste from LinearFloatFunction. This patch fixes that as well.

        Show
        shalinmangar Shalin Shekhar Mangar added a comment - Patch updated to trunk. I found a bug in TestValueSources.assertHits which was using search(Query, int, sort) method and didn't actually calculate scores. The whole test therefore wasn't actually testing anything. I changed it to use the search(Query, Filter, int, Sort, boolean, boolean) method. The javadoc for RangeMapFloatFunction also was a copy/paste from LinearFloatFunction. This patch fixes that as well.
        Hide
        shalinmangar Shalin Shekhar Mangar added a comment -

        This patch preserves backward compatibility in RangeMapFloatFunction constructor by wrapping a float target or default in a ConstValueSource.

        Show
        shalinmangar Shalin Shekhar Mangar added a comment - This patch preserves backward compatibility in RangeMapFloatFunction constructor by wrapping a float target or default in a ConstValueSource.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1546926 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1546926 ]

        SOLR-1871: The 'map' function query accepts a ValueSource as target and default value

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1546926 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1546926 ] SOLR-1871 : The 'map' function query accepts a ValueSource as target and default value
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1546927 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1546927 ]

        SOLR-1871: The 'map' function query accepts a ValueSource as target and default value

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1546927 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1546927 ] SOLR-1871 : The 'map' function query accepts a ValueSource as target and default value
        Hide
        shalinmangar Shalin Shekhar Mangar added a comment -

        Thanks Chris!

        Show
        shalinmangar Shalin Shekhar Mangar added a comment - Thanks Chris!

          People

          • Assignee:
            shalinmangar Shalin Shekhar Mangar
            Reporter:
            ryguasu Chris Harris
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development