Solr
  1. Solr
  2. SOLR-2335

FunctionQParser can't handle fieldnames containing whitespace

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 4.0-ALPHA
    • Component/s: None
    • Labels:
      None

      Description

      FunctionQParser has some simplistic assumptions about what types of field names it will deal with, in particular it can't deal with field names containing whitespaces.

      1. SOLR-2335.patch
        3 kB
        Hoss Man
      2. SOLR-2335.patch
        0.8 kB
        Hoss Man

        Activity

        Hide
        Hoss Man added a comment -

        Updating summary/description based on root of problem.

        Description form original bug reporter...

        We use an external file field configured as dynamic field. The dynamic field name (and so the name of the provided file) may contain spaces. But currently it is not possible to query for such fields. The following query results in a ParseException:
        q=val:(experience_foo\ bar)

        org.apache.lucene.queryParser.ParseException: Cannot parse 'val:(experience_foo\ bar)': Expected ',' at position 15 in 'experience_foo bar'

        We use following configuration for the externalFileField:
        <types>
        ...
        <fieldType name="experienceRankFile" keyField="id" defVal="0" stored="false" indexed="false" class="solr.ExternalFileField"
        valType="float"/>
        </types>
        <fields>
        <dynamicField name="experience_*" type="experienceRankFile" />
        ...
        </field>

        The original reasons for these assumptions in FunctionQParser are still generally good: it helps keep the syntax and the parsing simpler then they would otherwise need to be.

        I think an easy improvement we could make is to leave the current parsing logic the way it is, but provide a new "FieldValueSourceParaser" that expects a single (quoted) string as input, and just returns the FieldValueSource for that field.

        So these two would be equivilent...

        {!func}myFieldName
        {!func}field("myFieldName")
        

        ...but it would also be possible to write...

        {!func}field("1 my wacky Field*Name")
        
        Show
        Hoss Man added a comment - Updating summary/description based on root of problem. Description form original bug reporter... We use an external file field configured as dynamic field. The dynamic field name (and so the name of the provided file) may contain spaces. But currently it is not possible to query for such fields. The following query results in a ParseException: q= val :(experience_foo\ bar) org.apache.lucene.queryParser.ParseException: Cannot parse ' val :(experience_foo\ bar)': Expected ',' at position 15 in 'experience_foo bar' We use following configuration for the externalFileField: <types> ... <fieldType name="experienceRankFile" keyField="id" defVal="0" stored="false" indexed="false" class="solr.ExternalFileField" valType="float"/> </types> <fields> <dynamicField name="experience_*" type="experienceRankFile" /> ... </field> The original reasons for these assumptions in FunctionQParser are still generally good: it helps keep the syntax and the parsing simpler then they would otherwise need to be. I think an easy improvement we could make is to leave the current parsing logic the way it is, but provide a new "FieldValueSourceParaser" that expects a single (quoted) string as input, and just returns the FieldValueSource for that field. So these two would be equivilent... {!func}myFieldName {!func}field( "myFieldName" ) ...but it would also be possible to write... {!func}field( "1 my wacky Field*Name" )
        Hide
        Hoss Man added a comment -

        Quick patch showing what i had in mind.

        only tried it out with the example for about 30 seconds, and it seemed correct – but it needs some real tests.

        Show
        Hoss Man added a comment - Quick patch showing what i had in mind. only tried it out with the example for about 30 seconds, and it seemed correct – but it needs some real tests.
        Hide
        Yonik Seeley added a comment -

        oh, that's clever. I like it!

        Show
        Yonik Seeley added a comment - oh, that's clever. I like it!
        Hide
        Hoss Man added a comment -

        the other thing this should make possible is sorting on fields that historicly havne't been sortable...

        sort=field("1 my wacky Field*Name") desc
        

        ... the sort parsing code could even be optimized to detect when a function sort results in a FieldValueSource and swap it out with a regular sort ... but i'm not sure if there are any gotchas there.

        Show
        Hoss Man added a comment - the other thing this should make possible is sorting on fields that historicly havne't been sortable... sort=field( "1 my wacky Field*Name" ) desc ... the sort parsing code could even be optimized to detect when a function sort results in a FieldValueSource and swap it out with a regular sort ... but i'm not sure if there are any gotchas there.
        Hide
        Hoss Man added a comment -

        updated patch, includes tests showing the parser working – even with external field filed when the field name has whitespace and quote characters in it

        Show
        Hoss Man added a comment - updated patch, includes tests showing the parser working – even with external field filed when the field name has whitespace and quote characters in it
        Hide
        Hoss Man added a comment -

        Committed revision 1091516.

        Show
        Hoss Man added a comment - Committed revision 1091516.

          People

          • Assignee:
            Hoss Man
            Reporter:
            Miriam Doelle
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development