  1. Solr
  2. SOLR-12960

mm.autoRelax not working with cascaded qf


    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 6.5
    • Fix Version/s: None
    • Component/s: query parsers
    • Labels:
      None

      Description

      Hi,

      The edismax query parser parameter mm.autoRelax does not work in a particular configuration. I consider this a bug and am therefore opening this ticket.


      A brief summary of the core configuration:
      We maintain a Solr core with several localized fields, each with its own analyzer and stopword list, as in the schema configuration below.

      <field name="title" type="strings" indexed="false" stored="true"/>
      <field name="title_de" type="text_german" indexed="true" stored="false"/>
      <field name="title_en" type="text_english" indexed="true" stored="false"/>
      ...
      

      The indexed data is processed by the langID module and put into the appropriate localized field. The first field (title) is intended for display only. The other two (title_de, title_en) are indexed.

      The search request handler is configured as follows.

      <!-- minimal configuration which reproduces the problem -->
      <requestHandler name="/select" class="solr.SearchHandler">
        <lst name="defaults">
          <str name="defType">edismax</str>
          <str name="qf">title</str>
          <str name="mm">100%</str> <!-- all clauses must match -->
          <bool name="mm.autoRelax">true</bool>
        </lst>
        <lst name="invariants">
          <str name="f.title.qf">
            title_de
            title_en
          </str>
        </lst>
      </requestHandler>
      

      To make searching by title convenient, the qf parameter is set to the field name title. The field itself is not indexed (not searchable); instead, the query is delegated to the underlying localized fields through the per-field alias f.title.qf.

      As you can see, the qf parameter is cascaded:

      qf=title >> title_de title_en
      

      With this configuration, the mm.autoRelax parameter doesn't work as expected. Its documented behavior is:

      If true, the number of clauses required (minimum should match) will automatically be relaxed if a clause is removed (by e.g., stopwords filter) from some but not all qf fields.

      Example query and debug:

      q=introduction to measurement
      
      (
        +(
          DisjunctionMaxQuery(
            (
              (
                langid_title_en:introduct | 
                langid_title_de:introduction
              )
            )
          )
          DisjunctionMaxQuery(
            (
              (
                langid_title_de:to
              )
            )
          )
          DisjunctionMaxQuery(
            (
              (
                langid_title_en:measur | 
                langid_title_de:measurement
              )
            )
          )
        )~3
      )
      

      Although the stopword "to" is detected and removed for the English field (only langid_title_de:to remains in the second clause), the overall number of required clauses remains 3.

      If the localized fields are defined directly in the qf parameter, the number of required clauses is relaxed to 2 as expected:

      q=introduction to measurement
      qf=title_de title_en
      
      (
        +(
          DisjunctionMaxQuery(
            (
              langid_title_en:introduct | 
              langid_title_de:introduction
            )
          )
          DisjunctionMaxQuery(
            (
              langid_title_de:to
            )
          ) 
          DisjunctionMaxQuery(
            (
              langid_title_en:measur | 
              langid_title_de:measurement
            )
          )
        )~2
      )
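
      One possible reading of the two debug outputs can be sketched in Python. It assumes that the relaxation counts only those top-level DisjunctionMaxQuery clauses that have the largest number of direct disjuncts. This is an assumption made for illustration, not a quotation of Solr's code, and the function names are hypothetical. Under that assumption the extra nesting level in the cascaded case gives every clause exactly one direct disjunct, so nothing is relaxed.

```python
# Simplified model of the clause counting. NOT Solr's actual code.
# Each top-level optional clause is represented by the number of direct
# disjuncts of its DisjunctionMaxQuery, as read off the debug output.

def relaxed_clause_count(disjunct_counts):
    """Count only the clauses that have the largest number of direct disjuncts."""
    if not disjunct_counts:
        return 0
    largest = max(disjunct_counts)
    return sum(1 for n in disjunct_counts if n == largest)


def required_clauses(disjunct_counts, mm_percent=100):
    """Apply a percentage mm (rounded down) to the relaxed clause count."""
    return relaxed_clause_count(disjunct_counts) * mm_percent // 100


# qf=title_de title_en: the clauses have 2, 1 and 2 direct disjuncts.
direct = [2, 1, 2]
# qf=title (cascaded): each clause wraps exactly one nested group.
cascaded = [1, 1, 1]

print(required_clauses(direct))    # 2, matches ~2 in the second debug output
print(required_clauses(cascaded))  # 3, matches ~3 in the first debug output
```

      If this reading is right, the clause for "to" is never recognized as a reduced clause in the cascaded case, because the comparison happens one nesting level too high.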
      

      Avoiding the problem by not cascading qf is no big deal for a few fields. But when many localized fields exist and should be searched by default, the configuration overhead grows massively. We have 6 languages and 12 localized fields, which leads to 72 fields that must be listed in qf.

      <str name="qf">
        <!-- title : localized field -->
        title_de^4
        title_en^4
        title_fr^4
        title_ja^4
        title_cjk^4
        title_txt^4
        <!-- summary : localized field -->
        ...
      </str>
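
      As a stopgap, the flat qf value could be generated instead of maintained by hand. The following is only a sketch: the language suffixes and the title boost are taken from the snippet above, while the helper name build_qf and the boost used for summary are hypothetical placeholders.

```python
# Sketch of generating a flat qf string; names and the summary boost
# are illustrative assumptions, not part of the reported configuration.

LANG_SUFFIXES = ["de", "en", "fr", "ja", "cjk", "txt"]


def build_qf(boosts, suffixes=LANG_SUFFIXES):
    """Expand each base field into one boosted entry per language suffix."""
    parts = []
    for base, boost in boosts.items():
        for suffix in suffixes:
            parts.append(f"{base}_{suffix}^{boost}")
    return " ".join(parts)


# The title boost (4) comes from the snippet above; the summary boost is made up.
qf = build_qf({"title": 4, "summary": 2})
print(qf)
```

      The resulting string can be pasted into the handler's qf default or sent as the qf request parameter, so that the localized fields are listed directly and mm.autoRelax behaves as in the second example.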
      

      If someone has any questions, feel free to comment. I will reply as soon as possible.

      Regards

            People

            • Assignee:
              Unassigned
              Reporter:
              toshokanin Marco Remy