Solr / SOLR-3140

Make omitNorms default for all numeric field types

    Details

      Description

      Today norms are enabled for all Solr field types by default, while in Lucene norms are omitted for the numeric types.

      I propose making the Solr defaults the same as in Lucene, so that anyone who wants an index-time boost on a numeric field type must explicitly say omitNorms="false". This lets us simplify the example schema too.

      1. SOLR-3140.patch (58 kB, Jan Høydahl)
      2. SOLR-3140.patch (19 kB, Jan Høydahl)
      3. SOLR-3140.patch (18 kB, Jan Høydahl)
      4. SOLR-3140-3x.patch (51 kB, Jan Høydahl)

          Activity

          Jan Høydahl added a comment -

          Opinions on this?

          Erik Hatcher added a comment -

          +1, simpler is better.

          Tommaso Teofili added a comment -

          yes big +1

          Jan Høydahl added a comment -

          First patch. It introduces a new NumericFieldType which sets omitNorms=true in init(). All numeric fields inherit from this, and the new default is used if the schema version is >1.4 (the patch bumps it to 1.5).

          I could not find a way to set this default in the constructor, since we do not yet know the schema version before init() is called, right?

          Open issues:

          • Is there a better place to set this default than in init() in the new base class?
          • Should StrField or other fields also have omitNorms as default?
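
The mechanism described in this comment can be sketched roughly as follows. This is a simplified, self-contained model and not the actual patch or the real Solr API: the class names, the `resolve` helper, and the way the schema version and attributes are passed to `init()` are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;

// Simplified stand-ins for Solr's field type classes; NOT the real Solr API.
class OmitNormsDefaultSketch {

    /** Hypothetical base type: norms are kept unless the schema says otherwise. */
    static class BaseTypeSketch {
        boolean omitNorms = false;

        void init(float schemaVersion, Map<String, String> args) {
            // An explicit omitNorms attribute always overrides the class default.
            String explicit = args.get("omitNorms");
            if (explicit != null) {
                omitNorms = Boolean.parseBoolean(explicit);
            }
        }
    }

    /** Hypothetical numeric base type carrying the version-dependent default. */
    static class NumericTypeSketch extends BaseTypeSketch {
        @Override
        void init(float schemaVersion, Map<String, String> args) {
            // The schema version is only known here, not in the constructor.
            if (schemaVersion > 1.4f) {
                omitNorms = true;
            }
            super.init(schemaVersion, args);
        }
    }

    /** Hypothetical helper: resolve omitNorms for a numeric type. */
    static boolean resolve(float schemaVersion, Map<String, String> args) {
        NumericTypeSketch type = new NumericTypeSketch();
        type.init(schemaVersion, args);
        return type.omitNorms;
    }

    public static void main(String[] args) {
        Map<String, String> none = Collections.emptyMap();
        Map<String, String> keepNorms = Collections.singletonMap("omitNorms", "false");
        System.out.println("schema 1.4, no attribute: " + resolve(1.4f, none));
        System.out.println("schema 1.5, no attribute: " + resolve(1.5f, none));
        System.out.println("schema 1.5, omitNorms=false: " + resolve(1.5f, keepNorms));
    }
}
```

In this model an older schema keeps norms, a 1.5 schema omits them, and an explicit omitNorms="false" restores them, which is the opt-in behaviour proposed in the description.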
          Hoss Man added a comment -

          Is there a better place to set this default than in init() in the new base class?

          probably not

          Should StrField or other fields also have omitNorms as default?

          I don't think so? if you search on a multivalued string field like "keywords" or "tags" it's reasonable to want length normalization to be a factor to prevent keyword stuffing.
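
To make the keyword-stuffing point concrete: Lucene's classic TF-IDF similarity scales a field's contribution by roughly 1/sqrt(number of terms in the field). The sketch below shows only that factor. It ignores index-time boosts and the lossy single-byte encoding of stored norms, and the class name and tag counts are made up for illustration.

```java
// Illustrative only: the length-normalization factor in isolation.
class LengthNormSketch {

    /** Classic length normalization: 1 / sqrt(number of terms in the field). */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        int fewTags = 4;       // hypothetical document with a handful of tags
        int stuffedTags = 100; // hypothetical document stuffed with tags
        System.out.println("4 tags:   " + lengthNorm(fewTags));
        System.out.println("100 tags: " + lengthNorm(stuffedTags));
    }
}
```

With norms enabled, a match in the short tag list is weighted more heavily than the same match in the stuffed one. With omitNorms=true, that distinction is lost.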

          Tommaso Teofili added a comment -

          Is there a better place to set this default than in init() in the new base class?

          I agree that's the method responsible for doing this kind of stuff

          I don't think so? if you search on a multivalued string field like "keywords" or "tags" it's reasonable to want length normalization to be a factor to prevent keyword stuffing.

          good point

          Jan Høydahl added a comment -

          I don't know if calling the BoolField a "NumericFieldType" is stretching it. That was the only name I could think of.

          Tommaso Teofili added a comment -

          Maybe something like PrimitiveFieldType (which should recall Java primitive types)?

          Yonik Seeley added a comment -

          Although length normalization on StringType fields could be useful, we should think about the best default. The user can always switch it of course.

          But on the other hand, the "string" type defined by the example schema already specifies omitNorms=true, and as long as we don't change that I don't think the default for the java class is a big deal either way. Keeping it the same has the slight benefit of making it easier for the minority of people who defined their own string fields and purposely wanted norms.

          Jan Høydahl added a comment -

          New patch renaming base class to PrimitiveFieldType. StrField now also defaults to omitNorms="true".

          So now, starting from 3.6, people will not need to remember to always say omitNorms=true for their own fields: primitive types defined without omitNorms specified will omit norms by default if the schema version is >=1.5.

          Jan Høydahl added a comment -

          New patch with tests. Gonna commit and backport

          Jan Høydahl added a comment -

          Patch for branch_3x

          Jan Høydahl added a comment -

          Committed in trunk, merged to 3x


            People

            • Assignee: Jan Høydahl
            • Reporter: Jan Høydahl