Solr / SOLR-5362

SolrCell's order of field operation with lowernames=true




      This follows from SOLR-1634.

      I am not sure if SOLR-1856 completely fixes SOLR-1634, particularly when lowernames=true comes into the picture. Consider a case where:

      1. Tika generated field Category=Foo for a doc (e.g., this comes from user-defined document properties).

      2. literalsOverride=true.

      3. lowernames=true.

      4. User supplied literal.category=bar.

      According to the rules, literalsOverride is applied before lowernames and thus has no effect here: at that stage, the Tika field Category and literal.category are still considered different fields. When lowernames=true is then applied, it merges Category into category, giving the field both values, Foo and bar.

      Adding fmap.Category=tika_category does not help because fmap is applied even later; by that time, category already contains both Foo and bar.
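      The behaviour described above can be reproduced with a small toy model. This is only a sketch of the processing order as reported in this issue (literalsOverride, then lowernames, then fmap), not Solr's actual code; the function name extract_current_order and its parameters are hypothetical, and lowernames is simplified to plain lowercasing.

```python
def extract_current_order(tika_fields, literals, literals_override, lowernames, fmap):
    """Toy model of the order described in this issue (not Solr's code)."""
    doc = {}
    # Step 1: literalsOverride compares field names exactly as they are.
    for name, value in tika_fields:
        if literals_override and name in literals:
            continue  # exact-name match: the literal replaces the Tika value
        doc.setdefault(name, []).append(value)
    for name, value in literals.items():
        doc.setdefault(name, []).append(value)
    # Step 2: lowernames merges fields whose names differ only in case.
    if lowernames:
        lowered = {}
        for name, values in doc.items():
            lowered.setdefault(name.lower(), []).extend(values)
        doc = lowered
    # Step 3: fmap runs last, so it only sees the already-lowercased names.
    mapped = {}
    for name, values in doc.items():
        mapped.setdefault(fmap.get(name, name), []).extend(values)
    return mapped


result = extract_current_order(
    tika_fields=[("Category", "Foo")],
    literals={"category": "bar"},
    literals_override=True,
    lowernames=True,
    fmap={"Category": "tika_category"},
)
print(result)  # {'category': ['Foo', 'bar']}
```

      In this model the fmap entry for Category never matches, because the only name left by step 3 is the lowercased category.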

      Adding fmap.Category=tika_category together with lowernames=false would work (regardless of literalsOverride), but what if we need lowernames=true, or the capitalization of Category can vary (e.g., CATEGORY)?

      Would it make sense to have an option to apply the rules in the order that they are specified in the config file or URL params rather than always in a static order?


      PS. Marking this as Major because there seems to be no easy workaround (having one is the condition for Minor).


      Response from Jan Høydahl:

      "To me it sounds like a potential, very simple solution would be to apply lowercasing at several places if lowernames=true."

      Agreed. In particular, lowernames=true should be applied as soon as Tika has extracted a field, before literalsOverride is even considered.
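      As a rough illustration of that suggestion, the sketch below lowercases the Tika field names first and only then applies literalsOverride. It is a toy model under the same assumptions as the issue description, not a patch against Solr; extract_lowercase_first is a hypothetical name, and fmap is left out for brevity.

```python
def extract_lowercase_first(tika_fields, literals, literals_override, lowernames):
    """Toy model of the proposed order: lowernames before literalsOverride."""
    doc = {}
    for name, value in tika_fields:
        # Proposed change: normalise the Tika name as soon as it is extracted.
        if lowernames:
            name = name.lower()
        # literalsOverride now compares against the normalised name.
        if literals_override and name in literals:
            continue
        doc.setdefault(name, []).append(value)
    for name, value in literals.items():
        doc.setdefault(name, []).append(value)
    return doc


# Any capitalisation of the Tika field is now overridden by the literal.
for tika_name in ("Category", "CATEGORY", "category"):
    merged = extract_lowercase_first(
        tika_fields=[(tika_name, "Foo")],
        literals={"category": "bar"},
        literals_override=True,
        lowernames=True,
    )
    print(tika_name, merged)
```

      With literals_override=False the same function still keeps both values, so the merge behaviour of lowernames is unchanged when no override is requested.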




            • Assignee:
              Chaiyasit (Sit) Manovit

