Solr / SOLR-13255

LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 7.7
    • Fix Version/s: 7.7.1, 8.0
    • Component/s: contrib - LangId
    • Labels:
      None

      Description

      Solr 7.7 changed the object type of string field values that are passed to UpdateRequestProcessor implementations from java.lang.String to ByteArrayUtf8CharSequence. SOLR-12992 was mentioned on solr-user as the cause.

      The LangDetectLanguageIdentifierUpdateProcessor still expects String values. It skips CharSequence values and logs a warning instead. For example:

      2019-02-14 13:14:47.537 WARN  (qtp802600647-19) [   x:studio] o.a.s.u.p.LangDetectLanguageIdentifierUpdateProcessor Field name_tokenized not a String value, not including in detection
      

      I'm not sure, but there could be further places where the changed type for string values needs to be handled. (Our custom UpdateRequestProcessors have also been broken since 7.7, and it would be great to have a proper upgrade note in the release notes.)
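
      For anyone hitting the same problem in a custom UpdateRequestProcessor, one possible workaround is to check field values against CharSequence rather than String. The following is only a minimal sketch using plain Java: the class `FieldValueText` and its methods are hypothetical names, not part of Solr, and it assumes that calling toString() on the value returns the decoded text, as the CharSequence contract specifies. It is not the fix from the attached patches.

```java
import java.util.Collection;

/** Hypothetical helper (not part of Solr) for reading textual field values. */
final class FieldValueText {
    private FieldValueText() {}

    /**
     * Returns the text of a field value, or null if the value is not textual.
     * Testing for CharSequence accepts String as well as any other
     * CharSequence implementation a processor may be handed.
     */
    static String asText(Object value) {
        if (value instanceof CharSequence) {
            return value.toString();
        }
        return null;
    }

    /** Joins all textual values with single spaces, skipping non-textual ones. */
    static String concatText(Collection<?> values) {
        StringBuilder sb = new StringBuilder();
        for (Object v : values) {
            String text = asText(v);
            if (text == null) {
                continue; // e.g. numbers or dates: not usable as text
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(text);
        }
        return sb.toString();
    }
}
```

      With this sketch, a value list holding a String, a StringBuilder and an Integer would contribute only the first two values to the concatenated text. A StringBuilder is used here purely as a stand-in for a non-String CharSequence.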

        Attachments

        1. SOLR-13255.patch
          8 kB
          Noble Paul
        2. SOLR-13255.patch
          3 kB
          Noble Paul
        3. SOLR-13255.patch
          3 kB
          Jan Høydahl

            People

            • Assignee: Noble Paul (noble.paul)
            • Reporter: Andreas Hubold (ahubold)
