Solr / SOLR-13850

Atomic Updates with PreAnalyzedField

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 7.7.2, 8.2
    • None
    • None
    • Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle)

    Description

      If you use an atomic update to change a field that is not pre-analyzed, any data the document holds in pre-analyzed fields is lost.

      Steps to reproduce

      1. Index this document into techproducts

      {
        "id": "a",
        "n_s": "s1",
        "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
      }
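The 'pre' value in step 1 is itself a JSON object serialized into a string, which is why its quotes are escaped. As a minimal sketch (the helper name pre_analyzed_value is hypothetical, not part of Solr), the document can be built like this:

```python
import json

def pre_analyzed_value(text, tokens):
    """Serialize a pre-analyzed payload in the shape used in step 1.

    Hypothetical helper: the keys (v, str, tokens, t, s, e, i) are copied
    from the example document in this report.
    """
    payload = {"v": "1", "str": text, "tokens": tokens}
    # Compact separators reproduce the string exactly as written in step 1.
    return json.dumps(payload, separators=(",", ":"))

doc = {
    "id": "a",
    "n_s": "s1",
    "pre": pre_analyzed_value(
        "Alaska", [{"t": "alaska", "s": 0, "e": 6, "i": 1}]
    ),
}

# Serializing the whole document escapes the inner quotes, as in step 1.
print(json.dumps(doc, indent=2))
```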
      

      2. Query the document

      {
        "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
          {
            "id":"a",
            "n_s":"s1",
            "pre":"Alaska",
            "_version_":1647475215142223872}]
      }}
      

      3. Update using atomic syntax

      {
        "add": {
          "doc": {
            "id": "a",
            "n_s": {"set": "s2"}
      }}}
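For scripted reproduction, the atomic update in step 3 could be prepared as follows. This is only a sketch: the URL assumes a local Solr on the default port with the techproducts collection, and build_atomic_update is a hypothetical helper name. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: local Solr on the default port, techproducts collection.
UPDATE_URL = "http://localhost:8983/solr/techproducts/update?commit=true"

def build_atomic_update(doc_id, field, new_value):
    """Build a POST request that sets a single field via atomic update syntax."""
    body = {"add": {"doc": {"id": doc_id, field: {"set": new_value}}}}
    return urllib.request.Request(
        UPDATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_atomic_update("a", "n_s", "s2")
# To actually send it against a running instance:
# urllib.request.urlopen(request)
```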
      

      4. Observe the warning in the Solr log
      UI:

       WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
      

      solr.log:

      WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
       at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
      

      5. Query the document again

      {
        "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
          {
            "id":"a",
            "n_s":"s2",
            "_version_":1647475461695995904}]
      }}
      

      Result: There is no 'pre' field in the document anymore.

      My thoughts on it

      1. Data loss can be prevented by replacing the warning with an error (re-throwing the exception). Atomic updates for such documents still won't work, but they will be explicitly rejected instead of silently dropping the field.

      2. Solr reads the document from the index, merges it with the input document, and re-indexes the result. The value it reads back for a pre-analyzed field is in a different format from the one it was indexed with (in the example above, the plain string "Alaska" rather than the JSON object), so Solr cannot parse and re-index that field properly.
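The two observations above can be illustrated with a small self-contained model. It is not Solr code: parse_pre_analyzed and atomic_update are hypothetical stand-ins that only mimic the behaviour described in this report, namely a parser that accepts nothing but a JSON object, and a merge step that either drops the unparsable field (the reported behaviour) or rejects the update (the behaviour proposed in thought 1).

```python
import json

def parse_pre_analyzed(value):
    """Accept only a JSON object, loosely mirroring the 'expected Map' check."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        parsed = value  # plain text such as "Alaska" is treated as a string
    if not isinstance(parsed, dict):
        raise ValueError(
            f"Invalid JSON type {type(parsed).__name__}, expected dict"
        )
    return parsed

def atomic_update(stored, changes, pre_fields, strict):
    """Merge changes into the stored document and re-validate it.

    strict=False models the reported behaviour (field silently dropped);
    strict=True models the proposal to re-throw and reject the update.
    """
    merged = {**stored, **changes}
    result = {}
    for name, value in merged.items():
        if name in pre_fields:
            try:
                parse_pre_analyzed(value)
            except ValueError:
                if strict:
                    raise  # reject the whole update, nothing is lost
                continue   # drop the field, which loses its data
        result[name] = value
    return result

# What the index returns for the document from step 2.
stored = {"id": "a", "n_s": "s1", "pre": "Alaska"}

lenient = atomic_update(stored, {"n_s": "s2"}, {"pre"}, strict=False)
print(lenient)  # the 'pre' key is absent, matching step 5
```

With strict=True the same call raises ValueError instead, so the stored document would be left untouched.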

          People

            Assignee: Unassigned
            Reporter: Oleksandr Drapushko (drapushko)
            Votes: 0
            Watchers: 3
