Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-17274

atomic update error when using json w/ multiple modifiers

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6
    • 9.7
    • update
    • None
    • Observed first in 9.5.0 w/ OpenJDK 11, then reproduced this simple test case in 9.6.0.

    Description

      I ran into a problem doing a json atomic update that tries to both `add` and `remove` a value for a multivalued field in a single update. I saw it initially in an instance that runs 9.5.0, and reproduced a minimal example using Solr 9.6.0.

      The only fields defined in the schema are:

        <field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
        <field name="_version_" type="plong" indexed="false" stored="false"/>
        <field name="name" type="strings" indexed="true" stored="true"/>

      `<uniqueKey>id</uniqueKey>` is also present, so I'm supplying docs with just an `id` and a multivalued `name` field. The real setup is more complex, but this is a minimal test case to illustrate the problem.

      Starting with an empty index, I add the following doc to the index successfully:

      {"id": "1", "name": ["John Doe", "Jane Doe"]}

      And I can query it, seeing the expected result:

      {
        "responseHeader":{
          "status":0,
          "QTime":23,
          "params":{
            "q":"name:*"
          }
        },
        "response":{
          "numFound":1,
          "start":0,
          "numFoundExact":true,
          "docs":[{
            "id":"1",
            "name":["John Doe","Jane Doe"],
            "_version_":1797873599884820480
          }]
        }
      }

      Next, I send an atomic update to modify the `name` field of that document by removing `Jane Doe` and adding `Janet Doe`:

      {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}

      This atomic update that does both an `add` and a `remove` is something that used to work for us under Solr 6.6, but we just noticed that it fails in our production 9.5 instance and in 9.6, which I just downloaded to test against.

      The error in the solr.log indicates that Solr mistakenly interprets the `{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then throws an exception because our schema doesn't have the `root` field that would be expected if we were using nested docs (which we don't use).

      Here's the full stack trace from `solr.log` (version 9.6.0):

      2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => org.apache.solr.common.SolrException: Unable to index docs with children: the schema must include definitions for both a uniqueKey field and the '_root_' field, using the exact same fieldType
              at org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369)
      org.apache.solr.common.SolrException: Unable to index docs with children: the schema must include definitions for both a uniqueKey field and the '_root_' field, using the exact same fieldType
              at org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) ~[?:?]
              at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) ~[?:?]
              at org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) ~[?:?]
              at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) ~[?:?]
              at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) ~[?:?]
              at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) ~[?:?]
              at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) ~[?:?]
              at org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) ~[?:?]
              at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) ~[?:?]
              at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:270) ~[?:?]
              at org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:533) ~[?:?]
              at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:358) ~[?:?]
              at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:43) ~[?:?]
              at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:355) ~[?:?]
              at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:236) ~[?:?]
              at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:111) ~[?:?]
              at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:553) ~[?:?]
              at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:183) ~[?:?]
              at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:151) ~[?:?]
              at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:86) ~[?:?]
              at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:102) ~[?:?]
              at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:100) ~[?:?]
              at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[?:?]
              at org.apache.solr.core.SolrCore.execute(SolrCore.java:2886) ~[?:?]
              at org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:910) ~[?:?]
              at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:596) ~[?:?]
              at org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:262) ~[?:?]
              at org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:219) ~[?:?]
              at org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:249) ~[?:?]
              at org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:215) ~[?:?]
              at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:213) ~[?:?]
              at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) ~[?:?]
              at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) ~[jetty-servlet-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635) ~[jetty-servlet-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) ~[jetty-servlet-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) ~[jetty-security-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1384) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484) ~[jetty-servlet-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1306) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:228) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:141) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:301) ~[jetty-rewrite-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.Server.handle(Server.java:563) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287) ~[jetty-server-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314) ~[jetty-io-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100) ~[jetty-io-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53) ~[jetty-io-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:193) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194) ~[jetty-util-10.0.20.jar:10.0.20]
              at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149) ~[jetty-util-10.0.20.jar:10.0.20]
              at java.base/java.lang.Thread.run(Thread.java:829) [?:?]

       
      In case it matters, the update handler was called with `commitWithin=1000`.
       
      I believe this is happening because as part of commit 5a9a34daedb3 on 2021-06-12  ("SOLR-15044: JSON Loading: nested docs don't need ID"), the `isChildDoc` method of solr/core/src/java/org/apache/solr/handler/loader/JsonLoader.java was changed:

      +    /** Is this a child doc (true) or a partial update (false)? */
           private boolean isChildDoc(SolrInputDocument extendedFieldValue) {
      -      return extendedFieldValue.containsKey(req.getSchema().getUniqueKeyField().getName());
      +      if (extendedFieldValue.size() != 1) {
      +        return true;
      +      }
      +      // if the only key is a field in the schema, assume it's a child doc
      +      final String fieldName = extendedFieldValue.iterator().next().getName();
      +      return req.getSchema().getFieldOrNull(fieldName) != null;
      +      // otherwise, assume it's "set" or some other verb for a partial update.
      +      // NOTE: it's fundamentally ambiguous with JSON; this is a best effort try.
           }

      In case that diff gets reformatted and garbled, it would previously have checked  if the `id` field was a  key in the `extendedFieldValue`, which would have been `false`, but it now returns `true` if the `extendedFieldValue.size()` is not equal to `1`, which is the case here because I have both an `add` and a `remove` in the extended field value. It does work fine if I submit two separate updates, one for each of the two operations.

      For now, we're going to do the `add` and `remove` in separate updates and make sure we don't have other cases of multiple update operations, but it would be nice if users weren't forced to do that.

       This was actually the second time we were affected by the `isChildDoc`change I mentioned above. The first issue, when we upgraded from 6.6 to 9.5 last year, we worked around by renaming an existing field from `set` to `sset`. The problem  was that an atomic update like `{"id": "1", "name": {"set": ["J Doe"]}}` would fail with the same error message because since `set` was a field name, `return req.getSchema().getFieldOrNull(fieldName) != null` would evaluate to `true` for `isChildDoc`. After we renamed the field to `sset`, then `req.getSchema().getFieldOrNull(fieldName)` would be null for `set` and so
      `isChildDoc` would evaluate to `false`, as desired.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              casmith Calvin Smith
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 3h 20m
                  3h 20m