Solr / SOLR-1752

SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Component/s: clients - java, update
    • Labels:
      None

      Description

      Add this test to SolrExampleTests.java and it will fail when using the XML Request Writer (now default), but not if you change the SolrExampleJettyTest to use the BinaryRequestWriter.

      public void testAddDeleteInSameRequest() throws Exception {
        SolrServer server = getSolrServer();

        SolrInputDocument doc3 = new SolrInputDocument();
        doc3.addField( "id", "id3", 1.0f );
        doc3.addField( "name", "doc3", 1.0f );
        doc3.addField( "price", 10 );

        // A single UpdateRequest carrying both an add and a delete
        UpdateRequest up = new UpdateRequest();
        up.add( doc3 );
        up.deleteById("id001");
        up.setWaitFlush(false);
        up.setWaitSearcher(false);

        // Fails here when the XML request writer is in use
        up.process( server );
      }
      

      The test terminates with this exception:

      Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
      SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?).
       at [row,col {unknown-source}]: [1,125]
      	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
      	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
      	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
      	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
      	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
      	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
      	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
      	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
      	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
      	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
      	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
      	at org.mortbay.jetty.Server.handle(Server.java:285)
      	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
      	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
      	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
      	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
      	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
      	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
      	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
      Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
       at [row,col {unknown-source}]: [1,125]
      	at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
      	at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
      	at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
      	at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
      	at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
      	at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
      	at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
      	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
      	... 18 more
      
      1. SOLR-1752.patch
        13 kB
        Mike Mattozzi
      2. SOLR-1752_2.patch
        13 kB
        Mike Mattozzi
      3. SOLR-1752.patch
        11 kB
        Mike Mattozzi

          Activity

          Shalin Shekhar Mangar added a comment -

          Jayson, Solr's update XML does not define a container tag so we are constrained to only one of add/delete/commit/optimize at a time. Binary format of course does not have this problem. So unless we decide to add a root tag to the update XML, this exception will happen.

          So I guess we have the following options:

          1. Disallow more than one type of operation for any request writer
          2. Document this behavior in the UpdateRequest javadocs.

          I'd prefer #2 even though it is inconsistent.
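
The "multiple roots" failure can be reproduced without Solr. The following is a minimal sketch, assuming only the JDK's built-in StAX parser (not the Woodstox parser that appears in the stack trace) and a hypothetical class name, MultiRootXmlCheck. It shows that an add element followed by a delete element, with no container tag around them, is not a well-formed XML document.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class MultiRootXmlCheck {

    /** Returns true if the whole string parses as one well-formed XML document. */
    static boolean isWellFormed(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                // Walk every event; a second root element makes the parser throw.
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String addOnly = "<add><doc><field name=\"id\">id3</field></doc></add>";
        String addThenDelete = addOnly + "<delete><id>id001</id></delete>";
        System.out.println("add only well-formed: " + isWellFormed(addOnly));
        System.out.println("add + delete well-formed: " + isWellFormed(addThenDelete));
    }
}
```

Any conforming XML parser should reject the second string, which is why the server-side error is independent of the client that produced the body.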

          Hoss Man added a comment -

          Long term, we could evolve the Solr XML Update format to allow both adds and deletes (and we probably should) but that seems like a separate issue.

          Given the current state of the XML syntax allowed, it does seem like there is a bug here in that SolrJ will attempt to send illegal XML when it gets an UpdateRequest that contains both adds and deletes.

          At a minimum SolrJ should notice when it's configured to use XML and the UpdateRequest contains mixed commands and generate a more specific error message before ever attempting to format the commands as XML and send them to a server.

          It might conceivably make sense to convert the UpdateRequest into multiple server calls, but I haven't thought that through very far and I'm not sure what that would entail (the error handling would probably be a bit tricky).
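
The client-side check suggested above could look roughly like the sketch below. It is illustrative only: PendingUpdate is a hypothetical stand-in for the state an UpdateRequest accumulates, and MixedUpdateGuard, checkSendable, and the error text are assumptions made for this example, not SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard for illustration; none of these names exist in SolrJ.
class MixedUpdateGuard {

    /** Stand-in for what an update request holds before it is serialized. */
    static class PendingUpdate {
        final List<String> docsToAdd = new ArrayList<String>();
        final List<String> idsToDelete = new ArrayList<String>();
        final List<String> deleteQueries = new ArrayList<String>();

        boolean hasAdds() {
            return !docsToAdd.isEmpty();
        }

        boolean hasDeletes() {
            return !idsToDelete.isEmpty() || !deleteQueries.isEmpty();
        }
    }

    /** Fails fast, before any XML is written, when adds and deletes are mixed. */
    static void checkSendable(PendingUpdate update, boolean usingXmlWriter) {
        if (usingXmlWriter && update.hasAdds() && update.hasDeletes()) {
            throw new IllegalStateException(
                "Update mixes add and delete commands, which the XML update format "
                + "cannot carry in one request; send them separately or use the "
                + "binary request writer");
        }
    }

    public static void main(String[] args) {
        PendingUpdate update = new PendingUpdate();
        update.docsToAdd.add("id3");
        update.idsToDelete.add("id001");
        try {
            checkSendable(update, true);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A check like this only improves the error message. Splitting the request into several server calls would additionally need a decision about what to do when one call succeeds and a later one fails.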

          Markus Jelsma added a comment -

          This isn't limited to SolrJ. The following curl command will trigger the same error in 1.4.1:
          curl "http://127.0.0.1:8983/solr/update/?commit=true" -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">17</field></doc></add><delete><id>1234</id></delete>'

          Mike Mattozzi added a comment -

          I ran into this problem today and was surprised it hadn't been fixed. I've attached a patch to UpdateRequest that maintains an ordered list that can be a mix of SolrInputDocuments to add, ids to delete, and delete queries.

          There are a few places where my patch iterates over documents instead of doing an addAll, so there may be some inefficiencies. These should be outweighed by the ability to group update operations together, but I could always optimize more.
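
The idea of an ordered, mixed list of operations can be sketched as follows. This is not the code from the attached patch: OrderedUpdateSketch, Kind, Op, and toXmlBodies are hypothetical names, documents are reduced to a bare id field, and the sketch assumes a delete element may hold several id and query children. It keeps operations in insertion order and emits one single-root XML body per run of adds or deletes, so each body stays within the current update format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an ordered update list; not the SOLR-1752 patch itself.
class OrderedUpdateSketch {

    enum Kind { ADD, DELETE_BY_ID, DELETE_BY_QUERY }

    static class Op {
        final Kind kind;
        final String value; // document id for ADD, id or query text for deletes

        Op(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }
    }

    private final List<Op> ops = new ArrayList<Op>();

    OrderedUpdateSketch add(String docId) {
        ops.add(new Op(Kind.ADD, docId));
        return this;
    }

    OrderedUpdateSketch deleteById(String id) {
        ops.add(new Op(Kind.DELETE_BY_ID, id));
        return this;
    }

    OrderedUpdateSketch deleteByQuery(String query) {
        ops.add(new Op(Kind.DELETE_BY_QUERY, query));
        return this;
    }

    /** One single-root XML body per consecutive run of adds or of deletes. */
    List<String> toXmlBodies() {
        List<String> bodies = new ArrayList<String>();
        StringBuilder current = null;
        String root = null;
        for (Op op : ops) {
            String wanted = (op.kind == Kind.ADD) ? "add" : "delete";
            if (!wanted.equals(root)) {
                if (current != null) {
                    bodies.add(current.append("</").append(root).append('>').toString());
                }
                root = wanted;
                current = new StringBuilder("<").append(root).append('>');
            }
            current.append(render(op));
        }
        if (current != null) {
            bodies.add(current.append("</").append(root).append('>').toString());
        }
        return bodies;
    }

    private static String render(Op op) {
        String v = escape(op.value);
        switch (op.kind) {
            case ADD:
                return "<doc><field name=\"id\">" + v + "</field></doc>";
            case DELETE_BY_ID:
                return "<id>" + v + "</id>";
            default:
                return "<query>" + v + "</query>";
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        OrderedUpdateSketch update = new OrderedUpdateSketch()
            .add("id3").deleteById("id001").deleteByQuery("name:old").add("id4");
        for (String body : update.toXmlBodies()) {
            System.out.println(body);
        }
    }
}
```

Sending each body as its own request preserves the order in which operations were queued, at the cost of extra round trips when adds and deletes alternate frequently.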

          Mike Mattozzi added a comment -

          Here's a second version of my previous patch that keeps Collections of documents together instead of wrapping each one in an add document operation. It should save a bit on object creation and iteration compared to the previous patch I attached.

          Mike Mattozzi added a comment -

          This issue still applies to trunk (4.0-SNAPSHOT). This patch is modified to merge cleanly with the latest SolrExampleTests file; the UpdateRequest modifications remain the same.

          The patch should be applied inside the http://svn.apache.org/repos/asf/lucene/dev/trunk/solr directory.


            People

            • Assignee: Shalin Shekhar Mangar
            • Reporter: Jayson Minard
            • Votes: 0
            • Watchers: 5
