Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Diagnosis/hypothesis, summarized and reworded from the comment below:
https://issues.apache.org/jira/browse/SOLR-13471?focusedCommentId=16840999&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16840999
- Assume request execution happens successfully
- then, when the QueryResponseWriter goes to marshal the response, assume there is some exception writing the results, perhaps due to the Resolver with ResultContext + Searcher (could be an IOException)
- the Exception from attempting to write the response may propagate all the way up to the "try" in HttpSolrCall.call(), where it is caught and passed to "sendError"
- sendError creates a completely new SolrQueryResponse, sets the exception on it, and asks the QueryResponseWriter to write it out
- BUT the OutputStream has already had a bunch of data written to it, which may result in a byte sequence that confuses the unmarshal code and produces weird exceptions, or worse: validly structured, but corrupt data
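The sequence above can be reproduced in miniature. The following is only a sketch using a made-up tagged format (`TAG_STR`, `TAG_MAP`) and a hypothetical `FailingValue` class; it does not use JavaBinCodec or its real wire format, and it only assumes that the error response is appended to a stream that already holds partial data. It shows both outcomes from the hypothesis: a ClassCastException in the reader, and a cleanly parsed but wrong result.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy tagged format for illustration only; NOT the real JavaBinCodec wire format.
class PartialWriteDemo {
  static final byte TAG_STR = 1, TAG_MAP = 2;

  // Hypothetical value that fails while being written. If headerFirst is true it
  // emits a map header (tag + size) before failing, like a lazily resolved result.
  static final class FailingValue {
    final boolean headerFirst;
    FailingValue(boolean headerFirst) { this.headerFirst = headerFirst; }
  }

  static void writeVal(DataOutputStream out, Object v) throws IOException {
    if (v instanceof String) {
      out.writeByte(TAG_STR);
      out.writeUTF((String) v);
    } else if (v instanceof Map) {
      Map<?, ?> m = (Map<?, ?>) v;
      out.writeByte(TAG_MAP);
      out.writeInt(m.size());
      for (Map.Entry<?, ?> e : m.entrySet()) {
        writeVal(out, e.getKey());
        writeVal(out, e.getValue());
      }
    } else if (v instanceof FailingValue) {
      if (((FailingValue) v).headerFirst) {
        out.writeByte(TAG_MAP);
        out.writeInt(1);
      }
      throw new IOException("simulated failure");
    } else {
      throw new IOException("unsupported type: " + v.getClass());
    }
  }

  static Object readVal(DataInputStream in) throws IOException {
    byte tag = in.readByte();
    switch (tag) {
      case TAG_STR:
        return in.readUTF();
      case TAG_MAP: {
        int size = in.readInt();
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
          String key = (String) readVal(in); // the reader assumes keys are strings
          m.put(key, readVal(in));
        }
        return m;
      }
      default:
        throw new IOException("unknown tag " + tag);
    }
  }

  // Writes a response; on failure, writes a fresh error response to the SAME stream.
  static byte[] serve(Map<String, Object> response) {
    ByteArrayOutputStream wire = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(wire);
    try {
      writeVal(out, response);
    } catch (IOException e) {
      Map<String, Object> error = new LinkedHashMap<>();
      error.put("error", e.getMessage());
      try {
        writeVal(out, error); // the partial bytes written above are still on the wire
      } catch (IOException e2) {
        throw new UncheckedIOException(e2);
      }
    }
    return wire.toByteArray();
  }

  // Returns what the client ends up with: a parsed value or the exception name.
  static String scenario(boolean headerFirst) {
    Map<String, Object> response = new LinkedHashMap<>();
    response.put("status", "ok");
    response.put("docs", new FailingValue(headerFirst));
    byte[] wire = serve(response);
    try {
      return String.valueOf(readVal(new DataInputStream(new ByteArrayInputStream(wire))));
    } catch (ClassCastException | IOException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(scenario(true));  // ClassCastException
    System.out.println(scenario(false)); // {status=ok, docs={error=simulated failure}}
  }
}
```

In the first scenario the error map lands where the reader expects a String key; in the second it lands where a value is expected, so the client silently receives an "ok" response with the error nested inside it.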
The fundamental problem is that HttpSolrCall tries to "start over" writing the response, but the JavaBinCodec doesn't have any way to recognize that: it can read partial data and then be confused by the "new" data that comes after it.
Perhaps there should be a special "TAG" that means "Ignore everything you've already received and start over" that we should emit at the beginning of every marshal() call?
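As a rough illustration of the "start over" tag idea, here is a sketch against a made-up tagged format, not JavaBinCodec. `TAG_RESET`, `ResetSignal`, `FAILING`, and the `marshal`/`unmarshal` helpers are all hypothetical names introduced for this example. Every marshal call emits the reset tag first, and the reader discards whatever it has decoded so far whenever it encounters that tag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy format for illustration only; tag values and names are hypothetical.
class ResetTagSketch {
  static final byte TAG_STR = 1, TAG_MAP = 2, TAG_RESET = 0x7F;

  // Thrown by the reader to unwind out of any partially decoded structure.
  static final class ResetSignal extends RuntimeException {}

  // Sentinel that writes a map header and then fails, simulating a write error.
  static final Object FAILING = new Object();

  static void writeVal(DataOutputStream out, Object v) throws IOException {
    if (v instanceof String) {
      out.writeByte(TAG_STR);
      out.writeUTF((String) v);
    } else if (v instanceof Map) {
      Map<?, ?> m = (Map<?, ?>) v;
      out.writeByte(TAG_MAP);
      out.writeInt(m.size());
      for (Map.Entry<?, ?> e : m.entrySet()) {
        writeVal(out, e.getKey());
        writeVal(out, e.getValue());
      }
    } else {
      out.writeByte(TAG_MAP);
      out.writeInt(1);
      throw new IOException("simulated failure");
    }
  }

  // Every marshal call starts with the reset tag, as proposed above.
  static void marshal(DataOutputStream out, Object response) throws IOException {
    out.writeByte(TAG_RESET);
    writeVal(out, response);
  }

  static Object readVal(DataInputStream in) throws IOException {
    byte tag = in.readByte();
    switch (tag) {
      case TAG_RESET:
        throw new ResetSignal();
      case TAG_STR:
        return in.readUTF();
      case TAG_MAP: {
        int size = in.readInt();
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
          String key = (String) readVal(in);
          m.put(key, readVal(in));
        }
        return m;
      }
      default:
        throw new IOException("unknown tag " + tag);
    }
  }

  // Restarts decoding each time a reset tag is seen; only the last response survives.
  static Object unmarshal(byte[] wire) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
    while (true) {
      try {
        return readVal(in);
      } catch (ResetSignal restart) {
        // Drop everything decoded so far and read the value that follows the tag.
      }
    }
  }

  static String roundTrip(boolean fail) {
    Map<String, Object> response = new LinkedHashMap<>();
    response.put("status", "ok");
    response.put("docs", fail ? FAILING : "d1");
    ByteArrayOutputStream wire = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(wire);
    try {
      try {
        marshal(out, response);
      } catch (IOException e) {
        Map<String, Object> error = new LinkedHashMap<>();
        error.put("error", e.getMessage());
        marshal(out, error); // appended after the partial bytes, but behind a reset tag
      }
      return String.valueOf(unmarshal(wire.toByteArray()));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(false)); // {status=ok, docs=d1}
    System.out.println(roundTrip(true));  // {error=simulated failure}
  }
}
```

One limitation of this sketch: the reader only notices the tag where it expects the start of a value. If the partial write had stopped in the middle of a length-prefixed payload (for example halfway through a string), the reset byte would be consumed as payload, so a real design would likely need something more robust than a single tag byte.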
Original bug report
I haven't dug into this, or been able to reproduce it (let alone get any additional logging/debugging/breakpoint info), but in a few very sporadic, very rare instances of running TestReplicationHandlerDiskOverFlow, I triggered ClassCastExceptions in the JavaBinCodec unmarshal code, indicating that there is some disconnect in expectations between the marshal & unmarshal code paths...
[junit4] 2> 13342 ERROR (Thread-19) [ ] o.a.s.h.TestReplicationHandlerDiskOverFlow Query Thread Failure
[junit4] 2> => org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622)
[junit4] 2> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:987) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1002) ~[java/:?]
[junit4] 2> at org.apache.solr.handler.TestReplicationHandlerDiskOverFlow.lambda$testDiskOverFlow$1(TestReplicationHandlerDiskOverFlow.java:154) ~[test/:?]
[junit4] 2> at java.lang.Thread.run(Thread.java:834) [?:?]
[junit4] 2> Caused by: java.lang.ClassCastException: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
[junit4] 2> at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:223) ~[java/:?]
[junit4] 2> at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298) ~[java/:?]
[junit4] 2> at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:280) ~[java/:?]
[junit4] 2> at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:193) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50) ~[java/:?]
[junit4] 2> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:620) ~[java/:?]
[junit4] 2> ... 7 more
It's possible that something about the test code (or the code being tested) is doing something it should not be doing, and adding something "unexpected" to the response object. Either way, the unmarshal code needs to be as forgiving as the marshal code, or the marshal code should fail fast rather than produce a binary stream that the unmarshal code can't parse.
Attachments
Issue Links
- causes
  - SOLR-13470 SolrException msg not always propogated to HttpClient if exception occurs during response writting (Open)
- is blocked by
  - SOLR-13539 Atomic Update Multivalue remove does not work for field types UUID, Enums, Bool and Binary (Resolved)
  - SOLR-13331 Atomic Update Multivalue remove does not work (Closed)