Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 8.1
- None
- None
Description
Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?"
In summary, after moving to 8.3 we had a consistent (but non-deterministic) set of failing tests where the data being sent in intranode requests was sometimes corrupted. For example, if the well-formed data was
'fieldName':"this is a long string"
then the error we saw from Solr might be
unknown field 'fieldNamis a long string"
The change that indirectly caused this issue to materialize was SOLR-13682, after which org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) calls org.apache.solr.common.SolrInputField.getValue() rather than org.apache.solr.common.SolrInputField.getRawValue() as it had before.
getValue() for a string calls org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(), which in this context calls
org.apache.solr.common.util.JavaBinCodec.getStringProvider()
JavaBinCodec has a CharArr, arr, which is modified in two different locations, only one of which is protected by a synchronized block.
getStringProvider() synchronizes on arr:
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966
but _readStr() doesn't:
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930
The two methods are now called concurrently, but weren't prior to SOLR-13682.
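To illustrate the kind of interleaving this allows, here is a minimal single-threaded sketch that replays one possible ordering by hand. SharedBufferDemo, scratch, and the chosen split point are hypothetical and are not taken from JavaBinCodec; the real corruption depends on thread timing and will look different from run to run.

```java
public class SharedBufferDemo {
    // Hypothetical stand-in for the shared CharArr scratch buffer.
    static final StringBuilder scratch = new StringBuilder();

    // Replays by hand one interleaving of two readers that share the buffer.
    static String interleavedRead(String first, String second, int split) {
        // Reader A resets the buffer and copies the first part of its string.
        scratch.setLength(0);
        scratch.append(first, 0, split);
        // Reader B runs in the middle: it resets the buffer and writes its own string.
        scratch.setLength(0);
        scratch.append(second);
        // Reader A resumes, appends the rest, and builds its result from the buffer.
        scratch.append(first, split, first.length());
        return scratch.toString();
    }

    public static void main(String[] args) {
        // Prints "this is a long stringName": reader A's result now contains reader B's data.
        System.out.println(interleavedRead("fieldName", "this is a long string", 5));
    }
}
```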
Adding a synchronized block into _readStr() around the modification of arr fixes the problem as far as I can see.
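The idea behind that fix can be sketched as follows. This is not the actual JavaBinCodec patch: GuardedBufferSketch, readStr, provideStr, and countCorrupted are hypothetical names, and a StringBuilder stands in for the CharArr. The only point is that every path that modifies the shared buffer takes the same lock.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedBufferSketch {
    // Hypothetical stand-in for the shared scratch buffer.
    private final StringBuilder scratch = new StringBuilder();

    // First path that modifies the buffer; guarded by the buffer's monitor.
    String readStr(char[] raw) {
        synchronized (scratch) {
            scratch.setLength(0);
            for (char c : raw) scratch.append(c);
            return scratch.toString();
        }
    }

    // Second path that modifies the buffer; guarded by the same monitor.
    String provideStr(CharSequence raw) {
        synchronized (scratch) {
            scratch.setLength(0);
            scratch.append(raw);
            return scratch.toString();
        }
    }

    // Runs both paths concurrently and counts results that differ from the input.
    static int countCorrupted(int iterations) {
        GuardedBufferSketch codec = new GuardedBufferSketch();
        AtomicInteger bad = new AtomicInteger();
        String a = "fieldName";
        String b = "this is a long string";
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (!a.equals(codec.readStr(a.toCharArray()))) bad.incrementAndGet();
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (!b.equals(codec.provideStr(b))) bad.incrementAndGet();
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return bad.get();
    }

    public static void main(String[] args) {
        System.out.println("corrupted results: " + countCorrupted(100_000));
    }
}
```

With both methods locking on the same object, each reset-append-toString sequence runs as a unit, so neither thread can observe the other's partial writes. Removing either synchronized block in this sketch reintroduces the possibility of mixed results.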
Also, the problem does not seem to occur when using the dynamic schema mode of autoCreateFields=true in the updateRequestProcessorChain.
Attachments
Issue Links
- is caused by
- SOLR-13171 Make it possible to stream query result without creating java objects
- Closed