  1. Solr
  2. SOLR-13963

JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 8.1
    • Fix Version/s: 8.3.1
    • Component/s: None
    • Labels:
      None

      Description

      Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?"

       

      In summary, after moving to 8.3 we had a consistent (but non-deterministic) set of failing tests in which the data sent in intranode requests was sometimes corrupted. For example, if the well-formed data was
      'fieldName':"this is a long string"
      the error we saw from Solr might be
      unknown field 'fieldNamis a long string"
       
      The change that indirectly caused this issue to materialize came from SOLR-13682, after which org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) calls org.apache.solr.common.SolrInputField.getValue() rather than org.apache.solr.common.SolrInputField.getRawValue(), as it had before.
       
      getValue() for a string calls org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(), which in this context calls
      org.apache.solr.common.util.JavaBinCodec.getStringProvider()
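
      To illustrate why switching accessors matters, here is a minimal sketch of a lazily decoded value. It is not Solr's code: LazyDecodeSketch, LazyUtf8, Field and the provider function are hypothetical stand-ins, and the sketch only assumes that the raw accessor returns the undecoded bytes while the converting accessor asks the codec-supplied provider for a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

final class LazyDecodeSketch {
    /** Hypothetical stand-in for a UTF-8 byte sequence that is decoded only on demand. */
    static final class LazyUtf8 {
        private final byte[] bytes;
        // Assumed to be supplied by the codec that read the bytes.
        private final Function<byte[], String> stringProvider;
        private String cached;

        LazyUtf8(byte[] bytes, Function<byte[], String> stringProvider) {
            this.bytes = bytes;
            this.stringProvider = stringProvider;
        }

        /** Decodes on first use by calling back into the provider. */
        String getStr() {
            if (cached == null) {
                cached = stringProvider.apply(bytes);
            }
            return cached;
        }
    }

    /** Hypothetical stand-in for an input field with a raw and a converting accessor. */
    static final class Field {
        private final Object value;

        Field(Object value) {
            this.value = value;
        }

        /** Returns the stored object untouched; the provider is never called. */
        Object getRawValue() {
            return value;
        }

        /** Converts lazy sequences to String, which calls the provider. */
        Object getValue() {
            return (value instanceof LazyUtf8) ? ((LazyUtf8) value).getStr() : value;
        }
    }

    public static void main(String[] args) {
        int[] providerCalls = {0};
        Function<byte[], String> provider = b -> {
            providerCalls[0]++;
            return new String(b, StandardCharsets.UTF_8);
        };
        byte[] raw = "this is a long string".getBytes(StandardCharsets.UTF_8);
        Field field = new Field(new LazyUtf8(raw, provider));

        field.getRawValue();
        System.out.println("provider calls after getRawValue(): " + providerCalls[0]);
        field.getValue();
        System.out.println("provider calls after getValue(): " + providerCalls[0]);
    }
}
```

      In this sketch only the converting accessor reaches the provider, so a caller that switches from the raw accessor to the converting one starts exercising the codec's shared state on a code path that previously never touched it.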

       
      JavaBinCodec has a CharArr, arr, which is modified in two different locations, only one of which is protected by a synchronized block.
       
      getStringProvider() synchronizes on arr:
      https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966
       
      but _readStr() doesn't:
      https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930
       
      The two methods are now called concurrently, but weren't prior to SOLR-13682.
       
      Adding a synchronized block to _readStr() around the modification of arr fixes the problem as far as I can see.
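
      The race and the proposed fix can be sketched in isolation. The code below is not JavaBinCodec: SharedBufferSketch, Codec, provideString and readStr are hypothetical names, and a StringBuilder stands in for the CharArr. It only assumes what is described above, namely one shared scratch buffer that is written from two methods, with one of them locking on the buffer and the other locking only when the fix is enabled.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class SharedBufferSketch {
    /** Hypothetical codec that reuses one scratch buffer for every decode. */
    static final class Codec {
        private final StringBuilder arr = new StringBuilder(); // stand-in for the shared CharArr
        private final boolean lockInReadStr;

        Codec(boolean lockInReadStr) {
            this.lockInReadStr = lockInReadStr;
        }

        /** Always locks on arr, as getStringProvider() is described as doing. */
        String provideString(char[] src) {
            synchronized (arr) {
                return decode(src);
            }
        }

        /** Locks on arr only when lockInReadStr is true, which models the proposed fix. */
        String readStr(char[] src) {
            if (!lockInReadStr) {
                return decode(src);
            }
            synchronized (arr) {
                return decode(src);
            }
        }

        /** Resets and refills the shared buffer; unsafe unless every caller holds the lock. */
        private String decode(char[] src) {
            arr.setLength(0);
            for (char c : src) {
                arr.append(c);
            }
            return arr.toString();
        }
    }

    /** Calls both methods from two threads and counts results that differ from the input. */
    static int countCorrupted(Codec codec, int iterations) {
        AtomicInteger corrupted = new AtomicInteger();
        char[] a = "fieldName".toCharArray();
        char[] b = "this is a long string".toCharArray();
        Thread t1 = new Thread(() -> run(iterations, () -> codec.provideString(a), "fieldName", corrupted));
        Thread t2 = new Thread(() -> run(iterations, () -> codec.readStr(b), "this is a long string", corrupted));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return corrupted.get();
    }

    private static void run(int n, Supplier<String> call, String expected, AtomicInteger corrupted) {
        for (int i = 0; i < n; i++) {
            try {
                if (!expected.equals(call.get())) {
                    corrupted.incrementAndGet();
                }
            } catch (RuntimeException e) {
                // Unsynchronized access to the shared buffer may also throw; count it as corruption.
                corrupted.incrementAndGet();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("corrupted without lock in readStr: " + countCorrupted(new Codec(false), 100_000));
        System.out.println("corrupted with lock in readStr:    " + countCorrupted(new Codec(true), 100_000));
    }
}
```

      With the lock in both methods the count is always zero. Without it the count varies from run to run and may occasionally be zero, which matches the non-deterministic failures described above; corrupted results tend to be a mix of the two inputs, much like 'fieldNamis a long string".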

       

      Also, the problem does not seem to occur when using the dynamic schema mode of autoCreateFields=true in the updateRequestProcessorChain.

        Attachments

        1. SOLR-13963.patch
          34 kB
          Colvin Cowie
        2. JavaBinCodec.java
          39 kB
          Colvin Cowie
        3. SOLR-13963.patch
          34 kB
          Noble Paul

              People

              • Assignee: Noble Paul (noble.paul)
              • Reporter: Colvin Cowie (cjcowie)
              • Votes: 0
              • Watchers: 11
