SOLR-5407

Strange error condition with cloud replication not working quite right



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Abandoned
    • Affects Version/s: 4.5
    • Fix Version/s: None
    • Component/s: None


      I have a cloud deployment of 4.5 on EC2. The architecture is 3 dedicated ZK nodes and a pair of Solr nodes. I'll apologize in advance that this error report is not going to have a lot of detail; I'm really hoping that the scenario/description will trigger some "likely" possible explanation.

      The situation I got into was that the server had decided to fail over, so my app servers were all talking to what should have been the primary for most of the shards/collections, but was actually the replica.

      Here's where it gets odd: no errors were being returned to the client code for any of the searches or document updates, and the current primary server was definitely receiving all of the updates, even though they were being submitted to the inactive/replica node. (Clients were talking to solr-p1, which was not primary at the time, and writes were being passed through to solr-r1, which was primary at the time.)

      All sounds good so far, right? Except that the replica server at the time (solr-p1), through which the writes were passing, never got any of those content updates. It had an old, unmodified copy of the index.

      I restarted solr-p1 (the replica at the time), with no change in behavior. Behavior did not change until I killed and restarted the current primary (solr-r1) to force it to fail over.

      At that point, everything was all happy again and working properly.

      Until this morning, when one of the developers provisioned a new collection, which happened to put its primary on solr-r1. Again, the clients were all pointing at solr-p1. The developer reported that the documents were going into the index but were not visible on the replica server.
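
One way to confirm the divergence described above might be to ask each node for its own local document count, bypassing distributed search with Solr's `distrib=false` request parameter, and then compare the numbers. The following is only a minimal sketch: the host names solr-p1 and solr-r1 come from this report, but the port, the `/solr` context path, and the core name `collection1_shard1_replica1` are assumptions that would need to match the actual deployment. Counts can also differ briefly between commits, so a single mismatch is a hint rather than proof.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(base_url, core):
    """Build a URL that counts documents on one core only (no distributed fan-out)."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def parse_num_found(body):
    """Extract numFound from a Solr JSON select response."""
    return json.loads(body)["response"]["numFound"]


def fetch_count(base_url, core, timeout=10):
    """Query a single node and return its local document count."""
    with urlopen(count_url(base_url, core), timeout=timeout) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


def find_mismatches(counts):
    """Return the nodes whose count is lower than the highest count seen."""
    if not counts:
        return {}
    expected = max(counts.values())
    return {node: n for node, n in counts.items() if n != expected}


if __name__ == "__main__":
    # Hypothetical port, context path, and core name; adjust to the real deployment.
    nodes = {
        "solr-p1": "http://solr-p1:8983/solr",
        "solr-r1": "http://solr-r1:8983/solr",
    }
    core = "collection1_shard1_replica1"
    counts = {name: fetch_count(url, core) for name, url in nodes.items()}
    print("local counts:", counts)
    print("lagging nodes:", find_mismatches(counts))
```

Taking the highest count as the reference is a simplification that only makes sense for an add-only workload like the one described here; with deletes in play, comparing index versions would be a better signal.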




            Assignee: Unassigned
            Reporter: Nathan Neulinger (nneul)