Solr / SOLR-4744

Version conflict error during shard split test

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.3
    • Fix Version/s: 4.3.1, 4.4
    • Component/s: SolrCloud
    • Labels: None

    Description

      ShardSplitTest fails sometimes with the following error:

      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state invoked for collection: collection1
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 to inactive
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_0 to active
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_1 to active
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.873; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={wt=javabin&version=2} {add=[169 (1432319507166134272)]} 0 2
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.884; org.apache.solr.update.processor.LogUpdateProcessor; [collection1_shard1_1_replica1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {} 0 1
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.885; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {add=[169 (1432319507173474304)]} 0 2
      [junit4:junit4]   1> ERROR - 2013-04-14 19:05:26.885; org.apache.solr.common.SolrException; shard update error StdNode: http://127.0.0.1:41028/collection1_shard1_1_replica1/:org.apache.solr.common.SolrException: version conflict for 169 expected=1432319507173474304 actual=-1
      [junit4:junit4]   1> 	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
      [junit4:junit4]   1> 	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
      [junit4:junit4]   1> 	at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
      [junit4:junit4]   1> 	at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      [junit4:junit4]   1> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      [junit4:junit4]   1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
      [junit4:junit4]   1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      [junit4:junit4]   1> 	at java.lang.Thread.run(Thread.java:679)
      [junit4:junit4]   1> 
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.886; org.apache.solr.update.processor.DistributedUpdateProcessor; try and ask http://127.0.0.1:41028 to recover
      

      The failure is hard to reproduce and very timing-sensitive. Failures of this kind have always been seen right after the "updateshardstate" action.
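To make the error message in the log easier to read, here is a minimal sketch of an optimistic-concurrency style version check. It is a simplified model, not Solr's actual code: the class `VersionCheckSketch`, its methods, and the `NOT_FOUND` sentinel are all hypothetical names, and the reading that `actual=-1` means "the receiving core holds no version for this id" is an assumption. Under that assumption, an update that carries a positive expected version and arrives at a core that has not indexed the document yet would yield a message shaped like the one above.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of a version check on a replica; NOT Solr's implementation. */
final class VersionCheckSketch {
    /** Assumed sentinel reported when the core holds no version for the id. */
    static final long NOT_FOUND = -1L;

    /** Hypothetical per-core store: document id -> last indexed version. */
    private final Map<String, Long> versions = new HashMap<>();

    /** Records that the document with this id was indexed at the given version. */
    void index(String id, long version) {
        versions.put(id, version);
    }

    /**
     * Returns null when the update passes the check, otherwise a conflict
     * message in the same shape as the one in the log.
     */
    String check(String id, long expected) {
        long actual = versions.getOrDefault(id, NOT_FOUND);
        // A positive expected version must match the stored version exactly.
        if (expected > 0 && actual != expected) {
            return "version conflict for " + id
                    + " expected=" + expected + " actual=" + actual;
        }
        return null;
    }

    public static void main(String[] args) {
        // A core that has never seen document 169, as assumed for the sub-shard.
        VersionCheckSketch subShardReplica = new VersionCheckSketch();
        System.out.println(subShardReplica.check("169", 1432319507173474304L));
    }
}
```

In this sketch the conflict disappears once `index("169", 1432319507173474304L)` has been called before the check, which is one way to picture why the failure depends on timing.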

      Attachments

        1. SOLR-4744__no_more_NPE.patch
          1 kB
          Chris M. Hostetter
        2. SOLR-4744.patch
          19 kB
          Shalin Shekhar Mangar
        3. SOLR-4744.patch
          14 kB
          Shalin Shekhar Mangar

            People

              Assignee: Shalin Shekhar Mangar
              Reporter: Shalin Shekhar Mangar