Solr / SOLR-4744

Version conflict error during shard split test


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.3
    • Fix Version/s: 4.3.1, 4.4
    • Component/s: SolrCloud
    • Labels: None

    Description

      ShardSplitTest fails sometimes with the following error:

      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state invoked for collection: collection1
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1 to inactive
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_0 to active
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.861; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update shard state shard1_1 to active
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.873; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={wt=javabin&version=2} {add=[169 (1432319507166134272)]} 0 2
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.877; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.884; org.apache.solr.update.processor.LogUpdateProcessor; [collection1_shard1_1_replica1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {} 0 1
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.885; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp= path=/update params={distrib.from=http://127.0.0.1:41028/collection1/&update.distrib=FROMLEADER&wt=javabin&distrib.from.parent=shard1&version=2} {add=[169 (1432319507173474304)]} 0 2
      [junit4:junit4]   1> ERROR - 2013-04-14 19:05:26.885; org.apache.solr.common.SolrException; shard update error StdNode: http://127.0.0.1:41028/collection1_shard1_1_replica1/:org.apache.solr.common.SolrException: version conflict for 169 expected=1432319507173474304 actual=-1
      [junit4:junit4]   1> 	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:404)
      [junit4:junit4]   1> 	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
      [junit4:junit4]   1> 	at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
      [junit4:junit4]   1> 	at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      [junit4:junit4]   1> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      [junit4:junit4]   1> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      [junit4:junit4]   1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
      [junit4:junit4]   1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      [junit4:junit4]   1> 	at java.lang.Thread.run(Thread.java:679)
      [junit4:junit4]   1> 
      [junit4:junit4]   1> INFO  - 2013-04-14 19:05:26.886; org.apache.solr.update.processor.DistributedUpdateProcessor; try and ask http://127.0.0.1:41028 to recover
      

      The failure is hard to reproduce and very timing-sensitive. Failures of this kind have always been seen right after the "updateshardstate" action.
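
For readers unfamiliar with the error text, the following is a minimal sketch of the kind of version check that produces a message like "version conflict for 169 expected=... actual=-1". It models Solr's documented optimistic concurrency rules for `_version_`, not the actual DistributedUpdateProcessor code. The class name, the in-memory map and the method names are hypothetical, and it is an assumption that "actual=-1" means the receiving core found no version for the document.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of a replica's version check; not Solr code. */
public class VersionCheckSketch {
    // Hypothetical stand-in for a core's "document id -> latest version" lookup.
    private final Map<String, Long> versions = new HashMap<>();

    /** Returns the stored version, or -1 when the document is unknown to this core. */
    long foundVersion(String id) {
        Long v = versions.get(id);
        return v == null ? -1L : v;
    }

    /** Documented _version_ semantics: 0 = no check, 1 = must exist, <0 = must not exist, >1 = exact match. */
    static boolean matches(long expected, long found) {
        if (expected == 0) return true;
        if (expected == 1) return found > 0;
        if (expected < 0) return found < 0;
        return expected == found;
    }

    /** Applies an add only if the expected version is compatible with what this core has. */
    void add(String id, long expectedVersion, long newVersion) {
        long found = foundVersion(id);
        if (!matches(expectedVersion, found)) {
            throw new IllegalStateException(
                "version conflict for " + id + " expected=" + expectedVersion + " actual=" + found);
        }
        versions.put(id, newVersion);
    }

    public static void main(String[] args) {
        VersionCheckSketch subShardReplica = new VersionCheckSketch();
        try {
            // The core has never seen document 169, yet the update expects an exact version.
            subShardReplica.add("169", 1432319507173474304L, 1432319507173474304L);
        } catch (IllegalStateException e) {
            // prints: version conflict for 169 expected=1432319507173474304 actual=-1
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, the error in the log would mean that the sub-shard core received an update carrying an exact expected version for a document it did not yet have. Why that happens right after "updateshardstate" is not explained by the sketch.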

      Attachments

        1. SOLR-4744__no_more_NPE.patch
          1 kB
          Chris M. Hostetter
        2. SOLR-4744.patch
          19 kB
          Shalin Shekhar Mangar
        3. SOLR-4744.patch
          14 kB
          Shalin Shekhar Mangar


          People

            Assignee: Shalin Shekhar Mangar
            Reporter: Shalin Shekhar Mangar
            Votes: 0
            Watchers: 4
