SOLR-10233: Add support for different replica types in Solr

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0
    • Component/s: SolrCloud
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels: None

      Description

      For the majority of use cases, SolrCloud's current distributed indexing works great. There is a subset of use cases, however, for which the legacy Master/Slave replication may be a better fit:

      • Don’t require NRT
      • LIR (Leader-Initiated Recovery) can become an issue; read availability is preferred over consistency or NRT
      • High number of searches (requiring many search nodes)

      SOLR-9835 is adding replicas that don’t do any indexing and only update their transaction log. This Jira extends that idea to provide the following replica types:

      • Realtime: Writes updates to the transaction log and indexes them locally. Replicas of type “realtime” support NRT (soft commits) and RTG (real-time get). Any realtime replica can become a leader. This is the only type supported in SolrCloud at this time and will be the default.
      • Append: Writes to the transaction log, but not to the index, and uses replication to keep its index up to date. Any append replica can become leader (by first applying all of its local transaction log). If a replica of type append is also the leader, it behaves as a realtime replica. This is exactly what SOLR-9835 is proposing (non-live replicas).
      • Passive: Doesn’t index or write to the transaction log; it just replicates the index from realtime or append replicas. Passive replicas can’t become shard leaders (i.e., if at some point there are only passive replicas in the collection, updates will fail the same as if there were no leader, while queries continue to work), so they don’t even participate in leader elections.

      When the leader replica of the shard receives an update, it will distribute it to all realtime and append replicas, the same as it does today. It won't distribute to passive replicas.

      By using a combination of append and passive replicas, one can achieve an equivalent of the legacy Master/Slave architecture in SolrCloud mode with most of its benefits, including high availability of writes.

      API (v1 style)

      /admin/collections?action=CREATE…&realtimeReplicas=X&appendReplicas=Y&passiveReplicas=Z
      /admin/collections?action=ADDREPLICA…&type=[realtime/append/passive]

      • “replicationFactor=” will translate to “realtimeReplicas=” for back-compatibility
      • if passiveReplicas > 0, appendReplicas or realtimeReplicas must be >= 1 (a shard can’t consist of only passive replicas)
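
      For example, a hypothetical pair of calls (host, collection name, and replica counts are illustrative only; as discussed in the comments below, these types and parameters were later renamed to NRT/TLOG/PULL and nrtReplicas/tlogReplicas/pullReplicas before the 7.0 release):

      /admin/collections?action=CREATE&name=products&numShards=2&realtimeReplicas=1&appendReplicas=1&passiveReplicas=2
      /admin/collections?action=ADDREPLICA&collection=products&shard=shard1&type=passive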

      Placement Strategies

      By using replica placement rules, one should be able to dedicate nodes to search-only and write-only workloads. For example:

      shard:*,replica:*,type:passive,fleet:slaves
      

      where “type” is a new condition supported by the rule engine, and “fleet:slaves” is a regular tag. Note that rules are only applied when replicas are created, so a later change in tags won't affect existing replicas. Also, rules are per collection, so each collection can define its own rules.
      Note that on the server side Solr also needs to know how to distribute shard requests (maybe in ShardHandler?) if we want to hit only a subset of replicas (e.g. passive replicas only, or similar rules).
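
      As a sketch, such a rule could be passed at collection-creation time through the existing “rule” parameter (the “type” condition is the new piece proposed here, and the “fleet” tag is assumed to be provided by a custom snitch):

      /admin/collections?action=CREATE&name=products&numShards=2&realtimeReplicas=1&passiveReplicas=2&rule=shard:*,replica:*,type:passive,fleet:slaves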

      SolrJ

      The SolrCloud client could be smart enough to prefer passive replicas for search requests when available (and if configured to do so). Passive replicas can’t respond to RTG requests, so those should go to realtime replicas.
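
      A minimal SolrJ sketch of what such client-side logic might look like (not part of this patch; it assumes the post-rename Replica.Type enum names NRT/TLOG/PULL that were ultimately committed, and the ZooKeeper address and collection name are placeholders):

      import java.util.List;
      import java.util.stream.Collectors;
      import org.apache.solr.client.solrj.impl.CloudSolrClient;
      import org.apache.solr.common.cloud.DocCollection;
      import org.apache.solr.common.cloud.Replica;

      public class PreferPassiveReplicas {
        public static void main(String[] args) throws Exception {
          try (CloudSolrClient client = new CloudSolrClient.Builder()
              .withZkHost("localhost:9983").build()) {
            client.connect();
            DocCollection coll = client.getZkStateReader()
                .getClusterState().getCollection("gettingstarted");
            // Gather the core URLs of all active passive (PULL) replicas; a smart
            // client could round-robin search requests across these, falling back
            // to realtime (NRT) replicas when none are available.
            List<String> passiveUrls = coll.getReplicas().stream()
                .filter(r -> r.getType() == Replica.Type.PULL)
                .filter(r -> r.getState() == Replica.State.ACTIVE)
                .map(Replica::getCoreUrl)
                .collect(Collectors.toList());
            System.out.println("Passive replicas to query: " + passiveUrls);
          }
        }
      }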

      Cluster/Collection state

      {"gettingstarted":{
        "replicationFactor":"1",
        "router":{"name":"compositeId"},
        "maxShardsPerNode":"2",
        "autoAddReplicas":"false",
        "shards":{
          "shard1":{
            "range":"80000000-ffffffff",
            "state":"active",
            "replicas":{
              "core_node5":{
                "core":"gettingstarted_shard1_replica1",
                "base_url":"http://127.0.0.1:8983/solr",
                "node_name":"127.0.0.1:8983_solr",
                "state":"active",
                "leader":"true",
                **"type": "realtime"**},
              "core_node10":{
                "core":"gettingstarted_shard1_replica2",
                "base_url":"http://127.0.0.1:7574/solr",
                "node_name":"127.0.0.1:7574_solr",
                "state":"active",
                **"type": "passive"**}},
            }},
          "shard2":{
            ...
      

      Back compatibility

      We should be able to support back compatibility by assuming replicas without a “type” property are realtime replicas.
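
      A minimal sketch of that defaulting logic (illustrative only, not the actual patch code):

      import java.util.Locale;
      import java.util.Map;

      class ReplicaTypeCompat {
        // Replicas created before this change carry no "type" property in the
        // cluster state; treat them as realtime so existing collections keep working.
        static String replicaType(Map<String, Object> replicaProps) {
          String type = (String) replicaProps.get("type");
          return type == null ? "realtime" : type.toLowerCase(Locale.ROOT);
        }
      }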

      Failure Scenarios for passive replicas

      Replica-Leader partition

      In SolrCloud today, the replica would be placed in LIR in this scenario. With passive replicas, a replica may not be able to replicate for some time (and fall behind with the index), but queries can still be served. Once the connection is re-established, replication will continue.

      Replica ZooKeeper partition

      The passive replica will leave the cluster. “Smart clients” and other replicas (e.g. for distributed search) won’t find it and won’t query it. Direct search requests to the replica may still succeed.

      Passive replica dies (or is unreachable)

      The replica won’t be queryable. On restart, the replica will recover from the leader, following the same flow as realtime replicas: set state to DOWN, then RECOVERING, and finally ACTIVE. Passive replicas will use a different RecoveryStrategy implementation that omits the prepare-recovery step and the peer sync attempt, and jumps straight to replication. If the leader didn’t change, or if the other replicas are of type “append”, replication should be incremental. Once the first replication is done, the passive replica will declare itself active and start serving traffic.
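
      Hedged pseudocode of that recovery path (the method names are illustrative stubs, not the actual RecoveryStrategy API):

      class PassiveReplicaRecovery {
        // Illustrative stubs: in Solr these correspond to publishing replica state
        // to ZooKeeper and to pulling index files via the ReplicationHandler.
        void publishState(String state) { System.out.println("state -> " + state); }
        void replicateFromLeader()      { System.out.println("replicating index"); }

        // Same DOWN -> RECOVERING -> ACTIVE lifecycle as realtime replicas, but
        // with no prepare-recovery request and no PeerSync attempt.
        void recover() {
          publishState("down");
          publishState("recovering");
          replicateFromLeader();   // ideally an incremental replication
          publishState("active");  // start serving query traffic again
        }
      }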

      Leader dies

      Passive replicas won’t be able to replicate. The cluster won’t take updates until a new leader is elected; once one is, updates will be back to normal. Passive replicas will remain active and continue serving query traffic during the “write outage”. Once the new leader is elected, replication will restart (possibly from a different node).

      Leader ZooKeeper partition

      Same as today. Leader will abandon leadership and a new replica will be elected as leader.

      Q&A

      Can I use a combination of passive + realtime?

      You could. The problem is that, since realtime replicas generate their own index, any change of leadership could trigger a full replication on all the passive replicas. The biggest benefit of append replicas is that they share the same index files, which means that even if the leader changes, the number of segments to replicate will remain low. For that reason, using append replicas is recommended when using passive replicas.

      Can I use passive + append + realtime?

      The issue with mixing realtime replicas with append replicas is that if a different realtime replica becomes the leader, the whole purpose of using append replicas is defeated, since they will all have to replicate the full index.

      What happens if replication from passives fail?

      TBD: In general we want those replicas to continue serving search traffic, but we may want a way to say “if you can’t replicate after X hours, put yourself in recovery” or something similar.
      Varun Thacker suggested that we include in the response the time since the last successful replication, so the client can choose what to do with the results (in a multi-shard request, this date would be the oldest across all shards).

      Do passive replicas need to replicate from the leader only?

      This is not necessary. Passive replicas can replicate from any realtime or append replica, although this would add some extra waiting time for the latest updates. Replicating from a realtime replica may not be a good idea, though; see the question “Can I use a combination of passive + realtime?”

      What if I need NRT?

      Then you can’t query append or passive replicas; you should use only realtime replicas.

      Will new passive replicas start receiving traffic immediately after being added?

      Passive replicas will have the same states as realtime/append replicas: they’ll join the cluster as “DOWN” and be moved to “RECOVERING” until they can replicate from the leader. Then they’ll go through the replication process and become “ACTIVE”, at which point they’ll start responding to queries. They’ll use a different RecoveryStrategy that skips peer sync and document buffering, and just replicates.

      What if a passive replica receives an update?

      This will work the same as with non-leader replicas today: the replica will just forward the update to the correct leader.

      What is the difference between using append + passive and legacy master/slave?

      These are just some I can think of:

      • You now need ZooKeeper to run in SolrCloud mode
      • High availability for writes, as long as you have more than one append replica
      • Shard management by Solr at index time and query time.
      • Full support for Collections and Collections API
      • SolrCloudClient support

      I'd like to get some thoughts on this proposal.

      1. 11431.consoleText.txt
        836 kB
        Steve Rowe
      2. SOLR-10233.patch
        200 kB
        Tomás Fernández Löbbe
      3. SOLR-10233.patch
        209 kB
        Tomás Fernández Löbbe
      4. SOLR-10233.patch
        204 kB
        Tomás Fernández Löbbe
      5. SOLR-10233.patch
        123 kB
        Tomás Fernández Löbbe
      6. SOLR-10233.patch
        123 kB
        Tomás Fernández Löbbe


          Activity

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Here is an initial patch that adds the Type enum to Replica and some handling of passive replicas. It relies on code from SOLR-9835 (an older patch; I'll update that next). Also full of nocommits.

          noble.paul Noble Paul added a comment -

          the parameters are not explicit when you create the collection realtime=X&append=Y&passive=Z

          let's make it realtimeReplicas=X&appendReplicas=Y&passiveReplicas=Z

          tomasflobbe Tomás Fernández Löbbe added a comment -

          let's make it realtimeReplicas=X&appendReplicas=Y&passiveReplicas=Z

          SGTM. I'll change the names

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Here is the same patch, but updated to master and the latest changes of SOLR-9835 (as of yesterday).
          Cao Manh Dat, not sure what you want to do with SOLR-9835, since it seems to be ready. I personally think that since this is changing the API completely I'd hold on committing, but I'll leave that decision to you (master may be fine).
          I'll create a branch for this feature too, to allow collaboration.
          In this patch, TestPassiveReplica passes consistently, but I didn't check other tests; there were lots of conflicts, since this patch and SOLR-9835 both use "REALTIME_REPLICAS" in different ways.
          I'll spend some time now to make the work of SOLR-9835 use the API defined here.

          caomanhdat Cao Manh Dat added a comment -

          Tomás Fernández Löbbe Sounds good to me. I'm planning to do more work on the test (it can take one or two days) before committing it to master.

          janhoy Jan Høydahl added a comment -

          Should we use another word than "passive"? When talking about replicas, we already have the active/inactive state, and inactive~=passive so may cause confusion? Also, should we reserve the word "realtime" to a future version where we might introduce a proper realtime feature? An alternative terminology could perhaps be: type=push/sync/pull? Just an input, feel free to discard.

          How do we control the poll interval for passive/pull replicas? A collection/replica property? Or must be separately set on replicationHandler?

          How to convert a replica from a type to another? Should that be possible through MODIFYCOLLECTION?

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Thanks for your comments Jan Høydahl

          Should we use another word than "passive"? When talking about replicas, we already have the active/inactive state, and inactive~=passive so may cause confusion?

          Naming has been difficult for this issue. For passive, other options considered were "read-only replica" or "slave replica". I originally considered also calling "active replica" instead of "append", but that definitely would cause confusion.

          Also, should we reserve the word "realtime" to a future version where we might introduce a proper realtime feature?

          The thinking was that the replication+indexing is happening in "realtime" (before responding). It can definitely be confusing since the search is not really realtime as you said.

          An alternative terminology could perhaps be: type=push/sync/pull?

          Not sure I like those much. "Push" would be misleading since only the leader pushes. I'd like the name to reflect what the replica is going to do with the documents. Maybe "indexer" or "writer" instead of "realtime"? "realtime" was also suggested in SOLR-9835.

          How do we control the poll interval for passive/pull replicas? A collection/replica property? Or must be separately set on replicationHandler?

          At this point I'm just using what was done for SOLR-9835, which sets the interval to autoCommitMaxTime/2 or 3 seconds. Later we can have a configuration in solrconfig.xml (note that the same as SOLR-9835, this code uses a new instance of ReplicationHandler)

          How to convert a replica from a type to another? Should that be possible through MODIFYCOLLECTION?

          In this first iteration this would not be supported for simplicity. In the future I think this could be done via API (would require a core reload for sure), although I would say it should be something that targets the replica, not the collection? Also, note my comment in the patch:

          -        String coreName = collectionName + "_" + position.shard + "_replica" + (position.index + 1);
          +        // TODO: Adding the suffix is great for debugging, but may be an issue if at some point we want to support a way to change replica type
          +        String coreName = collectionName + "_" + position.shard + "_replica" + position.suffix + (position.index + 1);
          

          I can revert that, but for debugging purposes this is actually very good.

          janhoy Jan Høydahl added a comment -

          Thanks for commenting. Perhaps "passive" could be "copy", indicating that it is a copy of the index, not a locally produced one, while "append" could be "hotcopy", indicating that it is still a copy but with the ability to become leader (hot). Well, I'll stop nit-picking for now. Solid work, man!

          tomasflobbe Tomás Fernández Löbbe added a comment -

          I now have some time to work on this feature again. Here is a new patch:

          • Changes the API of "onlyLeaderIndexes" to the proposed "num*Replicas" style.
          • Added some more tests for passive replicas.
          • Added some tests for "Append" replicas.
          • Updated to current master.

          Some tests are failing (after the change onlyLeaderIndexes -> append replicas); I'm going to work on fixing those next. Naming is still open to change.
          tomasflobbe Tomás Fernández Löbbe added a comment -

          This patch fixes the tests. Biggest TODOs:

          • Cleanup RecoveryStrategy code duplication
          • Passive replicas should not have a transaction log
          • Remove the top level "realtimeReplicas" from the cluster state, no longer needed since now each replica has a type
          • More tests

          Later (maybe separate Jira)

          • Client side changes
          • Replica placement rules update
          tomasflobbe Tomás Fernández Löbbe added a comment -

          After looking at the discussion and changes in SOLR-9045, I decided to remove the ReplicateOnlyRecoveryStrategy, at least for now, and just add a new method to RecoveryStrategy that takes a different recovery path (just replication). I also made RecoveryStrategy implement Runnable and Closeable instead of extending Thread; it doesn't seem to be used as a Thread anywhere anyway. I'm thinking of separating that change into a different Jira. Uploading the latest patch.

          markrmiller@gmail.com Mark Miller added a comment -

          I'm taking a look through this FYI. I'll post some notes soon.

          markrmiller@gmail.com Mark Miller added a comment -

          This is great work Tomás Fernández Löbbe. It's a great extension of SOLR-9835 and both issues will overlap quite a bit with the Unified Replication for HDFS stuff I worked on in the past. I've spent some time going through it, but first some high level comments:

          • We should put this in a branch as you suggested. Tough patch for collaboration.
          • We should also look at using reviewboard or github for review, I think; there's so much surface area that it's hard not being able to comment in the source.
          • I mostly like the naming, except I'm also a little wary of realtime because we have always been careful to say nrt or nearRealtime (ok, ok, realtime get, but I suppose that is a little trickier). It would be nice to try and keep that designation somehow.
          • It still might be nice, for those not using these other types, to still be able to just use replicationFactor beyond back compat reasons. If you are not jumping into this world, specifying the number of realtimeReplicas is a bit confusing in comparison.
          • We almost certainly want to integrate this well with the chaos monkey tests. But we don't want to muddle up what we have at all; it's hard enough to track these tests and their stability. We should make a new one that extends the existing ones and also uses these new types.
          • I think replicas that cannot handle realtime get should def forward to replicas that can.

          There are other code comments I have here and there, but given there is some work still being done, it probably makes sense to wait and get this in a branch and post those to some kind of source code commenting system.

          Anyway, I wanted to start with some basic feedback. More to come.

          markrmiller@gmail.com Mark Miller added a comment -

          I still would like the names to be a bit more descriptive. But honestly I can't think of anything much better yet. My best try is like nrtReplicas, tlogReplicas, and indexReplicas or something, and it's just not satisfying either.

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Thanks for the feedback Mark. I'll update the branch today and find the best way to have review.

          still might be nice, for those not using these other types, to still be able to just use replicationFactor beyond back compat reasons

          I don't love the idea of having multiple ways of indicating the same thing, but I guess if we don't find a good/descriptive way to name realtime/index replicas you are right. "replicationFactor" is pretty standard.

          We almost certainly want to integrate this well with the chaos monkey tests

          Yes, I want to have a new chaos monkey test for different replica types. I want to work on that next. I also want to have tests specific to connection error handling using SocketProxy, those may be easier to debug than the chaos monkey tests, although not as much coverage probably.

          think replicas that cannot handle realtime get should def forward to replicas that can.

          Yes. One of the reasons I want no tlog on the passive replicas is to fail fast if any feature tries to use it.

          githubbot ASF GitHub Bot added a comment -

          GitHub user tflobbe opened a pull request:

          https://github.com/apache/lucene-solr/pull/196

          SOLR-10233: Add support for different replica types

          Code is not done yet (although getting close). Opening the PR to make it easier to review.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/apache/lucene-solr jira/solr-10233

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/lucene-solr/pull/196.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #196


          commit 3d149074e143ec685a3d079e9acf33bd9e0e6b40
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-25T23:37:08Z

          Moved patch to (new) branch. Updated to master

          commit a217dfaaf43950fb229b088745d6207ce5106b6e
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-26T00:10:41Z

          Added error handling tests for Passive Replicas

          commit 0330b4abe5785e509b29d3bc7f461c4e57d153f7
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-26T23:21:40Z

          Sometimes use legacyCloud in tests

          commit 304add6f631494d28d952431055e89b8357c6a5a
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-26T23:28:25Z

          Added ChaosMonkey tests with safe leader for passive replicas

          commit 2c133d4cfb533900dcb72784c12b3829e8277c65
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-27T23:27:46Z

          Added ChaosMonkey test without safe leader for passive replicas

          commit a342edd9eee95c30eabd00824a7c69f1d36ba33a
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-27T23:38:24Z

          Fix ChaosMonkey expire connection and connection loss properties

          commit e7d54fa0b1e31b01be05c479975da36c53259a96
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-28T00:51:52Z

          Added logging to ChaosMonkey

          commit 0f9baa4919840e406122bba4ef87897121be0649
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-28T21:26:26Z

          Minor improvements to ChaosMonkey tests

          commit e7c8cec61c5b27bd9ce40eaa29a2f621a0bf2640
          Author: Tomas Fernandez Lobbe <tflobbe@apache.org>
          Date: 2017-04-28T22:53:12Z

          Some code cleanup


          tomasflobbe Tomás Fernández Löbbe added a comment -

          I didn't know if the Github mirror was going to allow me to create the PR from the official branch, but apparently it does. Feel free to review and comment there.
          Code is not ready, but it's getting close:

          • Added new ChaosMonkey tests ("SafeLeader" and "NothingIsSafe"). I've been running them and the "SafeLeader" is doing great. "NothingIsSafe" found some shard inconsistencies, I'm looking at those now.
          • Cleaned some TODOs/nocommits, there are a couple missing.
          • Added a test to validate some connection error handling for passive replicas
          • Made replica types work with "legacyCloud"

          TODOs:

          • Passive replicas to forward RTG requests to leader
          • More tests (in particular, if anyone has an idea of a way to track replica state changes reliably in order to validate them: I tried it here, but the collection state watcher can see repeated states or miss states)
          • un-deprecate replication factor, at least for the user APIs
          • Define better names, at least for "REALTIME"
          Show
          tomasflobbe Tomás Fernández Löbbe added a comment - I didn't know if Github mirror was going to allow me to create the PR from the official branch, but apparently it does Feel free to review and comment there. Code is not ready, but it's getting close: Added new ChaosMonkey tests ("SafeLeader" and "NothingIsSafe"). I've been running them and the "SafeLeader" is doing great. "NothingIsSafe" found some shard inconsistencies, I'm looking at those now. Cleaned some TODOs/nocommits, there are a couple missing. Added a test to validate some connection error handling for passive replicas Made replica types work with "legacyCloud" TODOs: Passive replicas to forward RTG requests to leader More tests (in particular, If anyone has an idea of a way to track replica state changes reliable to validate them, I tried it here , but the collection state watcher can see repeated states or miss states) un-deprecate replication factor, at least for the user APIs Define better names, at least for "REALTIME"
          tomasflobbe Tomás Fernández Löbbe added a comment -

          I’ve been doing some work on improving the tests in the branch. Also, now RTG requests with distrib=true will avoid PASSIVE and non-leader APPEND replicas. For distrib=false cases, PASSIVE replicas will error, but APPEND will proceed the same as today, since this is needed for leader sync.

          Regarding the naming, here are the options proposed

          • REALTIME/PUSH/NRT/WRITER/INDEXER
          • APPEND/SYNC/TLOG/HOTCOPY/LAZY
          • PASSIVE/PULL/INDEX/COPY/SLAVE/READ-ONLY

          Based on the suggestions and comments, I’m going to rename:

          • REALTIME -> NRT: It makes it obvious that if you want NRT results you need these kinds of replicas. Plus we’ve been saying that Solr provides search in NRT and this is the mode currently supported
          • APPEND -> I’m between LAZY and TLOG. The first one is more high level; the name itself doesn’t say much about how the internals work. It’s less specific than TLOG, but it does give you the idea that it will fall behind. Also, I think it kind of makes sense because the replica will “apply the updates to the index only if required (it becomes leader)”. LAZY may be easily confused with PASSIVE. TLOG is good because it’s clearer than LAZY about what the replica does. The main issue I see is that the REALTIME (or NRT) replica also has a transaction log, so it may be confusing. Since Mark proposed TLOG, I’m guessing he is +1 on that one, so I’ll just go with it, since I’m on the fence.
          • PASSIVE -> PULL, It is more obvious than “PASSIVE” on how the replica will work, and it doesn’t get confused with a replica state.

          Feel free to comment if you have some good reason against those names.

          I’ll be doing some more work on the tests this week. My plan is to commit this to master soon to have it for 7.0. Placement strategy and client changes are going to be done as followup tasks, I don’t think those should block this.

          tomasflobbe Tomás Fernández Löbbe added a comment - edited

          Last changes

          • Added support for CreateShard and Backup/Restore with replica types.
          • Renamed types as explained in my previous comment
          • Merged to master
          • Other minor changes to tests.

          TODO before commit:

          • There are still some TODOs and nocommits in the code to address
          • Clean up some duplication in the new test code.

          A couple of things Anshum suggested offline:

          • Test DELETENODE and DELETEREPLICA
          • Replication from leader should fail if the leader changes

          My plan is to defer to new Jiras:

          • Add support for replica types in replica placement strategies
          • There should be a way to tell CloudSolrClient to query only passive replicas
          • autoAddReplicas doesn’t work for different replica types
          • When querying a replica of type PULL, it would be nice to get back the time since the last successful replication
          • Add ability for PULL replicas to go into recovery mode after X number of failed replications
          tomasflobbe Tomás Fernández Löbbe added a comment -

          Just pushed some more changes to the branch. Addressed most of the TODOs, except for "Replication from leader should fail if the leader changes", which I think is fine to defer to another Jira. I plan to commit this soon if I see no problems with the tests. Mark Miller, you had some more comments and were waiting on the PR, do you still plan to review? If so, let me know and I can wait.

          anshumg Anshum Gupta added a comment -

          LGTM, and +1 to get this into master. We should give it as much time as we can on master before the release.

          Here are a few points I think we need to document or address:
          1. How would this work with Replica placement strategy. Considering the auto-scaling stuff would be released with 7.0, we should not be too bothered about supporting this with replica placement strategy.
          2. We should make sure that the auto-scaling effort that’s running parallel to this is sync’ed up.
          3. Integrate this with Version 2 APIs.
          4. Integration with SolrJ for reads, writes, and collection API calls.

          tomasflobbe Tomás Fernández Löbbe added a comment -

          We should give it as much time as we can on master before the release.

          +1

          How would this work with Replica placement strategy. Considering the auto-scaling stuff would be released with 7.0, we should not be too bothered about supporting this with replica placement strategy.

          Makes sense.

          We should make sure that the auto-scaling effort that’s running parallel to this is sync’ed up.

          Makes sense too. New DSL was suggested for autoscaling/replica placement, we should make sure to include options to work with different replica types. I’ll start the conversation in SOLR-9735 and subtasks.

          Integrate this with Version 2 APIs

          Good point. I don’t want to block this commit since it’s already too difficult to review. I’ll make this part of another Jira, but we should make sure to have it before 7.0 is released.

          Integration with SolrJ for reads, writes, and collection API calls.

          Good point…

          • Client side handling of search requests can be improved, but I think it would be fine to leave to another Jira. Ideally, one could tell the client to only query PULL replicas, if you want for example, something like Master/Slave, where masters are never queried. Also RTG requests could go to NRT replicas only (or leader in case of TLOG replicas). People also suggested being able to provide more complex rules of how to handle queries with different replica types.
          • Writes should be handled automatically if you use CloudSolrClient, since requests go to the leader. If you use HttpSolrClient, there is nothing to do: the request will get to the specified node and be internally forwarded to the leader, the same as today without types.
          • So far I added SolrJ integration for create collection and add shard. I don’t think other SolrJ actions will need modification. MODIFYCOLLECTION currently can’t change “numTYPEReplicas”, but I think that’s fine, since there is no real action happening after a change like that, may be needed for the autoscale work though.
          jira-bot ASF subversion and git services added a comment -

          Commit 2fc41d565a4a0408a09856a37d3be7d87414ba3f in lucene-solr's branch refs/heads/master from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2fc41d5 ]

          SOLR-10233: Add support for replica types

          noble.paul Noble Paul added a comment -

          The current replica placement strategy will be deprecated and the new policy framework will be used everywhere. It makes sense to start discussing the syntax in a separate ticket after SOLR-10278 is committed.

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Saw a couple of ChaosMonkeySafeLeaderWithPullReplicasTest failures in Jenkins. I'll look into those.

          steve_rowe Steve Rowe added a comment -

          My Jenkins had a non-reproducing failure for TestPullReplica - I'm attaching the log.

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Thanks Steve, looking into fixing that.

          jira-bot ASF subversion and git services added a comment -

          Commit 1e4d2052e6ce10b4012eda8802a8d32ccadeeba3 in lucene-solr's branch refs/heads/master from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1e4d205 ]

          SOLR-10233: ChaosMonkeySafeLeaderWithPullReplicasTest - Catch SolrException while waiting for the cluster to be ready

          githubbot ASF GitHub Bot added a comment -

          Github user tflobbe closed the pull request at:

          https://github.com/apache/lucene-solr/pull/196

          tomasflobbe Tomás Fernández Löbbe added a comment -

          This Jenkins failure[1] may be related to this change. I'll take a look.

          [1] https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19713/

          jira-bot ASF subversion and git services added a comment -

          Commit 8f92fb4722709bec34b4c0330afb7cabba86e350 in lucene-solr's branch refs/heads/master from Tomas Fernandez Lobbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8f92fb4 ]

          SOLR-10233: Correctly set maxShardsPerNode in BackupRestoreTestCase when using createNodeSet and replica types

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Looking at ChaosMonkey failure: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/19780/
          jira-bot ASF subversion and git services added a comment -

          Commit 46a5ae23a76fcf0cbb98ac3874ae69cdb90173a4 in lucene-solr's branch refs/heads/master from Tomás Fernández Löbbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=46a5ae2 ]

          SOLR-10233: Some more logging to chaos monkey with replica types tests

          jira-bot ASF subversion and git services added a comment -

          Commit 97655b880c0230c0d42baba314c28831ee729323 in lucene-solr's branch refs/heads/master from Tomás Fernández Löbbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=97655b8 ]

          SOLR-10233: Cleanup warnings from ReplicateFromLeader

          jira-bot ASF subversion and git services added a comment -

          Commit a03c3369e28a1c350842649726801e79285625e7 in lucene-solr's branch refs/heads/master from Tomás Fernández Löbbe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a03c336 ]

          SOLR-10233: Stop warning users about misconfigured ReplicationHandler when using replica types

          tomasflobbe Tomás Fernández Löbbe added a comment -

          Marking this as resolved. Any extra related work should have its own Jira.


            People

            • Assignee:
              tomasflobbe Tomás Fernández Löbbe
              Reporter:
              tomasflobbe Tomás Fernández Löbbe
            • Votes:
              0 Vote for this issue
              Watchers:
              14 Start watching this issue

