  1. Cassandra
  2. CASSANDRA-16245

Implement repair quality test scenarios


    Details

    • Type: Task
    • Status: Patch Available
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: 4.0.x
    • Component/s: Test/dtest/java
    • Labels:
      None
    • Change Category:
      Quality Assurance
    • Complexity:
      Challenging
    • Platform:
      All
    • Impacts:
      None
    • Test and Documentation Plan:

      Perform repairs on a 3-node cluster using m5ad.xlarge instances.
      Repaired keyspaces will use RF=3 or RF=2 (the latter is for subranges with different sets of replicas).

      All scenarios run against trunk. Mode, settings, and checks:

      • Full repair, Sequential + all token ranges:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Full repair, Parallel + primary range:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Full repair, force terminate shortly after the repair was triggered:
        Repair threads must be cleaned up.
      • Subrange repair, Sequential + a single token range:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Subrange repair, Parallel + 10 token ranges that have the same replicas:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges; a single repair session must handle all subranges at once.
      • Subrange repair, Parallel + 10 token ranges that have different replicas:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges; more than one repair session is triggered to process all subranges.
      • Incremental repair, Parallel (mandatory), no compaction during repair:
        Anticompaction status (repairedAt != 0) on all SSTables; no pending repair on SSTables after completion (may require waiting, as this happens asynchronously); out-of-sync ranges > 0, and a subsequent run must show no out-of-sync ranges.
      • Incremental repair, Parallel (mandatory), major compaction triggered during repair:
        Anticompaction status (repairedAt != 0) on all SSTables; no pending repair on SSTables after completion (may require waiting, as this happens asynchronously); out-of-sync ranges > 0, and a subsequent run must show no out-of-sync ranges.
      • Incremental repair, force terminate shortly after the repair was triggered:
        Repair threads must be cleaned up.
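      The scenarios above map roughly onto nodetool invocations. The following is a dry-run sketch, not part of the ticket: it only prints the candidate commands. The keyspace name and token values are placeholders, and the exact flags should be verified against trunk's nodetool before use.

```shell
#!/bin/sh
# Dry-run sketch of the nodetool commands each scenario would exercise.
# KS is a hypothetical keyspace name used for illustration only.
KS=repair_quality_ks

# Print instead of executing, so the scenario-to-command mapping can be reviewed.
run() { echo "would run: $*"; }

run nodetool repair -full -seq "$KS"     # full repair, sequential, all token ranges
run nodetool repair -full -pr "$KS"      # full repair, parallel, primary range only
run nodetool repair -full -st -9223372036854775808 -et 0 "$KS"  # subrange repair (placeholder tokens)
run nodetool repair "$KS"                # incremental repair (the default on trunk/4.0)
run nodetool repair_admin list           # inspect sessions; cancel one to test force-terminate
```

      For the force-terminate scenarios, the expectation would be to cancel a session shortly after it starts and then verify that no repair threads linger on any node.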

      Description

      Implement the following test scenarios in a new test suite for repair integration testing under significant load:

      Generate or restore a workload of ~100 GB per node. Medusa should be considered for creating the initial backup, which could then be restored from an S3 bucket to speed up node population.
      The data should be generated so that it deliberately requires repair.

      Perform repairs on a 3-node cluster with 4 cores and 16 GB-32 GB RAM each (m5d.xlarge instances would be the most cost-efficient type).
      Repaired keyspaces will use RF=3 or, in some cases, RF=2 (the latter is for subranges with different sets of replicas).
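      The "10 token ranges" subrange scenarios need a way to carve the token ring into contiguous pieces. A minimal sketch, assuming the Murmur3Partitioner token bounds (this helper is illustrative and not part of the ticket):

```python
# Split the Murmur3 token space into N contiguous subranges, as the
# "Parallel + 10 token ranges" subrange-repair scenarios require.
MIN_TOKEN = -(2**63)      # Murmur3Partitioner minimum token
MAX_TOKEN = 2**63 - 1     # Murmur3Partitioner maximum token

def split_token_range(start: int, end: int, parts: int):
    """Split (start, end] into `parts` contiguous subranges."""
    width = (end - start) // parts
    bounds = [start + i * width for i in range(parts)]
    bounds.append(end)  # absorb integer-division remainder into the last range
    return list(zip(bounds[:-1], bounds[1:]))

subranges = split_token_range(MIN_TOKEN, MAX_TOKEN, 10)
```

      Each resulting pair could then be fed to a repair call as a start/end token, keeping the ranges contiguous so the same-replicas and different-replicas variants can be selected deliberately.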

      All scenarios run against trunk. Mode, settings, and checks:

      • Full repair, Sequential + all token ranges:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Full repair, Parallel + primary range:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Full repair, force terminate shortly after the repair was triggered:
        Repair threads must be cleaned up.
      • Subrange repair, Sequential + a single token range:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges.
      • Subrange repair, Parallel + 10 token ranges that have the same replicas:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges; a single repair session must handle all subranges at once.
      • Subrange repair, Parallel + 10 token ranges that have different replicas:
        No anticompaction (repairedAt == 0); out-of-sync ranges > 0; a subsequent run must show no out-of-sync ranges; more than one repair session is triggered to process all subranges.
      • Incremental repair, Parallel (mandatory), no compaction during repair:
        Anticompaction status (repairedAt != 0) on all SSTables; no pending repair on SSTables after completion (may require waiting, as this happens asynchronously); out-of-sync ranges > 0, and a subsequent run must show no out-of-sync ranges.
      • Incremental repair, Parallel (mandatory), major compaction triggered during repair:
        Anticompaction status (repairedAt != 0) on all SSTables; no pending repair on SSTables after completion (may require waiting, as this happens asynchronously); out-of-sync ranges > 0, and a subsequent run must show no out-of-sync ranges.
      • Incremental repair, force terminate shortly after the repair was triggered:
        Repair threads must be cleaned up.
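      The repairedAt checks could be asserted by collecting sstablemetadata output from each node and parsing the "Repaired at" values. A sketch under that assumption (the sample strings in the test are illustrative, not real tool output):

```python
# Parse "Repaired at: <n>" values from sstablemetadata-style output and
# assert the repair-mode invariants the test plan calls for.
import re

def repaired_at_values(metadata_output: str):
    """Extract every 'Repaired at: <n>' value from the given text."""
    return [int(m) for m in re.findall(r"Repaired at:\s*(\d+)", metadata_output)]

def assert_no_anticompaction(output: str):
    # Full and subrange repair must not anticompact: every repairedAt stays 0.
    vals = repaired_at_values(output)
    assert vals and all(v == 0 for v in vals)

def assert_anticompacted(output: str):
    # Incremental repair must mark all SSTables repaired: repairedAt != 0.
    vals = repaired_at_values(output)
    assert vals and all(v != 0 for v in vals)
```

      Since incremental repair clears pending-repair markers asynchronously, any check on them would need to poll with a timeout rather than assert immediately after the repair returns.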


            People

            • Assignee:
              Radovan Zvoncek (zvo)
            • Reporter:
              Alexander Dejanovski (adejanovski)
            • Authors:
              Alexander Dejanovski, Radovan Zvoncek
