Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-16245

Implement repair quality test scenarios

Agile BoardAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsAdd voteVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete CommentsNeeds CommitterStart Review
    XMLWordPrintableJSON

Details

    • Task
    • Status: Patch Available
    • Normal
    • Resolution: Unresolved
    • 4.0.x
    • Test/dtest/java
    • None
    • Quality Assurance
    • Challenging
    • All
    • None
    • Hide

      Perform repairs for a 3 nodes cluster using m5ad.xlarge instances.
      Repaired keyspaces will use RF=3 or RF=2 (the latter is for subranges with different sets of replicas).

      Mode Version Settings Checks
      Full repair trunk Sequential + All token ranges "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Full repair trunk Parallel + Primary range "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Full repair trunk Force terminate repair shortly after it was triggered Repair threads must be cleaned up
      Subrange repair trunk Sequential + single token range "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Subrange repair trunk Parallel + 10 token ranges which have the same replicas "No anticompaction (repairedAt == 0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range
      A single repair session will handle all subranges at once"
      Subrange repair trunk Parallel + 10 token ranges which have different replicas "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range
      More than one repair session is triggered to process all subranges"
      Incremental repair trunk "Parallel (mandatory)
      No compaction during repair"
      "Anticompaction status (repairedAt != 0) on all SSTables
      No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously)
      Out of sync ranges > 0 + Subsequent run must show no out of sync range"
      Incremental repair trunk "Parallel (mandatory)
      Major compaction triggered during repair"
      "Anticompaction status (repairedAt != 0) on all SSTables
      No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously)
      Out of sync ranges > 0 + Subsequent run must show no out of sync range"
      Incremental repair trunk Force terminate repair shortly after it was triggered. Repair threads must be cleaned up
      Show
      Perform repairs for a 3 nodes cluster using m5ad.xlarge instances. Repaired keyspaces will use RF=3 or RF=2 (the latter is for subranges with different sets of replicas). Mode Version Settings Checks Full repair trunk Sequential + All token ranges "No anticompaction (repairedAt==0) Out of sync ranges > 0 Subsequent run must show no out of sync range" Full repair trunk Parallel + Primary range "No anticompaction (repairedAt==0) Out of sync ranges > 0 Subsequent run must show no out of sync range" Full repair trunk Force terminate repair shortly after it was triggered Repair threads must be cleaned up Subrange repair trunk Sequential + single token range "No anticompaction (repairedAt==0) Out of sync ranges > 0 Subsequent run must show no out of sync range" Subrange repair trunk Parallel + 10 token ranges which have the same replicas "No anticompaction (repairedAt == 0) Out of sync ranges > 0 Subsequent run must show no out of sync range A single repair session will handle all subranges at once" Subrange repair trunk Parallel + 10 token ranges which have different replicas "No anticompaction (repairedAt==0) Out of sync ranges > 0 Subsequent run must show no out of sync range More than one repair session is triggered to process all subranges" Incremental repair trunk "Parallel (mandatory) No compaction during repair" "Anticompaction status (repairedAt != 0) on all SSTables No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously) Out of sync ranges > 0 + Subsequent run must show no out of sync range" Incremental repair trunk "Parallel (mandatory) Major compaction triggered during repair" "Anticompaction status (repairedAt != 0) on all SSTables No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously) Out of sync ranges > 0 + Subsequent run must show no out of sync range" Incremental repair trunk Force terminate repair shortly after it was triggered. Repair threads must be cleaned up

    Description

      Implement the following test scenarios in a new test suite for repair integration testing with significant load:

      Generate/restore a workload of ~100GB per node. Medusa should be considered to create the initial backup which could then be restored from an S3 bucket to speed up node population.
      Data should on purpose require repair and be generated accordingly.

      Perform repairs for a 3 nodes cluster with 4 cores each and 16GB-32GB RAM (m5d.xlarge instances would be the most cost efficient type).
      Repaired keyspaces will use RF=3 or RF=2 in some cases (the latter is for subranges with different sets of replicas).

      Mode Version Settings Checks
      Full repair trunk Sequential + All token ranges "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Full repair trunk Parallel + Primary range "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Full repair trunk Force terminate repair shortly after it was triggered Repair threads must be cleaned up
      Subrange repair trunk Sequential + single token range "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range"
      Subrange repair trunk Parallel + 10 token ranges which have the same replicas "No anticompaction (repairedAt == 0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range
      A single repair session will handle all subranges at once"
      Subrange repair trunk Parallel + 10 token ranges which have different replicas "No anticompaction (repairedAt==0)
      Out of sync ranges > 0
      Subsequent run must show no out of sync range
      More than one repair session is triggered to process all subranges"
      Incremental repair trunk "Parallel (mandatory)
      No compaction during repair"
      "Anticompaction status (repairedAt != 0) on all SSTables
      No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously)
      Out of sync ranges > 0 + Subsequent run must show no out of sync range"
      Incremental repair trunk "Parallel (mandatory)
      Major compaction triggered during repair"
      "Anticompaction status (repairedAt != 0) on all SSTables
      No pending repair on SSTables after completion (could require to wait a bit as this will happen asynchronously)
      Out of sync ranges > 0 + Subsequent run must show no out of sync range"
      Incremental repair trunk Force terminate repair shortly after it was triggered. Repair threads must be cleaned up

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            zvo Radovan Zvoncek Assign to me
            adejanovski Alexander Dejanovski
            Alexander Dejanovski, Radovan Zvoncek

            Dates

              Created:
              Updated:

              Slack

                Issue deployment