Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-5550

nodetool repair hanging on multi sites config

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Duplicate
    • None
    • Tool/nodetool
    • None
    • Normal

    Description

      We recently added a new datacenter to our cluster, and we want some keyspaces to be replicated to it, but not all.
      So some of our Keyspaces have a replication factor of

      { dcMain: 3, dcNew: 1 }

      and some have

      { dcMain: 3 }

      We then ran a nodetool repair on one of the new servers so the data would be replicated correctly, but it just hung.
      After some investigation, we saw that if a Keyspace didn't have replication info for dcNew, or a replication factor of 0, a nodetool repair from the new servers would hang (nothing concerning repair in the logs, not even in DEBUG level).
      On the other hand, it works properly if the Keyspace has a replica in dcNew or if the repair is called from dcMain.
      We might be doing something wrong, but I don't think nodetool should hang silently even if something is going wrong.

      Nodetool repair -pr also works in all situations, but it seems that it is not copying the data to the new datacenter at all, it runs and logs that the keyspace is up to date after receiving the merkle trees, even though nothing is actually copied on the disk, even after flushing (maybe an unrelated issue, should I open another ticket?).

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              alprema Kévin LOVATO
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: