Currently, most repair tasks (taking snapshots, sending/receiving Merkle trees, computing Merkle tree differences, etc.) are done on the single-threaded AntiEntropyStage.
This causes problems like CASSANDRA-6415 and is likely to cause unnecessary waiting.
Also, repair is done one CF at a time. I think we can parallelize this for faster processing, with the concurrency configurable by the user based on the number of CFs and the load on the nodes.
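
To illustrate the idea, here is a minimal sketch of running per-CF repair jobs on a bounded thread pool. It is not Cassandra's actual code: the class `ParallelRepairSketch`, the methods `repairColumnFamily` and `repairAll`, and the `concurrency` parameter are all hypothetical names, and the per-CF work is a placeholder for the real snapshot / Merkle tree steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRepairSketch {

    // Hypothetical stand-in for the per-CF repair work (snapshot, Merkle tree
    // exchange, difference computation). Not Cassandra's actual API.
    static String repairColumnFamily(String cf) {
        return "repaired:" + cf;
    }

    // Submits one repair job per CF to a fixed-size pool, so at most
    // `concurrency` CFs are repaired at the same time (assumed user setting).
    static List<String> repairAll(List<String> columnFamilies, int concurrency) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String cf : columnFamilies) {
                Callable<String> job = () -> repairColumnFamily(cf);
                futures.add(pool.submit(job));
            }
            // Collect results in submission order; get() blocks until each job finishes.
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(repairAll(Arrays.asList("users", "events", "logs"), 2));
    }
}
```

With `concurrency` set to 1 this behaves like the current one-CF-at-a-time repair, so the setting could default to the existing behavior and be raised on nodes with spare capacity.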