Description
Consider base tables T1 and T2, each of which has a global mutable index, call them IDX1 and IDX2.
RegionServer A hosts a T1 region and an IDX2 region.
RegionServer B hosts a T2 region and an IDX1 region.
Because of prior problems, both IDX1 and IDX2 have lots of unverified rows. Both are under heavy read-load from clients.
IDX1 coprocs try to scan the T1 region on RS A, but can't, because the standard RPC queue on RS A is full of IDX2 client requests waiting on cross-server read repairs.
IDX2 coprocs try to scan the T2 region on RS B, but can't, because the standard RPC queue on RS B is full of IDX1 client requests waiting on cross-server read repairs.
It's not a permanent deadlock (eventually we'll start throwing RegionTooBusyExceptions), but it would be unpleasant for clients.
If read-repair used the index RPC pool instead of the standard one, this scenario couldn't happen (unless there were also too many mutable index writes, in which case we're just saturated and back to lots of exceptions.)
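To make the scenario concrete, here is a minimal toy model of it in Python. It is only a sketch: it is not HBase or Phoenix code, and the `RegionServer` class, the `simulate` function, the pool names, and the handler counts are all made up for illustration. Each server gets a fixed number of handlers per pool, client index reads occupy standard handlers on the server hosting the index region, and each read then needs one handler on the other server to run its repair scan.

```python
from dataclasses import dataclass


@dataclass
class RegionServer:
    """Toy region server: a name plus free handler counts per RPC pool (hypothetical)."""
    name: str
    free: dict

    def try_acquire(self, pool: str) -> bool:
        # Take a handler from the pool if one is free; never blocks.
        if self.free[pool] > 0:
            self.free[pool] -= 1
            return True
        return False

    def release(self, pool: str) -> None:
        self.free[pool] += 1


def simulate(repair_pool: str, reads_per_server: int = 2, handlers_per_pool: int = 2):
    """Return (repaired, blocked) counts for client reads on unverified index rows.

    repair_pool names the pool the cross-server repair scan runs in:
    "standard" models the current behavior, "index" models the proposal.
    """
    rs_a = RegionServer("A", {"standard": handlers_per_pool, "index": handlers_per_pool})
    rs_b = RegionServer("B", {"standard": handlers_per_pool, "index": handlers_per_pool})

    # Phase 1: client reads arrive at the index regions (IDX2 on A, IDX1 on B)
    # and each one holds a standard handler while it waits for its repair.
    in_flight = []
    for local, remote in ((rs_a, rs_b), (rs_b, rs_a)):
        for _ in range(reads_per_server):
            if local.try_acquire("standard"):
                in_flight.append((local, remote))

    # Phase 2: each read needs a handler on the *other* server to scan the base table.
    repaired = blocked = 0
    for local, remote in in_flight:
        if remote.try_acquire(repair_pool):
            remote.release(repair_pool)  # repair scan finished
            local.release("standard")    # client read can now return
            repaired += 1
        else:
            blocked += 1                 # in a real cluster this ends in a timeout or exception
    return repaired, blocked


print("repairs in standard pool:", simulate("standard"))
print("repairs in index pool:   ", simulate("index"))
```

With both standard pools saturated, every repair in the model is blocked when it competes for standard handlers, and every repair completes when it runs in the separate pool. The model ignores queueing, timeouts, and index write traffic, so it only illustrates the mutual starvation, not the real scheduler behavior.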