Had some discussions with James R. Taylor, Samarth Jain, and Vincent Poon, during which I suggested that we can possibly eliminate the retry loops on the server. These loops cause handler threads to be stuck, potentially for quite a while (at least multiple seconds, in order to ride over common scenarios like splits).
Instead, we can do the retries at the Phoenix client:
- The index updates are not retried on the server. (retries = 0)
- A failed index update would set the failed index timestamp but leave the index enabled.
- The handler thread is now done and throws an appropriate exception back to the client.
- The Phoenix client can now retry. If those retries also fail, the client disables the index (if the policy dictates that) and throws the exception back to its caller.
So no more waiting is needed on the server; handler threads are freed immediately.
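
The flow above could look roughly like the following. This is only a sketch to illustrate the idea and not actual Phoenix code: the names `IndexState`, `IndexWriter`, `IndexWriteException`, `serverWriteOnce` and `clientWrite`, as well as the `maxRetries`, `pauseMs` and `disableOnFailure` parameters and the linear backoff, are all made up for this example.

```java
// Sketch only: every name here is hypothetical, none of this is Phoenix API.
class ClientSideIndexRetry {

    /** Stand-in for the exception the server throws back to the client. */
    public static class IndexWriteException extends RuntimeException {
        public IndexWriteException(String msg) { super(msg); }
    }

    /** Stand-in for the index metadata touched by a failure. */
    public static class IndexState {
        public long failedTimestamp = 0L; // 0 means no failure recorded
        public boolean enabled = true;
    }

    /** Stand-in for a single index update attempt. */
    public interface IndexWriter { void write(); }

    /** Server side: one attempt, no retry loop (retries = 0). */
    public static void serverWriteOnce(IndexState state, IndexWriter writer, long now) {
        try {
            writer.write();
        } catch (IndexWriteException e) {
            // Record the failure but leave the index enabled.
            state.failedTimestamp = now;
            // The handler thread is done here; the client decides what to do next.
            throw e;
        }
    }

    /** Client side: retry, and disable the index only once retries are exhausted. */
    public static int clientWrite(IndexState state, IndexWriter writer,
                                  int maxRetries, long pauseMs, boolean disableOnFailure) {
        IndexWriteException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                serverWriteOnce(state, writer, System.currentTimeMillis());
                return attempt + 1; // number of attempts that were needed
            } catch (IndexWriteException e) {
                last = e;
                if (attempt < maxRetries) {
                    try {
                        // The waiting now happens on the client (assumed linear backoff).
                        Thread.sleep(pauseMs * (attempt + 1));
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break; // stop retrying if the client thread is interrupted
                    }
                }
            }
        }
        if (disableOnFailure) {
            state.enabled = false; // only if the policy dictates that
        }
        throw last;
    }

    public static void main(String[] args) {
        IndexState state = new IndexState();
        int[] calls = {0};
        // Simulate a region split: the first two attempts fail, the third succeeds.
        IndexWriter flaky = () -> {
            if (calls[0]++ < 2) throw new IndexWriteException("region is splitting");
        };
        int attempts = clientWrite(state, flaky, 3, 10L, true);
        System.out.println("attempts=" + attempts + ", enabled=" + state.enabled);
    }
}
```

Note that in this sketch the sleep between attempts runs on the client thread, so the server-side method never blocks beyond the single write attempt.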