The crux of why the errors are silently ignored seems to be that DUP.doFinish() only logs a WARN for (but does not propagate) any error returned by cmdDistrib.getErrors(), unless the type of Node involved in the request was a RetryNode. The reason given for this being...
The problem apparently is that when a DBQ is forwarded to all leaders, StdNode is (currently) used, but no local operation was executed, only the forward to the leaders, so there is no local success/failure to report.
The attached patch changes the DBQ propagation logic to use RetryNode – i'm still running full tests, but at a minimum it makes the new TestCloudDeleteByQuery in the patch start passing.
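To make the behavior described above concrete, here is a minimal, self-contained sketch. It is a simplified model and not the actual Solr code: the StdNode / RetryNode names mirror the classes mentioned above, but their fields, the SendError type, and this doFinish signature are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DoFinishSketch {
    // Hypothetical stand-ins for the node types discussed above (not the real Solr classes).
    static class StdNode {
        final String url;
        StdNode(String url) { this.url = url; }
    }

    static class RetryNode extends StdNode {
        RetryNode(String url) { super(url); }
    }

    // Hypothetical pairing of a failed request's target node with its cause.
    static class SendError {
        final StdNode node;
        final Exception cause;
        SendError(StdNode node, Exception cause) { this.node = node; this.cause = cause; }
    }

    // Models the reported behavior: every error is logged as a WARN,
    // but only errors from a RetryNode are returned for propagation.
    static List<SendError> doFinish(List<SendError> errors) {
        List<SendError> propagated = new ArrayList<>();
        for (SendError err : errors) {
            System.err.println("WARN: error sending update to " + err.node.url
                    + ": " + err.cause.getMessage());
            if (err.node instanceof RetryNode) {
                propagated.add(err);
            }
        }
        return propagated;
    }

    public static void main(String[] args) {
        Exception cause = new RuntimeException("invalid delete query");

        // DBQ forwarded via StdNode: the failure is logged and then dropped.
        List<SendError> viaStd = doFinish(
                Collections.singletonList(new SendError(new StdNode("leader1"), cause)));
        System.out.println(viaStd.size());   // prints 0

        // Same failure via RetryNode: the error survives and can reach the client.
        List<SendError> viaRetry = doFinish(
                Collections.singletonList(new SendError(new RetryNode("leader1"), cause)));
        System.out.println(viaRetry.size()); // prints 1
    }
}
```

In this model the only difference between the two calls is the node type used for the forward, which is the change the patch makes for DBQ propagation.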
i don't fully understand the full ramifications of this change, particularly as it relates to the rest of the code in DUP.doFinish and things like forcing leader recovery, but based on the comments on the StdNode / RetryNode classes and the other uses of StdNode / RetryNode (notably: STD->replica vs RETRY->leader) this seems like the most correct fix in general.