Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
VerifyReplication includes an option, "sleepMsBeforeReCompare", which helps work around replication lag. However, sleeping inside a Hadoop job can drastically slow that job down when there are more than a few invalid results, because each sleep blocks the mapper.
We can mitigate this by doing the recompare in a separate thread. We can limit the size of the thread pool and fall back to doing the recompare in the main thread if the pool is full. This way we offload some of the slowness but still retain the same validation guarantees. A configuration option can be added to control how many recompare threads each mapper uses.
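
The following is a minimal sketch of the bounded-pool-with-fallback idea, not the actual VerifyReplication change. The class name RecompareExecutor, its methods, and the "threads" parameter are hypothetical and chosen only for illustration. The Runnable passed in is assumed to perform the sleep and the recompare. It uses a standard java.util.concurrent.ThreadPoolExecutor with a SynchronousQueue, so a task is rejected as soon as every worker is busy, and the rejected task is then run on the calling (mapper) thread.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper for illustration; not part of HBase. */
final class RecompareExecutor {
  private final ThreadPoolExecutor pool; // null when offloading is disabled
  private final AtomicInteger offloaded = new AtomicInteger();
  private final AtomicInteger inline = new AtomicInteger();

  /** threads: assumed per-mapper setting; 0 means always recompare inline. */
  RecompareExecutor(int threads) {
    if (threads > 0) {
      // SynchronousQueue holds no tasks, so once all workers are busy the
      // default AbortPolicy throws RejectedExecutionException.
      pool = new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
          new SynchronousQueue<>());
    } else {
      pool = null;
    }
  }

  /** Runs the recompare on a pool thread if one is free, else on the caller. */
  void submit(Runnable recompare) {
    if (pool != null) {
      try {
        pool.execute(recompare);
        offloaded.incrementAndGet();
        return;
      } catch (RejectedExecutionException e) {
        // Pool is saturated: fall through and do the work on this thread.
      }
    }
    recompare.run();
    inline.incrementAndGet();
  }

  /** Waits for outstanding recompares; returns false on timeout or interrupt. */
  boolean shutdownAndWait(long timeoutMs) {
    if (pool == null) {
      return true;
    }
    pool.shutdown();
    try {
      return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  int offloadedCount() { return offloaded.get(); }
  int inlineCount() { return inline.get(); }
}
```

Because every recompare still runs exactly once, either on a pool thread or on the caller, the validation result should be the same as with the single-threaded approach. The mapper would need to call something like shutdownAndWait before finishing so that no offloaded recompare is lost.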