Aggregate warnings are not yet supported in Spark mode (hence the e2e Warning test case failures). I aim to enable them now.
In MR/Tez we use counters, and in Spark we rely on Accumulators (a means to support distributed counters).
Pig has some builtin warning enums in PigWarning, and also supports custom warnings for user defined functions.
The latter is problematic with Spark because you cannot register new accumulators on the backend and read their values later in the driver.
A workaround has been implemented in my patch PIG-5186.0.patch, whereby we define Map-typed Accumulators (besides the Long type we already use): one for the builtin warnings, one for the custom ones. These are passed from the driver to the backend, where the executors can create entries in the maps or increment preexisting values.
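To illustrate the idea, here is a minimal, Spark-independent sketch of the counter-merging logic such a Map-typed accumulator needs (class and method names are illustrative, not taken from the patch; in the real implementation this logic would live in a Spark `AccumulatorV2` subclass, with `add` called on executors and `merge` on the driver):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a map of warning name -> count, mergeable across tasks.
public class WarningMapAccumulator {
    private final Map<String, Long> counts = new HashMap<>();

    // Executor side: increment a warning counter, creating the entry if absent.
    public void add(String warning, long n) {
        counts.merge(warning, n, Long::sum);
    }

    // Driver side: combine another task's map into this one.
    public void merge(WarningMapAccumulator other) {
        other.counts.forEach((k, v) -> counts.merge(k, v, Long::sum));
    }

    public Map<String, Long> value() {
        return counts;
    }

    public static void main(String[] args) {
        // Two "executors" accumulate warnings independently.
        WarningMapAccumulator task1 = new WarningMapAccumulator();
        task1.add("UDF_WARNING_1", 2);
        WarningMapAccumulator task2 = new WarningMapAccumulator();
        task2.add("UDF_WARNING_1", 3);
        task2.add("DIVIDE_BY_ZERO", 1);

        // The "driver" merges the per-task maps into the final totals.
        WarningMapAccumulator driver = new WarningMapAccumulator();
        driver.merge(task1);
        driver.merge(task2);
        System.out.println(driver.value().get("UDF_WARNING_1"));  // 5
        System.out.println(driver.value().get("DIVIDE_BY_ZERO")); // 1
    }
}
```

Because the map itself is the accumulator value, custom (user-defined-function) warnings unknown at registration time simply become new keys, which sidesteps the need to register new accumulators on the backend.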
liyunzhang, Nandor Kollar, please take a look and let me know what you think.