What to do here depends on the downstream consumer. With a running count, the downstream system can keep track of the last report it received and subtract it from the current report to get the difference. With only an incremental count, the downstream system can keep the running total itself.
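To make the equivalence concrete, here is a minimal sketch of both downstream strategies. The class and method names (RunningTotalConsumer, IncrementalConsumer, deltaSince, add) are hypothetical and are not part of Samza or of any particular metrics consumer.

```java
// Hypothetical downstream helper: the producer reports a running total,
// and the consumer derives the per-interval difference.
final class RunningTotalConsumer {
    private long lastReported = 0L;

    // Returns the increase since the previous report, then remembers this report.
    long deltaSince(long currentTotal) {
        long delta = currentTotal - lastReported;
        lastReported = currentTotal;
        return delta;
    }
}

// Hypothetical downstream helper: the producer reports increments only,
// and the consumer accumulates the running total itself.
final class IncrementalConsumer {
    private long total = 0L;

    // Adds one incremental report and returns the total so far.
    long add(long increment) {
        total += increment;
        return total;
    }
}
```

Either consumer ends up with both the per-interval change and the overall total, which is why the choice of what to send can be made on other grounds.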
Given that both implementations allow you to derive the same overall information, I agree that it makes more sense to just send incremental updates.
I see a problem with the patch you've provided, though. We fetch the value in one line, and then reset it back to 0 in the next. This leads to a race condition where we might fetch the value, then another thread updates the count (reporting happens on a separate thread from SamzaContainer's main thread), then we call clear. In this scenario, data is lost: we reset the counter after it has been updated again, but before that update has been reported.
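The interleaving can be replayed on a single thread to make it deterministic. This is only an illustration: LostUpdateDemo is a hypothetical class that uses a bare AtomicLong as a stand-in for the counter, and the comments mark which thread would perform each step in the real system.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo: replays the get-then-reset interleaving step by step.
final class LostUpdateDemo {
    // Returns the sum of everything the reporter would have reported.
    static long reportedTotal() {
        AtomicLong count = new AtomicLong(0);

        count.addAndGet(5);           // main thread: 5 events counted so far
        long reported = count.get();  // reporter: fetches 5
        count.incrementAndGet();      // main thread: a 6th event arrives
        count.set(0);                 // reporter: resets, discarding the 6th event

        reported += count.get();      // next report: the counter is 0 again
        return reported;              // 5, although 6 events occurred
    }
}
```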
To solve this problem, I think we need to use getAndSet in Counter.set, and return the old value. Then a call to clear() atomically returns the old value while resetting the counter to 0, so the reporter can report exactly what it cleared.
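Roughly what I have in mind, as a sketch rather than a patch. SketchCounter is a hypothetical stand-in, not the real Counter class; I'm assuming the count is held in an AtomicLong and that clear() can delegate to set(0).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for Counter, assuming an AtomicLong-backed count.
final class SketchCounter {
    private final AtomicLong count = new AtomicLong(0);

    // Called by the thread doing the work.
    long inc() {
        return count.incrementAndGet();
    }

    // Atomically installs the new value and returns the value it replaced.
    long set(long n) {
        return count.getAndSet(n);
    }

    // Atomically resets to 0 and returns the value to report.
    long clear() {
        return set(0);
    }

    long getCount() {
        return count.get();
    }
}
```

The reporter would then report the return value of clear() instead of doing a separate read followed by a reset. Any increment lands either before the getAndSet, in which case it is included in the returned value, or after it, in which case it is counted toward the next report.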