The skipping-bad-records feature needs a way to get a callback for the number of processed records from the streaming process. To support this, counters were chosen because they are supported by both pipes and streaming -> https://issues.apache.org/jira/browse/HADOOP-153?focusedCommentId=12610897#action_12610897 (last point)
In particular, if the user updates a counter with the wrong name, bad things will presumably happen...
I see this happening only if the user defines their own counter with the same name. Or is there some other problem that can occur? Would it be OK for now to document the reserved framework counter names, and perhaps log in the above loop that a framework counter is being updated?
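A minimal sketch of the "reserve and log" idea, written in Python for brevity rather than as framework code. The group and counter names, `RESERVED_COUNTERS`, and `update_counter` are all hypothetical; the real reserved names would be whatever the framework ends up documenting.

```python
import logging

log = logging.getLogger("counters")

# Hypothetical reserved (group, name) pairs, for illustration only.
RESERVED_COUNTERS = {("FrameworkCounters", "PROCESSED_RECORDS")}


def is_reserved(group, name):
    """True if (group, name) is a counter the framework reserves for itself."""
    return (group, name) in RESERVED_COUNTERS


def update_counter(counters, group, name, amount):
    """Increment a counter, logging when the counter is a reserved framework one."""
    if is_reserved(group, name):
        log.info("framework counter %s:%s updated by %d", group, name, amount)
    key = (group, name)
    counters[key] = counters.get(key, 0) + amount
    return counters[key]
```

This only makes a name clash visible in the logs; it does not stop a user process from updating a reserved counter.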
The other alternative, if we don't want to use counters for this at all, would be to add a mechanism to the streaming and pipes protocols. Streaming could write something like processedRecords to stderr, which the framework would then parse. Something similar would need to be added to the Pipes protocol as well.
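A rough sketch of the stderr alternative, again in Python for brevity. The `processedRecords:<n>` line format and both function names are hypothetical, not an existing streaming protocol, and each marker is assumed to carry the number of records processed since the previous marker rather than a running total. The framework side would pick out marker lines and pass everything else through as ordinary stderr output:

```python
import re

# Hypothetical marker format: "processedRecords:<non-negative integer>".
_PROCESSED_RE = re.compile(r"^processedRecords:(\d+)$")


def parse_stderr_line(line):
    """Return the record count if the line is a marker, otherwise None."""
    match = _PROCESSED_RE.match(line.strip())
    return int(match.group(1)) if match else None


def consume_stderr(lines):
    """Split stderr lines into a processed-record total and ordinary log lines."""
    total = 0
    passthrough = []
    for line in lines:
        count = parse_stderr_line(line)
        if count is None:
            passthrough.append(line)  # not a marker: keep it as normal stderr
        else:
            total += count  # assumed to be a delta since the last marker
    return total, passthrough
```

On the user side, a streaming script would then only need to write a line such as `processedRecords:1` to stderr after handling each record.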