Robert Kanter, thank you for the review!
If the idea here is to prevent attacking HDFS with everyone rolling at the same time, I think the default value should not be 0. That basically negates what we're trying to do here.
In most clusters, this is not needed. It's only the large (1000-ish node) clusters that will need to worry about staggering the rolls. And then how much staggering is required depends heavily on the cluster. I think 0 is a reasonable default.
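To make the staggering idea concrete, here is a rough sketch (names are my own illustration, not the patch's actual code) of how a per-daemon roll offset could be drawn from the configured interval, with 0 disabling staggering as the proposed default:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RollStagger {
    /**
     * Pick a per-daemon stagger delay so that, on a large cluster, log rolls
     * don't all hit HDFS at the same instant. A maxOffsetMillis of 0 disables
     * staggering entirely, which is fine for most (small) clusters.
     */
    static long staggerDelayMillis(long maxOffsetMillis) {
        if (maxOffsetMillis <= 0) {
            return 0L; // default: no staggering
        }
        // One random offset in [0, maxOffsetMillis), chosen per daemon.
        return ThreadLocalRandom.current().nextLong(maxOffsetMillis);
    }

    public static void main(String[] args) {
        System.out.println("stagger delay = " + staggerDelayMillis(60_000L) + " ms");
    }
}
```

With this shape, a 1000-node cluster could set the max offset to, say, a minute and spread the load, while everyone else leaves it at 0 and sees no behavior change.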
I'm not sure we should try to conform to HDFS-9821 here at this point.
Perhaps I overstated things a little. I was already allowing for user-specified units when HDFS-9821 was created. I liked the way they proposed to do it better, so I changed my code to work that way instead. I agree that at some point there may be some shared utils to parse the time, but I need to do it now regardless.
And I'm not worried about roll-offset-interval-millis. I think that one should actually stay only in millis.
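For reference, the style of user-specified units being discussed looks roughly like this sketch (method name and exact suffix set are my own illustration of the HDFS-9821 approach, not its actual API): a numeric value with an optional suffix, where a bare number falls back to a caller-supplied default unit.

```java
import java.util.concurrent.TimeUnit;

public class DurationParser {
    /**
     * Parse a duration string such as "30s", "5m", "1h", or "500ms" into
     * milliseconds. A bare number ("500") is interpreted in defaultUnit.
     */
    static long parseToMillis(String value, TimeUnit defaultUnit) {
        String v = value.trim().toLowerCase();
        if (v.endsWith("ms")) { // check "ms" before the single-char suffixes
            return Long.parseLong(v.substring(0, v.length() - 2));
        }
        TimeUnit unit;
        switch (v.charAt(v.length() - 1)) {
            case 's': unit = TimeUnit.SECONDS; break;
            case 'm': unit = TimeUnit.MINUTES; break;
            case 'h': unit = TimeUnit.HOURS;   break;
            case 'd': unit = TimeUnit.DAYS;    break;
            default:  // no suffix: use the default unit
                return defaultUnit.toMillis(Long.parseLong(v));
        }
        return unit.toMillis(Long.parseLong(v.substring(0, v.length() - 1)));
    }

    public static void main(String[] args) {
        System.out.println(parseToMillis("30s", TimeUnit.MILLISECONDS)); // 30000
    }
}
```

A shared util like this is the kind of thing that could eventually be pulled out, but as noted above, this patch needs its own parsing now regardless.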
On a slow system or with some other delay, this could easily cause the test to be flaky...
I see your point, but it would have to be an enormously overloaded system. The thread will run at the top of the second, so it's scheduled to run in less than 1000ms. If it takes more than 500ms to do the flush, things are seriously FUBAR. All it's doing is closing a file. I kinda think it should fail at that point.
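The timing argument above rests on the thread being scheduled at the top of the second, i.e. always less than 1000 ms out. A minimal sketch of that calculation (my own illustration, not the patch's code):

```java
public class TopOfSecond {
    /**
     * Delay from now until the top of the next second. Because this is always
     * in (0, 1000] ms, a thread scheduled this way fires within one second of
     * being queued; only a flush taking 500+ ms would push it past the budget.
     */
    static long delayToNextSecondMillis(long nowMillis) {
        return 1000L - (nowMillis % 1000L);
    }

    public static void main(String[] args) {
        System.out.println(delayToNextSecondMillis(System.currentTimeMillis()));
    }
}
```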