Please open a JIRA to enhance Rumen to extract UGI mapping information from job traces.
There is no API documentation of the interface methods in UserResolver.
The newly added conf parameters are not reflected in the GridMix.printUsage() method.
There seems to be a typo in UserResolver javadoc "Maps users to a set of users on the test cluster." => "Maps users to a set of groups on the test cluster."
That wasn't a typo, but it's clearly not a helpful comment. Replaced with "Maps users in the trace to a set of valid target users on the test cluster." Also added interface and option documentation.
The interface of UserResolver.setTargetUsers() requires the userspec to be stored in a file. How about changing it to "setTargetUsers(List<UserGroupInformation>, Configuration conf)"? [snip] This would also require UserResolver.parseUserList to be moved to GridMix.run().
This was the first interface I tried, but a UserResolver is really just getting an (optional) URI argument and a conf. How/whether it builds a user list from that resource may be specific to the UserResolver implementation. Since -users is a command-line param, I gave it a Path type and assume that will be sufficient. I left the parsing of the default file format in the base class, so I agree that it's likely the only implementation, but I left it as a member function so it's easy to override.
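To make the shape described above concrete, here is a rough sketch rather than the patch itself. It uses plain JDK types in place of the Hadoop ones: a Map stands in for Configuration, a String for UserGroupInformation, and a java.net.URI for the -users resource. The names SketchUserResolver, RoundRobinSketchResolver and getTargetUser, the one-name-per-line file format, and the round-robin assignment are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the resolver interface under review; not the GridMix API.
interface SketchUserResolver {
  // userResource may be null when no -users argument was given.
  void setTargetUsers(URI userResource, Map<String, String> conf) throws IOException;

  // Maps a user seen in the trace to a user on the test cluster.
  String getTargetUser(String traceUser);
}

// The default file parsing lives in the class as an overridable member method.
class RoundRobinSketchResolver implements SketchUserResolver {
  private final List<String> targets = new ArrayList<>();
  private final Map<String, String> assigned = new HashMap<>();

  // Assumed format: one user name per line; blank lines and '#' comments are skipped.
  // A subclass can override this to read a different format or resource type.
  protected List<String> parseUserList(URI userResource) throws IOException {
    List<String> users = new ArrayList<>();
    for (String line : Files.readAllLines(Paths.get(userResource))) {
      String name = line.trim();
      if (!name.isEmpty() && !name.startsWith("#")) {
        users.add(name);
      }
    }
    return users;
  }

  @Override
  public void setTargetUsers(URI userResource, Map<String, String> conf) throws IOException {
    if (userResource == null) {
      throw new IOException("This resolver needs a user list");
    }
    List<String> parsed = parseUserList(userResource);
    if (parsed.isEmpty()) {
      throw new IOException("Empty user list: " + userResource);
    }
    targets.clear();
    assigned.clear();
    targets.addAll(parsed);
  }

  @Override
  public String getTargetUser(String traceUser) {
    if (targets.isEmpty()) {
      throw new IllegalStateException("setTargetUsers has not been called");
    }
    String target = assigned.get(traceUser);
    if (target == null) {
      // Hand out target users in order, wrapping around when the list is exhausted.
      target = targets.get(assigned.size() % targets.size());
      assigned.put(traceUser, target);
    }
    return target;
  }
}
```

With this layout the driver only passes the resource and the conf through; what the resource means is left to the implementation.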
Also, EchoUserResolver and SubmitterUserResolver currently ignore this info. Maybe the method could return a boolean to indicate whether the call had any effect?
The driver would probably ignore that feedback, wouldn't it?
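For reference, the boolean-return idea might look roughly like the sketch below. Everything in it is hypothetical: the BoolSketchResolver, EchoSketchResolver and ListSketchResolver types, the SketchDriver.checkUsersOption helper, and the warning text are invented for illustration, and the one-argument signature is simplified. The one thing a driver could plausibly do with the result is warn when -users was supplied but the chosen resolver ignores it.

```java
import java.net.URI;

// Hypothetical variant of the resolver interface with a boolean result.
interface BoolSketchResolver {
  // Returns true if the resolver will use the user list, false if it ignores it.
  boolean setTargetUsers(URI userResource);
}

// Stand-in for a resolver that maps every trace user to itself and needs no list.
class EchoSketchResolver implements BoolSketchResolver {
  @Override
  public boolean setTargetUsers(URI userResource) {
    return false;
  }
}

// Stand-in for a resolver that depends on the list.
class ListSketchResolver implements BoolSketchResolver {
  @Override
  public boolean setTargetUsers(URI userResource) {
    return userResource != null;
  }
}

class SketchDriver {
  // Returns a warning message when a user list was given but ignored, else null.
  static String checkUsersOption(BoolSketchResolver resolver, URI userResource) {
    boolean used = resolver.setTargetUsers(userResource);
    if (userResource != null && !used) {
      return "Ignoring user list " + userResource + " for "
          + resolver.getClass().getSimpleName();
    }
    return null;
  }
}
```

If the driver does nothing with the result beyond this, a void method plus a log line inside the resolver gives the same feedback.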
Any reason we split GridMix.run() into GridMix.run() and GridMix.start() now?
The command-line parsing was becoming less trivial, but breaking it into a method called by run would have been uglier. Either way is fine w/ me; any preference?
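A stripped-down illustration of the split as described: run() only handles the command line and start() does the actual work. This is a sketch under assumptions, not the patch. The class name SketchGridmix, its fields, the two positional arguments, and the usage string are invented here; only the -users option comes from the discussion above.

```java
// Hypothetical driver showing argument parsing kept apart from the work itself.
class SketchGridmix {
  String userFile;   // value of -users, or null if not given
  String ioPath;     // assumed first positional argument
  String tracePath;  // assumed second positional argument
  boolean started;

  // Parses the command line, then hands off to start().
  int run(String[] args) {
    int i = 0;
    while (i < args.length && args[i].startsWith("-")) {
      if ("-users".equals(args[i]) && i + 1 < args.length) {
        userFile = args[i + 1];
        i += 2;
      } else {
        return usage();
      }
    }
    if (args.length - i != 2) {
      return usage();
    }
    ioPath = args[i];
    tracePath = args[i + 1];
    return start();
  }

  // Does the real work; in this sketch it only records that it was reached.
  int start() {
    started = true;
    return 0;
  }

  private int usage() {
    System.err.println("Usage: sketch [-users <file>] <iopath> <trace>");
    return 1;
  }
}
```

The alternative mentioned above would move the body of the while loop into a private parsing method called from run(), leaving a single entry point.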