I've seen the light, and attached is a patch that goes most of the way towards being a good child of AbstractJob. It refactors the command parsing out of main() into run() and also refactors the guts of the static runJob() into a non-static job(), which does all the heavy lifting. job() is also called from run(), so the whole thing hangs together pretty well, and none of the runJob() clients were impacted.
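To make the shape concrete, here is roughly the pattern (a simplified sketch with stand-in classes, not the patch itself; the real AbstractJob and its argument parsing are richer, and parseArguments() below is a toy):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for the AbstractJob contract: subclasses implement run().
abstract class SketchAbstractJob {
  public abstract int run(String[] args);
}

class SketchDriver extends SketchAbstractJob {

  @Override
  public int run(String[] args) {
    // Formerly done in main(): parse the command line into an option map.
    Map<String, String> parsed = parseArguments(args);
    // Delegate the heavy lifting to the shared job() method.
    return job(parsed.get("--input"), parsed.get("--output"));
  }

  // Non-static job() holds the guts that used to live in static runJob().
  int job(String input, String output) {
    // ... submit the MapReduce steps here ...
    return 0;
  }

  // Static runJob() is kept so existing clients are unaffected;
  // it now just instantiates the driver and calls job().
  public static int runJob(String input, String output) {
    return new SketchDriver().job(input, output);
  }

  // Toy parser: pairs up "--flag value" arguments. Stand-in only.
  private static Map<String, String> parseArguments(String[] args) {
    Map<String, String> m = new HashMap<>();
    for (int i = 0; i + 1 < args.length; i += 2) {
      m.put(args[i], args[i + 1]);
    }
    return m;
  }

  public static void main(String[] args) {
    System.exit(new SketchDriver().run(args));
  }
}
```

Both entry points converge on job(), which is what keeps the command-line path and the programmatic path from drifting apart.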
On DefaultOptionsCreator, I did some constant extraction and inlined any methods that were called by only one driver. The remaining options are shared, and I used addOption(DefaultOptionsCreator.blahOption().create()) rather than exploding them into the expanded addOption(...) form. I found a precedent for this in CollocDriver, and it seemed like a lot of busy work to refactor all the DefaultOptionsCreator usages, so I did not do that.
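For the record, the shared-option pattern looks roughly like this (a sketch with hypothetical mini classes; the real DefaultOptionsCreator builds on the commons-cli2 builders, and numClustersOption() here is just an illustrative example of the "blahOption()" shape):

```java
// Stand-in for a commons-cli2-style option builder.
class OptionBuilder {
  private String longName;
  private String description;

  OptionBuilder withLongName(String n) { this.longName = n; return this; }
  OptionBuilder withDescription(String d) { this.description = d; return this; }
  Opt create() { return new Opt(longName, description); }
}

// Stand-in for a built option.
class Opt {
  final String longName;
  final String description;
  Opt(String l, String d) { this.longName = l; this.description = d; }
}

// Sketch of the shared creator: one builder method per common option.
final class SketchOptionsCreator {
  static final String NUM_CLUSTERS_OPTION = "numClusters";

  // Each driver finishes the builder with .create(), optionally
  // customizing it (description, required flag, ...) first.
  static OptionBuilder numClustersOption() {
    return new OptionBuilder()
        .withLongName(NUM_CLUSTERS_OPTION)
        .withDescription("The number of clusters to create");
  }
}
```

A driver then calls addOption(SketchOptionsCreator.numClustersOption().create()), so the long name and description live in one place instead of being duplicated per driver.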
I also did not use AbstractJob.prepareJob() but left the individual conf and job initializations in place, since they are working; there was precedent for this too. Clustering has several job steps (e.g. runClustering) which do not use reducers, and prepareJob() doesn't work for them.
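For context, a map-only step like runClustering keeps its explicit wiring along these lines (a hedged configuration sketch, not the patch itself; ClusterMapper here is a stand-in name, and the key point is setNumReduceTasks(0), which prepareJob()'s mapper-plus-reducer signature doesn't accommodate):

```java
// Sketch of a map-only job step, assuming the Hadoop 0.20 mapreduce API.
Configuration conf = new Configuration();
Job job = new Job(conf, "cluster-classification");
job.setMapperClass(ClusterMapper.class);   // stand-in mapper class
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(VectorWritable.class);
job.setNumReduceTasks(0);                  // map-only: mapper output is written directly
FileInputFormat.addInputPath(job, new Path(input));
FileOutputFormat.setOutputPath(job, new Path(output));
job.waitForCompletion(true);
```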
In terms of feedback on the AbstractJob design, I find the need to create two constants unwieldy as in:
private static final String NUM_CLUSTERS_OPTION = "numClusters";
public static final String NUM_CLUSTERS_OPTION_KEY = "--" + NUM_CLUSTERS_OPTION;
since the options argument map is keyed by the option's long name with "--" prepended. My preference would be to omit the prefix, but I see the code uses Option.preferredName(), which returns the prefixed form and is not something we can modify.
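Concretely, the bare constant is needed when creating the option and the prefixed one when reading the parsed-argument map, which is why both exist (a self-contained sketch using the constants from the snippet above; numClusters() is a hypothetical helper):

```java
import java.util.Map;

class OptionKeys {
  // Bare long name, used when building the option.
  static final String NUM_CLUSTERS_OPTION = "numClusters";
  // Prefixed key, used when looking up the parsed value.
  static final String NUM_CLUSTERS_OPTION_KEY = "--" + NUM_CLUSTERS_OPTION;

  // Hypothetical helper: the parsed map is keyed by "--" + long name.
  static int numClusters(Map<String, String> parsedArgs) {
    return Integer.parseInt(parsedArgs.get(NUM_CLUSTERS_OPTION_KEY));
  }
}
```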
All the unit tests run, but they don't really exercise the command-line processing; the clustering tests call runJob() with Java arguments instead. I admit to being a bit on the lazy side about some possible further refactoring, but I think this is pretty close to the target.