After much fighting with input data and ordering, I have my first little improvement. I've started a WIP branch over on GitHub. I will regularly rewrite its history, but if you'd like to follow along, I'll take comments as they come. Once things take shape, I'll squash into a patch and attach it here.
The posted patch supports generating HFiles from a table defined using the HBaseStorageHandler. The next improvement is to rewrite the plan to introduce a step that invokes LoadIncrementalHFiles. After that, we can drop the need to specify hfile.family.path: detect the column family from the mapping attribute instead, and write the HFiles to a temporary location before loading.
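To make the column-family detection concrete, here is a rough sketch in Python rather than the Java the patch would use. It assumes the mapping attribute is the usual hbase.columns.mapping string (comma-separated entries such as ":key,cf:col"), and that the HFile output targets exactly one column family, which hfile.family.path seems to imply. The function names are hypothetical and are not taken from the patch.

```python
def column_families(columns_mapping):
    """Return the distinct column families named in a mapping string.

    Assumes the format ":key,family:qualifier,...". Entries that start
    with ":" (such as the row key) name no column family and are skipped.
    """
    families = []
    for entry in columns_mapping.split(","):
        entry = entry.strip()
        if not entry or entry.startswith(":"):
            continue
        # Everything before the first ":" is the column family.
        family = entry.split(":", 1)[0]
        if family not in families:
            families.append(family)
    return families


def single_family(columns_mapping):
    """Return the one column family, or fail if there is not exactly one.

    The single-family restriction is an assumption of this sketch.
    """
    families = column_families(columns_mapping)
    if len(families) != 1:
        raise ValueError(
            "expected exactly one column family, found: %r" % (families,)
        )
    return families[0]
```

With something like this, single_family(":key,cf:val") yields "cf", which could then name the family directory under the temporary HFile location in place of a user-supplied hfile.family.path.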