I agree that there are limitations in using annotations on the processors. I think that where the data is written should be decoupled from the processors. A processor knows how to process data, but it shouldn't also state where the data should be written. A generic processor like TsProcessor could be reused for different data types, each of which should be written to a different table/column family. Coupling the two with annotations makes this difficult: you end up with empty subclasses whose only purpose is to map different data types to tables/cfs via overridden annotations.
I suggest we externalize the table/cf mappings from the processors. Instead we could have something like an HBaseRouterFactory (or something better named) that the OutputCollector and the HBaseWriter interact with. HBaseRouterFactory would have a method that takes a dataType, and probably also a ChukwaRecord, and knows how to return the Table and ColumnFamily that the data should be written to.
We could then configure that dataType 'foo' should use BarProcessor and write to table 'bat', column family 'biz'.
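To make the idea concrete, here is a rough sketch of what such a router could look like. None of this is existing Chukwa code: RoutingSketch, Router, Route, register and routeFor are hypothetical names, and a plain Map stands in for ChukwaRecord so the example compiles on its own. It only illustrates keeping the dataType -> processor/table/cf mapping outside the processors.

```java
import java.util.HashMap;
import java.util.Map;

public class RoutingSketch {

    /** Hypothetical value type: the processor and destination configured for one data type. */
    static final class Route {
        final String processor;
        final String table;
        final String columnFamily;

        Route(String processor, String table, String columnFamily) {
            this.processor = processor;
            this.table = table;
            this.columnFamily = columnFamily;
        }
    }

    /** Hypothetical stand-in for the proposed HBaseRouterFactory. */
    static final class Router {
        private final Map<String, Route> routes = new HashMap<String, Route>();

        /** Registers the mapping for a data type; in practice this would be loaded from configuration. */
        void register(String dataType, String processor, String table, String columnFamily) {
            routes.put(dataType, new Route(processor, table, columnFamily));
        }

        /**
         * Returns the destination for a data type. The record argument is unused in this
         * sketch; it is only here because routing might also need to inspect the record
         * (a Map stands in for ChukwaRecord).
         */
        Route routeFor(String dataType, Map<String, String> record) {
            Route route = routes.get(dataType);
            if (route == null) {
                throw new IllegalArgumentException("No route configured for dataType: " + dataType);
            }
            return route;
        }
    }

    public static void main(String[] args) {
        Router router = new Router();
        // dataType 'foo' uses BarProcessor and is written to table 'bat', column family 'biz'.
        router.register("foo", "BarProcessor", "bat", "biz");
        // The same processor can serve another data type with a different destination,
        // without an empty subclass ('qux', 'other' and 'cf' are made-up names).
        router.register("qux", "BarProcessor", "other", "cf");

        Route r = router.routeFor("foo", new HashMap<String, String>());
        System.out.println(r.processor + " -> " + r.table + ":" + r.columnFamily);
    }
}
```

With something like this, the OutputCollector and HBaseWriter would ask the router for the destination instead of reading annotations off the processor class.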
I don't know how we'd configure 'foo's payload to be written to multiple cfs though. What's the use case for why we'd want to write the same data to two locations?
There's still a separate, unresolved problem of how to handle ORM-ish functionality as well, since reducing the many parameters in the record body back to a single 'body' field can be sub-optimal.