I'd like to avoid creating a ton of packages (or the tendency to accumulate them), since each module means a new jar. I see the split more as a rough separation of concerns (like how hadoop has dfs, mr, and common) than as a finer-grained separation of functionality (where hadoop-common has 20+ modules).
In the short to medium term, I would like to see the following packages materialize out of the existing single package:
- hbase-assemble - necessary for building
- hbase-common - common functionality used between the client and server
- hbase-client - functionality just for the client. A general hbase client would just need hbase-common and hbase-client to run
- hbase-server - all server side functionality, including regionserver and master (this could even be separated, but not necessarily)
Other potential things that came up earlier in the process that seemed useful:
- hbase-security - shouldn't be needed if we roll in security, but still an option
- hbase-it - for a single place for higher level integration tests (all those using the mini-cluster) to avoid the maven test-jar dependency issue discussed in
Any more granularity than these packages tends to be a bit of a mess and is rarely all that useful. Instead, a lot of the time it's better to just have a config option that specifies the right class and to load that class from the classpath. The jar approach is much more heavyweight and only useful for wholesale replacements for which there are multiple (possibly competing) implementations. For instance, async-hbase could roll up into an hbase-client.jar and be a drop-in replacement in your install, but you wouldn't have a whole log-cleaner jar just to switch which log cleaner class gets used.
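To make the "config option instead of a jar" point concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: the LogCleaner interface, the two implementations, and the "example.logcleaner.class" key are hypothetical and are not existing HBase names, and it uses plain java.util.Properties rather than the real configuration classes to stay self-contained.

```java
import java.util.Properties;

public class PluggableCleanerSketch {

    /** Hypothetical plug-in point; not an actual HBase interface. */
    public interface LogCleaner {
        boolean isDeletable(String logName);
    }

    /** Hypothetical default implementation: never deletes anything. */
    public static class KeepAllCleaner implements LogCleaner {
        @Override
        public boolean isDeletable(String logName) {
            return false;
        }
    }

    /** Hypothetical alternative: deletes logs whose name ends in ".old". */
    public static class SuffixCleaner implements LogCleaner {
        @Override
        public boolean isDeletable(String logName) {
            return logName.endsWith(".old");
        }
    }

    /** Hypothetical configuration key, invented for this sketch. */
    public static final String CLEANER_KEY = "example.logcleaner.class";

    /** Reads the class name from the config and instantiates it from the classpath. */
    public static LogCleaner loadCleaner(Properties conf) throws ReflectiveOperationException {
        String className = conf.getProperty(CLEANER_KEY, KeepAllCleaner.class.getName());
        // asSubclass fails fast if the configured class is not a LogCleaner.
        Class<? extends LogCleaner> clazz =
                Class.forName(className).asSubclass(LogCleaner.class);
        return clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        // No key set: falls back to the default implementation.
        System.out.println(loadCleaner(conf).isDeletable("wal.old")); // prints "false"

        // Swapping the implementation is a one-line config change, no new jar.
        conf.setProperty(CLEANER_KEY, SuffixCleaner.class.getName());
        System.out.println(loadCleaner(conf).isDeletable("wal.old")); // prints "true"
    }
}
```

The implementation class can live in any jar already on the classpath, so switching it needs no new module. A separate jar only starts to pay off when the whole artifact is being replaced, as in the async-hbase case.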