Yes, I meant that mahout-writables would not depend on mahout-core. If that requires more work on Cluster first (or just leaving the ClusterWritables in core and pulling them out later, once the surgery is complete), then so be it.
The dependency on Hadoop is huge, yes, but if you're running on Hadoop (which would be the case if mahout-writables is the package in question), then you already depend on it; that's a given.
No, jar size in MB is not what matters here. The question is one of runtime dependencies, and I guess we're just misunderstanding each other because I'm not arguing for the original git branch Ted made, but for the end goal: what things would look like once Cluster is removed. In my view, the next patch to post should be what you get if you pull out all of the easy *Writables (i.e. everything except Cluster, I guess?) as a first pass, leaving Cluster in core.
Personally, I think that would be a positive first step: a) it creates a place for Writables to go from now on, and b) it provides a dependency that knows how to deal with many of Mahout's common serialized objects. Step 2 would be further iterations to get all of the remaining Writables out of core and into this new package.
I don't think Steps 1 and 2 need to be done at the same time, however.
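
To make the dependency point concrete, here is a rough sketch of the kind of class I'd expect to end up in the new package. Everything in it is made up for illustration: `WritablesSketch` and `DoubleArrayWritable` are hypothetical names, not existing Mahout classes, and the nested `Writable` interface is only a stand-in that mirrors the two methods of `org.apache.hadoop.io.Writable`, so the snippet compiles without Hadoop on the classpath. In the real module the class would presumably implement Hadoop's interface directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WritablesSketch {

  // Stand-in (assumption) for org.apache.hadoop.io.Writable, which declares
  // the same two methods. Used here only to keep the sketch self-contained.
  interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  // Hypothetical "easy" writable: it needs only JDK types plus the Writable
  // contract, and imports nothing from mahout-core.
  static class DoubleArrayWritable implements Writable {
    private double[] values = new double[0];

    DoubleArrayWritable() { }

    DoubleArrayWritable(double[] values) {
      this.values = values.clone();
    }

    double[] get() {
      return values.clone();
    }

    @Override
    public void write(DataOutput out) throws IOException {
      // Length prefix first, then the raw doubles.
      out.writeInt(values.length);
      for (double v : values) {
        out.writeDouble(v);
      }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      int n = in.readInt();
      values = new double[n];
      for (int i = 0; i < n; i++) {
        values[i] = in.readDouble();
      }
    }
  }

  // Serializes the array to bytes and reads it back into a fresh instance.
  public static double[] roundTrip(double[] input) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      new DoubleArrayWritable(input).write(new DataOutputStream(bytes));

      DoubleArrayWritable copy = new DoubleArrayWritable();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy.get();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(roundTrip(new double[] {1.0, 2.5, -3.0})));
  }
}
```

A class like this is what I mean by the "easy" ones: nothing in it reaches back into core, so it can move in the first pass. The ClusterWritables are the opposite case, which is why I'd leave them in core for now.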