OK, it seems to be the intermittent build artifacts that are getting that big. At any rate, the new Mahout package is about 200MB. As far as I can see, the main reason is that the package piles up every single dependency declared in the Mahout build. I see protobuf, commons-logging, guava, servlet-api, xpp, xstream, etc. Most of these dependencies already exist in Hadoop, HBase, or elsewhere. The Spark build used to have the same problem until I fixed their Maven assembly to avoid redistributing everything. My special grudge is with easymock: a test-only library doesn't seem to belong in the product package.
I would suggest we use the current build as is, just to unblock the Bigtop release, and fix it later, unless someone wants to give it a spin and add logic to the package creation script that removes redundant dependencies and uses (or links to) the ones from other packages.
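For whoever picks this up, here is a rough sketch of the "remove and link" idea. It is only an illustration, not what the Bigtop packaging scripts do today: the function name and both directory arguments are hypothetical, and it assumes a jar counts as redundant only when a jar with exactly the same file name (version included) exists in the other package's lib directory.

```python
import os


def replace_duplicates_with_links(package_lib, provider_lib):
    """Replace jars in package_lib that also exist in provider_lib with symlinks.

    Both arguments are hypothetical directory paths, e.g. the Mahout lib
    directory and the Hadoop lib directory. Matching is by exact file name,
    so a different version of the same library is left untouched.
    Returns the sorted list of jar names that were replaced.
    """
    provided = {n for n in os.listdir(provider_lib) if n.endswith(".jar")}
    replaced = []
    for name in sorted(os.listdir(package_lib)):
        path = os.path.join(package_lib, name)
        # Skip jars the other package does not ship, and jars already linked.
        if name in provided and not os.path.islink(path):
            os.remove(path)
            os.symlink(os.path.join(provider_lib, name), path)
            replaced.append(name)
    return replaced
```

Exact-name matching is deliberately conservative: it never swaps in a different version of a library, at the cost of leaving some near-duplicates in place. The package would also need to declare a dependency on whatever package provides the link targets.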
Andrew Musselman, David Standish - is it possible to address this issue in a subsequent release of Mahout, so we can trim the package to a reasonable size?