BIGTOP-1271 adds patterns to the data that guarantee that a meaningful product recommendation can be given for at least some customers. Because the dataset generator draws purchases from a Gaussian distribution, we know there will be many customers who bought only 1 product, as well as customers who bought 2 or more products, even in a dataset of size 10.
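To illustrate the claim about purchase counts, here is a toy simulation. It is not the BigPetStore generator: the function names, the mean of 2.0, the standard deviation of 1.0, and the floor of one purchase per customer are all assumptions made for the sketch.

```python
import random


def simulate_purchase_counts(num_customers, mean=2.0, stddev=1.0, seed=0):
    """Toy model (assumed parameters): one Gaussian draw per customer,
    rounded to an integer and floored at 1 purchase."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    counts = []
    for _ in range(num_customers):
        draw = rng.gauss(mean, stddev)
        counts.append(max(1, round(draw)))
    return counts


def split_by_purchases(counts):
    """Return (customers with exactly 1 purchase, customers with 2 or more)."""
    single = sum(1 for c in counts if c == 1)
    multi = sum(1 for c in counts if c >= 2)
    return single, multi


if __name__ == "__main__":
    counts = simulate_purchase_counts(10)
    single, multi = split_by_purchases(counts)
    print("purchases per customer:", counts)
    print("single-purchase customers:", single)
    print("multi-purchase customers:", multi)
```

Under these assumed parameters, roughly a third of customers land in the single-purchase group, so small datasets will usually, though not always, contain both groups. The real generator may use different parameters or add the BIGTOP-1271 patterns on top.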
The current Mahout recommender code is statically valid: it runs to completion in local unit tests if a Hadoop 1x tarball is present, but otherwise it hasn't been tested at scale. So, let's get it working. This JIRA will also comprise:
- Deciding whether to use Mahout 2x for unit tests (the default on the Mahout Maven repo is the 1x impl), and whether or not Bigtop should host a Mahout 2x jar. After all, Bigtop builds a Mahout 2x jar as part of its packaging process, and BigPetStore might thus need a Mahout 2x jar in order to test against the right set of Bigtop releases.