- We've already seen that the Mahout smoke tests are fragile because they require many external input data sets.
- Also, in BigPetStore (BIGTOP-1089), we are building custom fake data generators so that we can build arbitrarily large data sets of customer transactions with patterns in them.
So, let's either (1) build a framework or (2) adopt one that is modular enough to extend for different smoke test scenarios.
- VM tests can run the exact same smoke tests that real tests run, and just generate smaller input data sets. Right now, we can't do this with static external data sets.
- We can start eliminating the fragile external dependencies of smoke tests (i.e. the Mahout ones) and replace them with our own data sets generated on the fly, with no need to wget them from third parties.
- BigPetStore can focus on demoing the Bigtop-based Hadoop ecosystem deployment, rather than on generating data.
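
As a rough illustration of the idea above, here is a minimal sketch of a seeded generator whose output size is just a parameter, so a VM run and a full cluster run could share the same smoke test and differ only in how much data they ask for. Everything in it is hypothetical (the `Transaction` layout, the `PRODUCTS` catalog, the `generate_transactions` function, the 70% "favorite product" pattern); it is not the BigPetStore generator and not a proposal for the framework's actual API.

```python
import random
from typing import Iterator, NamedTuple

# Hypothetical product catalog and record layout, for illustration only.
PRODUCTS = ["dog-food", "cat-litter", "fish-flakes", "bird-seed"]


class Transaction(NamedTuple):
    customer_id: int
    product: str
    quantity: int


def generate_transactions(count: int, customers: int = 10,
                          seed: int = 42) -> Iterator[Transaction]:
    """Yield `count` fake transactions; identical arguments give identical data."""
    rng = random.Random(seed)  # private RNG so runs are reproducible
    for _ in range(count):
        customer_id = rng.randrange(customers)
        # Embedded pattern (an assumption of this sketch): each customer has
        # a favorite product that is chosen about 70% of the time.
        favorite = PRODUCTS[customer_id % len(PRODUCTS)]
        product = favorite if rng.random() < 0.7 else rng.choice(PRODUCTS)
        yield Transaction(customer_id, product, rng.randint(1, 5))


# A VM-sized data set; a cluster-sized run would only change `count`.
small = list(generate_transactions(100))
```

Because the generator is seeded, a smoke test can assert on properties of the generated data without shipping or downloading a static input file.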