Bigtop / BIGTOP-1212

Pick or build a framework for building fake data sets

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.8.0
    • Component/s: blueprints
    • Labels: None

      Description

      • We've already seen that the Mahout smoke tests are fragile because they require many external input data sets.
      • Also, in BigPetStore (BIGTOP-1089), we are building custom fake data generators so that we can build arbitrarily large data sets of customer transactions with patterns in them.

      So let's either (1) build a framework or (2) adopt one that is modular enough to extend for different smoke test scenarios.

      ADVANTAGES:

      • VM tests can run the exact same smoke tests that real tests run, and just generate smaller input data sets. Right now, we can't do this with static external data sets.
      • We can start eliminating fragile external dependencies of smoke tests (e.g. the Mahout ones) and replace them with our own data sets generated on the fly, with no need to wget them from third parties.
      • BigPetStore can focus on demoing the Bigtop-based Hadoop ecosystem deployment, rather than on generating data.
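
      As a rough illustration of the idea, here is a minimal sketch of a size-parameterized, seeded generator. It is not the framework this issue settled on; the names `Transaction`, `generate_transactions`, and `PRODUCTS`, and the "favorite product" pattern, are all hypothetical and chosen only for the example.

```python
import random
from dataclasses import dataclass
from typing import Iterator

# Hypothetical product catalog; not taken from BigPetStore.
PRODUCTS = ["dog food", "cat litter", "fish flakes", "bird seed"]


@dataclass(frozen=True)
class Transaction:
    customer_id: int
    product: str
    quantity: int


def generate_transactions(
    n: int, n_customers: int = 10, seed: int = 0
) -> Iterator[Transaction]:
    """Yield n fake transactions.

    The same (n, n_customers, seed) always produces the same data,
    so a test can be rerun without storing or downloading a data set.
    """
    rng = random.Random(seed)  # private RNG, independent of global state
    for _ in range(n):
        customer_id = rng.randrange(n_customers)
        # Embedded pattern (an assumption for this sketch): each customer
        # has one favorite product and picks it deliberately 70% of the time.
        favorite = PRODUCTS[customer_id % len(PRODUCTS)]
        product = favorite if rng.random() < 0.7 else rng.choice(PRODUCTS)
        yield Transaction(customer_id, product, rng.randint(1, 5))
```

      With something like this, a VM smoke test could call `generate_transactions(100)` and a cluster test could ask for millions of rows, both exercising the same code path with no external download.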

            People

            • Assignee: Unassigned
            • Reporter: jay vyas
            • Votes: 0
            • Watchers: 3