  Hadoop Common
  HADOOP-5621

MapReducer to run junit tests under Hadoop

      This is something I mentioned to some people last week; I thought I would start a discussion on it.

      We could run JUnit tests as a MapReduce job. The pieces would be:

      1. A mapper that takes a list of classes, one per line.
      2. The mapper extracts the test suite from each class, and then invokes each test method. This would be a new JUnit test runner.
      3. It saves the result (and any exceptions) as the output, along with any machine-specific details.
      4. It also needs to grab the System.out and System.err channels, to map them to specific tests.
      5. It measures how long the tests took (including setup/teardown time).
      6. Add an Ant task <listresources> to take filesets and other patterns, and generate text files from the contents (with stripping of prefixes and suffixes, directory separator substitution, file begin/end values, etc.). I have this with tests already.
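
      As a rough illustration of steps 2 to 5, here is a minimal sketch of what the per-class work inside the mapper might look like. It uses only plain reflection rather than the real JUnit runner API, and everything in it (TestMethodRunner, Result, SampleTest, the "public no-arg method whose name starts with test" rule) is a made-up assumption for discussion. It skips setUp/tearDown and the machine details entirely.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a per-method runner; this is not the JUnit runner API. */
class TestMethodRunner {

    /** One record per test method: outcome, captured output, elapsed time. */
    public static final class Result {
        public final String className, methodName, error, stdout, stderr;
        public final long millis;

        Result(String className, String methodName, String error,
               String stdout, String stderr, long millis) {
            this.className = className;
            this.methodName = methodName;
            this.error = error;     // null when the method returned normally
            this.stdout = stdout;
            this.stderr = stderr;
            this.millis = millis;
        }

        public boolean passed() { return error == null; }
    }

    /** Runs every public, non-static, no-arg method named test* in the class. */
    public static List<Result> runClass(String className) {
        Class<?> clazz;
        try {
            clazz = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("No such test class: " + className, e);
        }
        List<Result> results = new ArrayList<>();
        for (Method m : clazz.getMethods()) {
            boolean looksLikeTest = m.getName().startsWith("test")
                    && m.getParameterCount() == 0
                    && !Modifier.isStatic(m.getModifiers());
            if (looksLikeTest) results.add(runMethod(clazz, m));
        }
        return results;
    }

    /** Runs one method on a fresh instance, redirecting System.out/err while it runs. */
    static Result runMethod(Class<?> clazz, Method m) {
        PrintStream oldOut = System.out, oldErr = System.err;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream err = new ByteArrayOutputStream();
        String error = null;
        long start = System.nanoTime();
        try {
            System.setOut(new PrintStream(out, true));
            System.setErr(new PrintStream(err, true));
            Object instance = clazz.getDeclaredConstructor().newInstance();
            m.invoke(instance);
        } catch (InvocationTargetException e) {
            error = String.valueOf(e.getCause());   // the exception the test threw
        } catch (ReflectiveOperationException e) {
            error = e.toString();                   // could not construct or invoke
        } finally {
            System.setOut(oldOut);                  // always restore the real streams
            System.setErr(oldErr);
        }
        long millis = (System.nanoTime() - start) / 1_000_000;
        return new Result(clazz.getName(), m.getName(), error,
                out.toString(), err.toString(), millis);
    }

    /** Made-up test class, only here so the sketch has something to run. */
    public static class SampleTest {
        public void testPasses() { System.out.println("hello from testPasses"); }
        public void testFails() {
            System.err.println("about to fail");
            throw new IllegalStateException("boom");
        }
    }
}
```

      Note that System.setOut/System.setErr are process-wide, so this only maps output to a specific test if one test runs at a time per JVM.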

      The result would be that you could point listresources at a directory tree and create a text file listing all tests to run. These could be executed across multiple hosts and the results correlated. It would be, initially, a MapExpand, as the output would be bigger than the input.
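
      To make the prefix/suffix stripping and separator substitution concrete, here is a small sketch of turning file paths into class names, one per line of the job's input. This is not the listresources task itself; ClassNameLister, toClassNames and the example paths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: converts compiled-class file paths to class names. */
class ClassNameLister {

    /**
     * Keeps only paths that start with prefix and end with suffix, strips both,
     * and replaces directory separators with dots.
     */
    public static List<String> toClassNames(List<String> paths, String prefix, String suffix) {
        List<String> names = new ArrayList<>();
        for (String path : paths) {
            String p = path.replace('\\', '/');   // normalise Windows separators
            boolean matches = p.startsWith(prefix) && p.endsWith(suffix)
                    && p.length() >= prefix.length() + suffix.length();
            if (!matches) continue;               // not a class under this root
            String stripped = p.substring(prefix.length(), p.length() - suffix.length());
            names.add(stripped.replace('/', '.'));
        }
        return names;
    }
}
```

      For example, with the prefix "build/test/classes/" and the suffix ".class", the made-up path "build/test/classes/org/example/TestFoo.class" becomes "org.example.TestFoo", and a properties file in the same tree is skipped.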

      Feature creep then becomes the analysis:

      1. Add another MR class which runs through all failing tests and creates a new list of test classes that failed. This list could be rescheduled on different runs, and makes for a faster cycle (only run failing tests until they work).
      2. Add something to only get failing tests, and summarise them (somehow) in a user-readable form.
      3. Something to get partially failing tests and highlight machine differences.
      4. Add something to compare tests over time, to detect those which are getting slower?
      5. An MR job to regenerate the classic Ant JUnit XML reports, for presentation in other tools (like Hudson).
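
      For item 1, the core logic is small once the result format is fixed. The sketch below assumes, purely for illustration, that each result is a tab-separated line of class name, method name and status, with "PASS" meaning success; that record format and the FailingClassLister name are assumptions, not anything that exists yet. In a real job this would be split between a mapper that emits the class name for each non-passing record and a reducer that emits each class once.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: derive the list of classes to re-run from result records. */
class FailingClassLister {

    /**
     * Each line is assumed to be "className<TAB>methodName<TAB>status".
     * Returns the distinct classes with at least one non-PASS method, sorted by name.
     */
    public static Set<String> failingClasses(List<String> resultLines) {
        Set<String> failing = new TreeSet<>();
        for (String line : resultLines) {
            String[] fields = line.split("\t");
            if (fields.length < 3) continue;      // skip malformed records
            if (!"PASS".equals(fields[2])) {
                failing.add(fields[0]);           // the set removes duplicates
            }
        }
        return failing;
    }
}
```

      The output has the same one-class-per-line shape as the original input, so it can be fed straight back in as the next run's test list.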


            Steve Loughran
