Description
While testing ACCUMULO-2005, I had unusually fast MR test runs.
Drilling into individual task results showed the run was actually failing while reporting the map task as successful.
2014-03-05 10:52:27,755 INFO org.apache.accumulo.server.test.functional.RunTests: Running test [/usr/bin/python, test/system/auto/run.py, -m, -f, 1, -t, simple.addSplit.AddSplitTest]
2014-03-05 10:52:27,788 INFO org.apache.accumulo.server.test.functional.RunTests: More: Traceback (most recent call last):
2014-03-05 10:52:27,788 INFO org.apache.accumulo.server.test.functional.RunTests: More: File "test/system/auto/run.py", line 29, in <module>
2014-03-05 10:52:27,788 INFO org.apache.accumulo.server.test.functional.RunTests: More: from TestUtils import ACCUMULO_HOME, ACCUMULO_DIR, COBERTURA_HOME, findCoberturaJar
2014-03-05 10:52:27,788 INFO org.apache.accumulo.server.test.functional.RunTests: More: ImportError: No module named TestUtils
2014-03-05 10:52:27,798 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2014-03-05 10:52:27,811 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
2014-03-05 10:52:27,814 INFO org.apache.hadoop.mapred.Task: Task:attempt_201401282254_0002_m_000003_0 is done. And is in the process of commiting
2014-03-05 10:52:27,949 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201401282254_0002_m_000003_0' done.
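The current test-running class ignores the return code of the test process (ref), so a test script that crashes on startup, as in the ImportError above, still yields a successful map task.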
Instead, we should check the status and fail the task if it returns an error.
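A minimal sketch of that check, assuming the test is launched as a child process. This is not the actual RunTests code: the class name ExitCodeCheck and the method runOrFail are hypothetical, and only standard JDK calls are used. In a mapper, letting the exception propagate out of map() would cause the task attempt to fail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.List;

// Hypothetical helper, not part of Accumulo: runs a command and
// treats any non-zero exit status as a failure.
public class ExitCodeCheck {

    static void runOrFail(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so tracebacks are captured too.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        // Drain the output fully before waiting, so the child cannot
        // block on a full pipe buffer.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("More: " + line);
            }
        }

        // The step the description says is missing: inspect the exit status.
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(
                "Test process " + command + " exited with status " + exitCode);
        }
    }
}
```

With a check like this, the ImportError shown in the log would surface as a failed task rather than a fast, apparently successful run.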
Workaround: Job counters should show Success / Failure / Error counts for tests. If none of the counters appear, consider all tests failed.