Description
The current BigTop test framework is limited in its ability to handle dynamically generated data. Its flexibility can be improved.
For org.apache.bigtop.itest.hadoopexamples.TestHadoopExamples
Limitation: anyone who wants to change a test needs to modify
./bigtop-tests/test-artifacts/hadoop/src/main/groovy/org/apache/bigtop/itest/hadoopexamples/TestHadoopExamples.groovy, which must then be recompiled before the tests can run.
For org.apache.bigtop.itest.hadooptests.TestTestCLI, the configuration file,
./build/hadoop/deb/hadoop-1.0.1/src/test/org/apache/hadoop/cli/testConf.xml, has entries like the following:
<test> <!-- TESTED -->
<description>ls: file using relative path</description>
<test-commands>
<command>-fs NAMENODE -touchz file1</command>
<command>-fs NAMENODE -ls file1</command>
</test-commands>
<cleanup-commands>
<command>-fs NAMENODE -rm file1</command>
</cleanup-commands>
<comparators>
<comparator>
<type>TokenComparator</type>
<expected-output>Found 1 items</expected-output>
</comparator>
<comparator>
<type>RegexpComparator</type>
<expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/user/[a-z]*/file1</expected-output>
</comparator>
</comparators>
</test>
Limitation: Writing down an expected-output and then performing a string comparison works well for static results, but it is not flexible enough to handle dynamically generated data. For example, suppose a program randomly generates key/value pairs and then submits an M/R job to calculate the sum (or average) for each key. The result cannot be calculated in advance and written down as the expected-output.
I am proposing an improvement to BigTop's integration tests. We can put all test cases in an XML file that contains a list of command-sets; each command-set has a command, a command-comparator-type, and a command-comparator-compare-to. The command is a hadoop/hbase/hive command; command-comparator-type specifies the Java class that performs the comparison; command-comparator-compare-to specifies the shell command that generates the expected output.
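A minimal sketch of how a single command-set could be evaluated under this proposal: run the command, run the compare-to shell command, and hand both outputs to a comparator. The class name `CommandSetCheck`, its methods, and the "ignore all whitespace" comparison rule are assumptions made for illustration; they are not part of BigTop or of any existing comparator class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a BigTop API: evaluates one command-set.
public class CommandSetCheck {

    /** Runs a command line through `sh -c` and returns its standard output. */
    public static String run(String commandLine) {
        try {
            ProcessBuilder builder = new ProcessBuilder("sh", "-c", commandLine);
            // Keep stderr out of the text that gets compared.
            builder.redirectError(ProcessBuilder.Redirect.INHERIT);
            Process process = builder.start();
            String output;
            try (InputStream stdout = process.getInputStream()) {
                output = new String(stdout.readAllBytes(), StandardCharsets.UTF_8);
            }
            process.waitFor();
            return output;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running: " + commandLine, e);
        }
    }

    /** Assumed comparison rule: the outputs must be equal once all whitespace is removed. */
    public static boolean matchesIgnoringWhitespace(String actual, String expected) {
        return actual.replaceAll("\\s+", "").equals(expected.replaceAll("\\s+", ""));
    }

    /** The expected output is generated at test time by the compare-to command. */
    public static boolean check(String command, String compareTo) {
        return matchesIgnoringWhitespace(run(command), run(compareTo));
    }

    public static void main(String[] args) {
        // Stand-in commands; a real suite would run a hadoop job and a shell pipeline here.
        System.out.println(check("printf 'a\\t3\\n'", "echo 'a 3'")); // prints "true"
    }
}
```

Because the expected value comes from a second command rather than a literal string, the same check works when the input data is generated randomly on each run.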
Three example cases follow:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bigtop-itest-suite>
<bigtop-itest-suite-test>
<test-name>Calculate summation in MR</test-name>
<test-desc>Here is a simple MR test to calculate a sum</test-desc>
<test-pre-integration-test>
</test-pre-integration-test>
<test-integration-test>
<command-set>
<command>hadoop jar ./target/LeiBigTop-1.1.jar com.lei.bigtop.hadoop.calsum.CalSum ./data ./output</command>
<command-comparator-type>com.lei.bigtop.hadoop.integration.test.ExtactComparatorIgnoreWhiteSpace</command-comparator-type>
<command-comparator-compare-to><![CDATA[ cat ./output/* ]]></command-comparator-compare-to>
</command-set>
</test-integration-test>
<test-post-integration-test>
</test-post-integration-test>
</bigtop-itest-suite-test>
<bigtop-itest-suite-test>
<test-name>calculate pi</test-name>
<test-desc>calculate pi using hadoop MR</test-desc>
<test-pre-integration-test>
</test-pre-integration-test>
<test-integration-test>
<command-set>
<command>hadoop jar $HADOOP_HOME/hadoop-examples-0.*.0.jar pi 5 5</command>
<command-comparator-type>org.apache.hadoop.cli.util.SubstringComparator</command-comparator-type>
<command-comparator-compare-to><![CDATA[echo "Pi is 3.68"]]></command-comparator-compare-to>
</command-set>
</test-integration-test>
<test-post-integration-test>
</test-post-integration-test>
</bigtop-itest-suite-test>
<bigtop-itest-suite-test>
<test-name>count word in MR</test-name>
<test-desc>count word in Hadoop MR</test-desc>
<test-pre-integration-test>
<command-set><command>rm -rf ./wordcount</command></command-set>
<command-set><command>rm -rf ./wordcount_out</command></command-set>
<command-set><command>mkdir ./wordcount</command></command-set>
<command-set><command><![CDATA[curl http://www.meetup.com/HandsOnProgrammingEvents/events/53837022/ | sed -e :a -e 's/<[^>]*>//g;/</N;//ba' | sed 's/ //g' | sed 's/^[ \t]*//;s/[ \t]*$//' | sed '/^$/d' | sed '/"http[^"]*"/d' > ./wordcount/content]]></command></command-set>
<command-set><command>hadoop fs -mkdir /wordcount</command></command-set>
<command-set><command>hadoop fs -put ./wordcount/* /wordcount</command></command-set>
</test-pre-integration-test>
<test-integration-test>
<command-set><command>hadoop jar $HADOOP_HOME/hadoop-examples-0.*.0.jar wordcount /wordcount /wordcount_out</command></command-set>
<command-set><command>mkdir ./wordcount_out</command></command-set>
<command-set><command>hadoop fs -get /wordcount_out/* ./wordcount_out</command></command-set>
<command-set><command>hadoop fs -rmr /wordcount</command></command-set>
<command-set><command>hadoop fs -rmr /wordcount_out/</command></command-set>
</test-integration-test>
<test-post-integration-test>
<command-set>
<command>cat ./wordcount_out/* | grep Roman | sed 's/[^0-9.]*\([0-9.]*\).*/\1/'</command>
<command-comparator-type>com.lei.bigtop.hadoop.integration.test.ExtactComparatorIgnoreWhiteSpace</command-comparator-type>
<command-comparator-compare-to><![CDATA[cat wordcount/* | grep -c Roman]]></command-comparator-compare-to>
</command-set>
</test-post-integration-test>
</bigtop-itest-suite-test>
</bigtop-itest-suite>
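As a rough illustration of how a runner might load such a suite file, the sketch below reads every command-set with the JDK's DOM parser. The `SuiteReader` and `CommandSet` names are hypothetical, and treating a command-set without a comparator as a plain setup or cleanup step is an assumption based on the third example above, not something the proposal states.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the proposed suite format; not an existing BigTop class.
public class SuiteReader {

    /** One command-set; comparatorType and compareTo are null when the element is absent. */
    public static final class CommandSet {
        public final String command;
        public final String comparatorType;
        public final String compareTo;

        CommandSet(String command, String comparatorType, String compareTo) {
            this.command = command;
            this.comparatorType = comparatorType;
            this.compareTo = compareTo;
        }
    }

    /** Returns the trimmed text (CDATA included) of the first matching child, or null. */
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Collects all command-set elements in document order. */
    public static List<CommandSet> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("command-set");
            List<CommandSet> sets = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                sets.add(new CommandSet(
                        childText(element, "command"),
                        childText(element, "command-comparator-type"),
                        childText(element, "command-comparator-compare-to")));
            }
            return sets;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse suite XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bigtop-itest-suite><bigtop-itest-suite-test><test-integration-test>"
                + "<command-set><command>hadoop fs -mkdir /wordcount</command></command-set>"
                + "<command-set><command>hadoop jar example.jar pi 5 5</command>"
                + "<command-comparator-type>org.apache.hadoop.cli.util.SubstringComparator"
                + "</command-comparator-type>"
                + "<command-comparator-compare-to><![CDATA[echo \"Pi is 3.68\"]]>"
                + "</command-comparator-compare-to></command-set>"
                + "</test-integration-test></bigtop-itest-suite-test></bigtop-itest-suite>";
        for (CommandSet set : read(xml)) {
            // Assumption: no comparator means the command is only a step, not a check.
            String kind = set.comparatorType == null ? "step" : "check";
            System.out.println(kind + ": " + set.command);
        }
    }
}
```

A fuller runner would also need to keep track of which phase (pre-integration, integration, post-integration) each command-set belongs to; this sketch flattens them for brevity.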