BIGTOP-769

Create a generic shell executor iTest driver

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 0.4.0
    • Fix Version/s: backlog
    • Component/s: Tests
    • Labels: None

      Description

      It would be nice to have a way of generically wrapping up shell-based tests in the iTest framework.

      I imagine a pretty simple implementation (at least initially) where, on the iTest side, we'd have a parameterized test suite that looks inside a specific location under resources and instantiates one test per shell script it finds there (subject to include/exclude filtering constraints). The tests would then be exec'ed one by one inside a pre-set UNIX environment (no parallel execution for now). If the shell returns 0 the test passes; if it returns non-zero it fails (and stderr/stdout get captured).
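
      To make that concrete, here is a minimal sketch (not an actual Bigtop API) of what such a parameterized driver could look like in Groovy with JUnit 4; the resource location, the *.sh filter, and the environment entry are illustrative assumptions:

      import org.junit.Test
      import org.junit.runner.RunWith
      import org.junit.runners.Parameterized

      @RunWith(Parameterized.class)
      class ShellScriptSuite {
        File script

        ShellScriptSuite(File script) { this.script = script }

        // one generated test case per shell script found under the (assumed) resources location
        @Parameterized.Parameters
        static Collection<Object[]> scripts() {
          def dir = new File('src/test/resources/shell-tests')   // hypothetical location
          (dir.listFiles({ File f -> f.name.endsWith('.sh') } as FileFilter) ?: [])
              .collect { [it] as Object[] }
        }

        @Test
        void runScript() {
          def pb = new ProcessBuilder('bash', script.absolutePath)
          pb.environment().put('HADOOP_CONF_DIR', '/etc/hadoop/conf')   // example pre-set environment entry
          pb.redirectErrorStream(true)                                  // capture stderr along with stdout
          def proc = pb.start()
          def output = proc.inputStream.text
          int rc = proc.waitFor()
          // exit code 0 passes; anything else fails the test and surfaces the captured output
          assert rc == 0 : "script ${script.name} failed (exit ${rc}):\n${output}"
        }
      }

      Run under Maven's test phase, this would report one pass/fail result per script, matching the one-test-per-script behaviour described above.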

      Finally, I don't have a good answer yet for what the contract for the environment should be, so I'd like folks to chime in with suggestions. We could probably start by populating it with ALL of the properties extracted from the Hadoop config files (core-site.xml, hdfs-site.xml, etc.), with obvious transformations (fs.default.name becomes FS_DEFAULT_NAME, etc.). Or we could have a manifest of what's allowed and what tests can rely on.
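
      As a sketch of the naming transformation only (how the properties are extracted from core-site.xml, hdfs-site.xml, etc. is left open), the mapping could be as simple as:

      String toEnvName(String hadoopProperty) {
        // fs.default.name -> FS_DEFAULT_NAME, dfs.webhdfs.enabled -> DFS_WEBHDFS_ENABLED
        hadoopProperty.replaceAll(/[.\-]/, '_').toUpperCase()
      }

      assert toEnvName('fs.default.name') == 'FS_DEFAULT_NAME'
      assert toEnvName('dfs.webhdfs.enabled') == 'DFS_WEBHDFS_ENABLED'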

        Activity

        jay vyas added a comment -

        One really simple way to do this, without adding too much extra stuff to Bigtop, is to use a custom test-execution pom file. We do this for Pig, and copy it in at runtime.

        It simply uses the hooks that the gmaven plugin provides for Maven.

        <plugin>
          <groupId>org.codehaus.groovy.maven</groupId>
          <artifactId>gmaven-plugin</artifactId>
          <version>1.0</version>
          <executions>
            <execution>
              <id>check-testslist</id>
              <phase>verify</phase>
              <goals>
                <goal>execute</goal>
              </goals>
              <configuration>
                <source><![CDATA[
                  import org.apache.bigtop.itest.*
                  import org.apache.bigtop.itest.shell.*

                  Shell sh = new Shell();
                  sh.exec("pwd > /tmp/hereiampig");
                  // run the Pig test scripts and record the exit code
                  sh.exec("cd ./pigtests && source ./test.sh");
                  pass = (sh.ret == 0);
                  ret = sh.ret;
                  sh.exec("echo " + sh.ret + " > /tmp/pigtestret");
                  if (!pass) {
                    throw new RuntimeException("Exit code -> ${pass} - ${ret} :::");
                  }
                ]]></source>
              </configuration>
            </execution>
          </executions>
        </plugin>

        I was actually thinking of putting it in as a patch, because the Bigtop smokes for Pig only run on Hadoop 2.x, and you need a shell test or some other custom test for Pig if you want to test releases before version 11, where the integration testing was embedded into the Pig source code.

        Wing Yew Poon added a comment -

        Johnny, this proposal is simply to offer an easy-to-use way of executing shell-script-based tests, much like HiveBulkScriptExecutor is a tool for executing Hive query scripts and TestHiveSmokeBulk is a parameterized JUnit test (written in Groovy) that uses HiveBulkScriptExecutor to actually execute the tests.
        So, in order to execute the tests within the iTest framework, you'd still write a (parameterized) JUnit test (in Groovy or Java) that uses the shell script executor. On the other hand, the shell scripts themselves could be written in such a way that they could be run directly in a shell (i.e., outside of iTest and Maven) after setting the necessary variables. So the intent is to make both ways of execution possible.
        Not all tests fit into this paradigm. However, if a test consists mostly or entirely of Shell#exec() calls and asserts, then it is a good candidate for being written as a straight shell script and executed using this executor. Once more complex logic enters the test, you might not want to write that logic in bash anymore but in Java or Groovy, and if you need to make use of the Hadoop/HBase APIs and so forth, then it is definitely not a good candidate for a shell-script-based test.

        Johnny Zhang added a comment - edited

        @Roman, what kind of tests will be shell-based? I am asking since most of the integration tests are currently written in Groovy.

        Stephen Chu added a comment -

        I agree that populating all the Hadoop config file properties would be useful. Our HDFS tests also look for env variables like HADOOP_HOME, HADOOP_MAPRED_HOME, and HADOOP_CONF_DIR.

        I like the idea of having a manifest of the basic properties that tests can rely on.

        Outside of the basic properties, I think it's easiest if the test user is responsible for making sure all of a test's required environment properties are set. It'll be complicated if we try to make a contract that extends beyond the basic properties. For example, a WebHDFS test wants to check whether DFS_WEBHDFS_ENABLED is true, but the config files on the node the test is running on don't include this property, while the NameNode does have it set to true. In this case, I think the test user should be responsible for setting DFS_WEBHDFS_ENABLED before running the shell executor.
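
        As a small illustration of that split (the script name and values below are hypothetical), the executor could populate the basic contract and let the test user layer extra variables on top before the script runs:

        // basic contract populated by the framework, plus a test-specific variable set by the user
        def env = [HADOOP_CONF_DIR: '/etc/hadoop/conf', DFS_WEBHDFS_ENABLED: 'true']

        def pb = new ProcessBuilder('bash', 'webhdfs_smoke.sh')   // hypothetical test script
        pb.environment().putAll(env)
        assert pb.start().waitFor() == 0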

        Johnny Zhang added a comment -

        Good idea. It would also be nice to have things like JAVA_HOME detection, and mvn, git, and so on in the shell environment.


          People

          • Assignee: Roman Shaposhnik
          • Reporter: Roman Shaposhnik
          • Votes: 0
          • Watchers: 8
