OpenSource Framework - STAF and STAX Tutorial
Automated test distribution, execution and reporting with STAF/STAX
Assume you are part of a test team whose goal is to automate the distribution of tests to a large set of clients running on various platforms. You want to run an automated 'smoke test' in the following scenario:
- A nightly build process sends out email notification that a new version of the software is ready to be tested.
- The notification email triggers a 'Start Smoke Test' request sent to a dedicated machine (I will call it the "test management" machine), which coordinates all clients to be tested
- The test management machine somehow tells all clients that version x.y.z of the software is available, then tells all clients to run a test harness and report back the results
- After getting back the test results from all the clients, the test management machine sends out a test summary email containing the overall, failed, and successful test case count
You could try to implement this functionality yourself by writing, for example, a simple XML-RPC agent that runs on every client and accepts commands from the test management machine, but you would soon realize that you need something more robust, something that has already been proven in large test environments.
I will show you how to use the STAF/STAX framework from IBM, which offers all the features listed in the smoke-test scenario just described.
The idea behind STAF is to run a very simple agent on all the machines that participate in the STAF testbed. Every machine can then run services on any other machine, subject to a so-called trust level. In practice, one machine will act as what I called the 'test management' machine, and will coordinate the test runs by sending jobs to the test clients. STAX is one of the services offered on top of the low-level STAF plumbing. It greatly facilitates the distribution of jobs to the test clients and the collection and logging of test results. STAX jobs are XML files spiced up with special <script> tags that contain Python code (actually Jython, but there are no differences for the purpose of this tutorial). This in itself was for us a major reason for choosing STAF over other solutions.
Here is the test environment that I will use in my example:
- 3 clients that will run the test harness: one called win1 running some flavor of Windows, one called linux1 running some flavor of Linux, and one called sol1 running some flavor of Solaris
- 1 test management machine, called mgmt1
- 1 desktop PC, called desktop1
What follows is a step-by-step guide to configuring STAF and STAX on the machines in the example testbed:
Step 1: Install and configure STAF on the test clients
Install STAF on all 5 machines (I refer the readers to the STAF User Guide for details on installing STAF). Here is an example of a STAF configuration file (on Unix, it's usually in /usr/local/staf/bin/STAF.cfg) for one of the 3 client machines:
# Enable TCP/IP connections
interface tcpip
# Turn on tracing of internal errors and deprecated options
trace on error deprecated
serviceloader library STAFDSLS
SET CONNECTTIMEOUT 15000
SET MAXQUEUESIZE 10000
TRUST LEVEL 5 MACHINE mgmt1
Note that the 3 client machines need to increase the trust level (default is 3) for the test management machine, so that the latter can initiate jobs on the clients.
Step 2: Install and configure STAX on the management host
Install the STAX service on the test management machine. In STAF parlance, this machine is called the STAX Service machine (readers are referred to the STAX User's Guide for details on STAX). There are a few things to remember in terms of requirements for this machine:
- Java 1.2 or later needs to be installed
- The following 2 environment variables need to be set (for example, in .bash_profile):
export CLASSPATH=$CLASSPATH:/usr/local/staf/lib/JSTAF.jar
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/staf/lib
- The STAF.cfg configuration file needs to have the STAX service added to it (note the increase to trust level 4 for the desktop1 machine, which will act as the monitoring machine and needs special rights to connect to mgmt1):
# Enable TCP/IP connections
interface tcpip
# Turn on tracing of internal errors and deprecated options
trace on error deprecated
serviceloader library STAFDSLS
SERVICE STAX LIBRARY JSTAF EXECUTE /usr/local/staf/services/STAX/STAX.jar
SET MAXQUEUESIZE 10000
TRUST LEVEL 4 MACHINE desktop1
Step 3: Start the STAF agent
Run STAFProc on all 5 machines. STAFProc is the STAF agent that listens on a specific port (default is 6500) for STAF-specific commands.
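Since every later step depends on STAFProc being reachable, it is worth a quick connectivity check at this point. Here is a minimal sketch in plain Python (the helper name and host list are mine, not part of STAF; it only verifies that something accepts TCP connections on the STAF port, assuming the default 6500):

```python
import socket

def is_staf_listening(host, port=6500, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Check every machine in the example testbed
for host in ['win1', 'linux1', 'sol1', 'mgmt1', 'desktop1']:
    print(host, 'up' if is_staf_listening(host) else 'DOWN')
```

This does not prove STAFProc (as opposed to some other process) owns the port, but it catches the common "forgot to start the agent" and firewall problems early.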
Step 4: Create STAX job files
Create the STAX XML job files that will be interpreted by the STAX service on mgmt1. Here is an example of a job file, called client_test_harness.xml, that will run a test harness on our 3 clients:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE stax SYSTEM "C:\QA\STAF\stax.dtd">
<stax>
<!--
The following <script> element is overridden if the global_vars.py SCRIPTFILE is used
A SCRIPTFILE can be specified either in the STAX Monitor, or directly when submitting a job to STAX
-->
<script>
VERSION = '1.0.1'
HARNESS_TIMER_DURATION = '60m'
clients_os = { 'win1':'win','sol1':'unix','linux1':'unix'}
harness_path = {'unix': '/qa/harness','win' : 'C:/qa/harness'}
tests_unix = [[ 'unix_perms', 'brv_unix_perms.py' ],[ 'long_names', 'brv_long_names.py' ]]
tests_win = [[ 'unicode_names', 'brv_unicode_names.py' ]]
</script>
<defaultcall function="Main"/>
<function name="Main">
<sequence>
<import machine="'mgmt1'" file="'/QA/STAF/stax_jobs/log_result.xml'"/>
<call function="'ClientTestHarness'">
[clients_os, harness_path, tests_unix, tests_win]
</call>
</sequence>
</function>
<function name="ClientTestHarness">
<function-list-args>
<function-required-arg name='clients_os'/>
<function-required-arg name='harness_path'/>
<function-required-arg name='tests_unix'/>
<function-required-arg name='tests_win'/>
<function-other-args name='args'/>
</function-list-args>
<paralleliterate var="machine" in="clients_os.keys()">
<sequence>
<script>
os_type = clients_os[machine]
tests = {}
if os_type == 'unix':
tests = tests_unix
if os_type == 'win':
tests = tests_win
</script>
<iterate var="test" in="tests">
<sequence>
<script>
test_name = machine + "_" + test[0]
</script>
<testcase name="test_name">
<sequence>
<script>
cmdline = harness_path[os_type] + "/" + test[1]
</script>
<timer duration = "HARNESS_TIMER_DURATION">
<process>
<location>machine</location>
<command>'python'</command>
<parms>cmdline</parms>
<stderr mode="'stdout'" />
<returnstdout />
</process>
</timer>
<call function="'LogResult'">machine</call>
</sequence>
</testcase>
</sequence>
</iterate>
</sequence>
</paralleliterate>
</function>
</stax>
The syntax may seem overwhelming at first, but it turns out to be quite manageable once you get the hang of it. Here are the salient points in the above file:
- The first <script> element sets a number of Python variables that are then used in the body of the XML document; think of them as global constants
- One function is invoked via the <defaultcall> element; this function is called Main and is defined in the first <function> element
- The Main function imports another XML file (log_result.xml) so that this job can call a function (LogResult) defined in the imported file
- The Main function then calls a function named ClientTestHarness, passing it as arguments the four Python variables defined at the top
- Almost all the action in this job happens in the ClientTestHarness function, which starts by declaring its required arguments, then runs a series of tests in parallel on each of our 3 client machines; the parallelism is achieved by means of the <paralleliterate> element
- The <script> element that follows is simple Python code that retrieves the test suite to be run from the global dictionaries, via the machine name
- On each machine, the tests in the test suite are executed sequentially, via the <iterate> element
- A <testcase> element is defined for each test, so that we can easily retrieve the test statistics at the end of the run, via the LogResult function
- For each test, the ClientTestHarness function executes a <process> element, which runs a command (for example brv_unix_perms.py) on the target machine; the <process> element is wrapped in a <timer> element, which marks the test as failed if the specified time limit is reached
- The <process> element also specifies that the command being executed redirect stderr to stdout and return stdout
- Finally, the ClientTestHarness function calls LogResult, passing it the machine name as the only argument
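To see past the XML syntax, the per-machine test selection and command building done in the job's <script> elements can be restated as plain Python (the `build_commands` helper is mine, introduced only for illustration; the data comes straight from the job's top-level <script> element):

```python
# Variables from the job's top-level <script> element
clients_os = {'win1': 'win', 'sol1': 'unix', 'linux1': 'unix'}
harness_path = {'unix': '/qa/harness', 'win': 'C:/qa/harness'}
tests_unix = [['unix_perms', 'brv_unix_perms.py'],
              ['long_names', 'brv_long_names.py']]
tests_win = [['unicode_names', 'brv_unicode_names.py']]

def build_commands(machine):
    """Mirror the job logic: pick the suite for the machine's OS and
    build one harness-script path per test, keyed by the same test
    name the <testcase> element uses."""
    os_type = clients_os[machine]
    tests = tests_unix if os_type == 'unix' else tests_win
    return {machine + '_' + name: harness_path[os_type] + '/' + script
            for name, script in tests}

print(build_commands('sol1'))
# {'sol1_unix_perms': '/qa/harness/brv_unix_perms.py',
#  'sol1_long_names': '/qa/harness/brv_long_names.py'}
```

Each value in the returned dictionary is what the job passes to `python` via the <parms> element.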
The LogResult function is defined in the log_result.xml file. Its tasks are to:
- interpret the return code (a STAF-specific variable called RC) and the output (a STAX-specific variable called STAXResult) for each test case
- set the result of the test run to PASS or FAIL
- log it accordingly
Here is the log_result.xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE stax SYSTEM "C:\QA\STAF\stax.dtd">
<stax>
<function name="LogResult">
<function-list-args>
<function-required-arg name='machine'/>
<function-other-args name='args'/>
</function-list-args>
<if expr="RC != 0">
<sequence>
<tcstatus result="'fail'">'Failed with RC=%s' % RC</tcstatus>
<log level="'error'">'Process failed with RC=%s, Result=%s' % (RC, STAFResult)</log>
</sequence>
<elseif expr="STAXResult != None">
<iterate var="file_info" in="STAXResult" indexvar="i">
<if expr="file_info[0] == 0">
<sequence>
<script>
import re
fail = re.search('FAIL', file_info[1])
log_msg = 'HOST:%s\n\n%s' % (machine,file_info[1])
</script>
<if expr = "fail">
<sequence>
<tcstatus result="'fail'">'Test output contains FAIL'</tcstatus>
<log level="'error'">log_msg</log>
</sequence>
<else>
<sequence>
<tcstatus result="'pass'"></tcstatus>
<log level="'info'">log_msg</log>
</sequence>
</else>
</if>
</sequence>
<else>
<log level="'error'">'Retrieval of file %s contents failed with RC=%s' % (i, file_info[0])</log>
</else>
</if>
</iterate>
</elseif>
<else>
<log level="'info'">'STAXResult is None'</log>
</else>
</if>
</function>
</stax>
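Stripped of its XML wrapping, the decision logic of LogResult fits in a few lines of Python. A sketch (the `classify` name is mine; STAXResult is modeled as a list of [RC, file-contents] pairs, which is how the <returnstdout> data arrives):

```python
import re

def classify(machine, rc, stax_result):
    """Mirror LogResult: return a (status, message) pair, where
    status is 'pass', 'fail', or None for the STAXResult-is-None case
    (which LogResult only logs, without setting a test status)."""
    if rc != 0:                      # the process itself failed
        return 'fail', 'Failed with RC=%s' % rc
    if stax_result is None:
        return None, 'STAXResult is None'
    for file_rc, contents in stax_result:
        if file_rc != 0:             # stdout could not be retrieved
            return 'fail', 'Retrieval of file contents failed with RC=%s' % file_rc
        if re.search('FAIL', contents):
            return 'fail', 'HOST:%s\n\n%s' % (machine, contents)
    return 'pass', 'HOST:%s' % machine

print(classify('sol1', 0, [[0, '2 tests run, all PASS']])[0])  # pass
print(classify('sol1', 0, [[0, 'unix_perms: FAIL']])[0])       # fail
```

Note that this scheme treats any occurrence of the string FAIL in the test output as a failure, so the harness scripts must be disciplined about what they print.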
Step 5: Run STAX jobs on the test clients
From the desktop1 machine, which in STAX is called the monitoring machine, send a carefully crafted STAF command to the test management machine, telling it to run the client_test_harness.xml job:
STAF mgmt1 STAX EXECUTE FILE /QA/STAF/stax_jobs/client_test_harness.xml MACHINE mgmt1 SCRIPTFILE /QA/STAF/stax_jobs/global_vars.py JOBNAME "CLIENT_TEST_HARNESS" SCRIPT "VERSION='1.0.2'" CLEARLOGS Enabled
The above incantation runs a STAF command by specifying a service (STAX) and a request (EXECUTE), then passing various arguments to the request, the most common ones being a FILE (the path to the job XML file), a MACHINE to run the job file on (mgmt1), and a JOBNAME (which can be any string value).
Two other arguments, entirely optional, are Python-specific:
- SCRIPTFILE -- points to a Python file whose code will be interpreted after the code in the top-level <script> element of the job file; in my example, the global_vars.py file contains definitions of Python variables that will override the variables defined in the job's <script> element
- SCRIPT -- can contain any inline Python code, which will be interpreted after any code in the job's top-level <script> element, and after any code in the SCRIPTFILE; in my example, the VERSION variable is set to 1.0.2 on the command line via the SCRIPT argument, because it is retrieved from the nightly build email notification, and thus is not known in advance. The value 1.0.2 will override whatever values are given in the <script> element and in global_vars.py
To summarize, a SCRIPTFILE file is commonly used as a "static" repository for Python variables that are used across several job files, whereas the SCRIPT inline code is used to pass "dynamic" values for Python variables on the command line.
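The layering is easy to reason about if you think of each source as code executed into one shared namespace, where the last assignment wins. A small simulation (the three snippet strings stand in for the job's top-level <script> element, global_vars.py, and the SCRIPT argument; this illustrates the ordering, not STAX's actual implementation):

```python
# Each layer is evaluated in order into the same namespace,
# so later assignments override earlier ones.
job_script    = "VERSION = '1.0.1'"   # top-level <script> in the job file
scriptfile    = "VERSION = '1.0.0'"   # global_vars.py, passed via SCRIPTFILE
inline_script = "VERSION = '1.0.2'"   # SCRIPT argument on the command line

namespace = {}
for layer in (job_script, scriptfile, inline_script):
    exec(layer, namespace)

print(namespace['VERSION'])  # 1.0.2 -- the command-line SCRIPT value wins
```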
The above STAF command, if successful, returns an integer that represents the job ID. Based on this ID, we can query the log service on the STAX machine (mgmt1) by running this command:
STAF mgmt1 LOG QUERY MACHINE mgmt1 LOGNAME STAX_Job_jobID
STAX also offers a GUI monitoring tool called the STAX Job Monitor that is usually run on the monitoring machine (desktop1 in our example). The tool is a Java application that is started via the command line (java -jar STAXMon.jar) in the directory which contains the STAX service jar files. The Job Monitor displays the processes that are run within the job, as well as the test case information (test name, pass/fail status, duration) for each test in the test suite.
Conclusion
I will now show how all these steps fit together to give us the capability to run the automated smoke-test scenario described at the beginning of this tutorial.
- A build completion message is sent to several distribution lists with a subject that contains the new version of the software.
- The build message is forwarded via a mail alias to an account on the test management machine.
- A .procmailrc file on the test management machine triggers a Python script that runs the "STAF mgmt1 STAX EXECUTE ..." command shown earlier. The script then sits in a loop and periodically queries the log file (via the LOG QUERY command) for the new job identified by jobID. When it sees a line containing "Stop|JobID: jobID", the script sends a message with the job log in its body and the test counts (overall, passed, and failed) in its subject.
- The PARALLELITERATE and ITERATE constructs available in STAX let us combine parallel and sequential operations in the test run: we run the test harness in parallel on all clients, then on each client we run the individual tests in the harness sequentially. Another very useful STAX construct is TIMER, which makes it easy to time out failed tests so that the whole test run is not held up.
- Since all the individual tests are written using our framework, all the test results are also saved in the Firebird database and can be easily inspected via a Web interface.
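The polling script in the flow above boils down to two pieces: detecting the stop marker in the job log and turning the log into a summary subject line. A self-contained sketch (the actual LOG QUERY call and mail sending are left out; the per-test PASS/FAIL markers in the sample log are assumptions about the harness's output format):

```python
import re

def job_finished(log_text, job_id):
    """True once the STAX job log contains the stop marker for this job."""
    return 'Stop|JobID: %s' % job_id in log_text

def summary_subject(log_text):
    """Build the notification subject from pass/fail markers in the log."""
    passed = len(re.findall(r'\bPASS\b', log_text))
    failed = len(re.findall(r'\bFAIL\b', log_text))
    return 'Smoke test: %d total, %d passed, %d failed' % (
        passed + failed, passed, failed)

log = ('sol1_unix_perms PASS\n'
       'sol1_long_names PASS\n'
       'win1_unicode_names FAIL\n'
       'Stop|JobID: 42\n')
print(job_finished(log, 42))    # True
print(summary_subject(log))     # Smoke test: 3 total, 2 passed, 1 failed
```

In the real script, `log_text` would be refreshed on each loop iteration from the output of the LOG QUERY command, with a sleep between polls.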
Two more things are worth mentioning:
- Support for STAF/STAX is top-notch and comes via the staf-users mailing list from the IBM developers working on this project. I had two questions answered within an hour of posting.
- STAF/STAX is used as the test distribution platform for the Linux Test Project. The January 2005 issue of "Linux Journal" has an article on the Linux Test Project that mentions STAF/STAX.