HTTPtest Tutorial

HTTPtest.pl is a powerful tool for regression testing static and dynamic web content. Unfortunately, like most power tools, it requires some training to use most effectively. This tutorial attempts to guide you to the effective use of HTTPtest.pl.

Overview

This tutorial covers using HTTPtest at two different levels. At one level, it covers the mechanics of building individual tests and how to detect particular kinds of errors. At another level, it describes an approach to testing with HTTPtest that has proven effective so far.

Why use HTTPtest?

I assume that anyone who gets this far probably won't be asking this question. However, I'm going to answer it anyway. HTTPtest helps you regression test your web-based services. To really explain that statement requires a bit of history.

HTTPtest began life as a regression test tool for the lowest level of a three-tier HTTP-based system. The lowest level of this system provided data in a standard format and would only be called by other servers. This means that the data would not have any special formatting designed to make it acceptable to humans. The HTTP protocol was chosen to simplify debugging. Only a browser would be needed to test the interfaces.

Even though the system could be tested using a browser, testing dozens of interfaces with a browser is boring. I created HTTPtest to do the grunt work part of the testing. A set of tests would be run on the system to make certain that basic functionality of the interfaces remained the same. In particular, the tests should be able to answer a simple set of questions about all of the interfaces in the system.

  • Does each interface return successfully?
  • Does each interface return with data?
  • Is the data returned by the interface in the right form?
  • Does each interface deal with missing input correctly?
  • Does each interface deal with invalid input correctly?

From past experience, I know that programmers (including myself) rarely test all of these issues as much as they should. More importantly, even if they are tested appropriately at the time the interface is developed, the tests are rarely repeated for all interfaces at each code change. This repetition of tests is called regression testing. The goal of regression testing is to verify that the code being tested is working the same as it did the last time the tests were run.

With HTTPtest, it is possible to generate a suite of hundreds of tests, over time, that verify many aspects of the system under test. Testing in this fashion reduces the chance that an unintended change remains undetected.

What is regression testing?

In Software Testing Techniques, Boris Beizer defines regression testing as:

Any repetition of tests (usually after software or data change) intended to show that the software's behavior is unchanged except insofar as required by the change to the software or data.

Several important points can be extracted from this definition. The first is the repetition of tests. The second is the test for unchanged behavior, which implies that the results of one test run should be compared to the results of the previous run. Two further points are implicit rather than stated. First, regression tests may not be testing for correct behavior, just unchanged behavior. Second, most good regression tests give a simple answer: changed or not.

This definition is at least mildly interesting, but the implementation makes all the difference. Obviously, if we plan to repeat tests many times, automating the tests is going to be critical. If the regression test is not easy to execute, it will never be run. Likewise, if analysis of the output of the test requires extensive time and effort, the test will not be run.

The ultimate regression test would require one action to run, test every behavior in the target application, and return a yes or no answer immediately. A test like this is likely to be run almost every time the programmer touches the software. Why not? It doesn't take any effort, and it provides a useful service to the programmer.

The main downside of most regression test systems is the effort of setting up the tests. The more extensive the tests, the more time is spent building and evaluating tests. Unfortunately, another downside has to do with test complexity. If the tests are too complex, they become subject to bugs, just like the software they are intended to test.

HTTPtest attempts to provide a regression test system for software that communicates using the HTTP protocol.

What to test?

One short answer is everything. Another short answer is whatever you can. In the Extreme Programming methodology, they say you should test anything that can break. The real answer is more interesting. In general, the more tests you have, the better. The best tests are those that fail only when there is something wrong with the system under test; tests that generate false positives by failing even when the system is working correctly should be avoided. In addition, any time a bug is found that was not detected by the battery, a new test should be added to detect that bug. If you follow this procedure every time a bug is found and fixed, you will never again have a user point out that a bug has come back after you fixed it.

All of the error conditions that the system is programmed to handle should be tested. In general, these seem to be the easiest tests to write. Each of the major modes of operation should be tested. If an interface can be called two different ways, test each of them. An example will make this clearer.

Example

Assume we have a page to test with the URL http://www.anomaly.org/cgi-bin/testpage.cgi. This page requires a parameter of name that must match a name in a database. The page also requires one, and only one, of two parameters: day or date. If we wanted a complete test of the system, we would need to generate tests for the following.

  • Call with no parameters
  • Call with name but no other parameter
  • Call with name having an invalid value and with day
  • Call with name having an invalid value and with date
  • Call with name having a valid value and with both day and date
  • Call with both day and date but no name
  • Call with day but no name
  • Call with date but no name
  • Call with a valid name and a day
  • Call with a valid name and a date

All but the last two of these are error conditions, so they should be relatively easy to test. We will return to this example in later sections. Although the example is contrived, it serves as a good beginning. To make this example a little more concrete, the complete source of this example is available, as is the complete battery file described in this tutorial.

Although we do not consider them here, another variable we could change for this test is the HTTP request method. In some cases, the code might act differently for a GET request than it would for a POST request. If this possibility exists, the tests should be expanded to cover the different request methods.

For complicated interfaces, you might want to lay out a matrix for each URL with columns for each expected parameter. Each row would describe a different combination of parameters. This more formal method of specifying the possibilities reduces the chance that you will miss a test case.
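Such a matrix for the testpage.cgi example above might look like the following sketch. (The expected-result descriptions are illustrative; the actual error messages come from testpage.cgi itself. Y means the parameter is supplied with a valid value, X means it is supplied with an invalid value, and - means it is omitted.)

  name | day | date | Expected result
  -----+-----+------+-------------------------
   -   |  -  |  -   | no-parameters error
   Y   |  -  |  -   | missing day/date error
   X   |  Y  |  -   | invalid name error
   X   |  -  |  Y   | invalid name error
   Y   |  Y  |  Y   | both day and date error
   -   |  Y  |  Y   | missing name error
   -   |  Y  |  -   | missing name error
   -   |  -  |  Y   | missing name error
   Y   |  Y  |  -   | success
   Y   |  -  |  Y   | success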

What is the structure of a battery file?

In general, a battery file contains a series of tests. The battery file may also include program options, defined parameters, included files, and code definitions. The items in the file are normally laid out in the following basic structure.

  <?xml version="1.0"?>
  <battery label="My tests">
  <!-- options -->
  
  <!-- parameters -->

  <!-- code libraries -->

  <!-- tests -->
  
  </battery>

The first section contains options that control the way HTTPtest executes: whether logging is turned on, where logging output goes, how verbose to make the output, and so on. This section may not exist in all battery files. The options are declared using the option element. Most options can be overridden by command-line switches.

The second section contains parameters that are used throughout the code to simplify changing individual parameters in the tests. These parameters are defined using the param element. This element declares the name of a parameter and specifies its default value. This value may be changed on the command line. This section is also optional, but highly recommended.

The third section defines code libraries or specifies Perl code to be executed before the tests begin. Both of these functions are provided by the codelib element. Like the other sections, this one is optional.

The last section contains the actual tests. Each test defines a request/response pair that is the subject of the test. This functionality is supplied by the test element and its children. Although this section is also optional, a battery file does nothing without it.

A very simple example battery file with items in each of these sections is shown below.

  <?xml version="1.0"?>
  <battery label="My tests">
  <!-- options -->
    <option name="logging" value="yes"/>
    <option name="logfile" value="./log/test.log"/>

  <!-- parameters -->
    <param name="server" value="localhost"/>
    <param name="user"   value="Fred"/>
    <param name="today"  value=""/>

  <!-- code libraries -->
    <codelib id="get_today"><![CDATA[
      sub get_today
      {
        my ($d, $m, $y) = (localtime(time))[3, 4, 5];

        sprintf( '%02d/%02d/%04d', $m+1, $d, $y+1900 ); # returns US-format mm/dd/yyyy
      }
  ]]></codelib>
  
    <codelib label="Set Today">$Params{today} = get_today();</codelib>

  <!-- tests -->
    <test label="Main Page">
      <request method="GET" url="http://{{server}}/index.html"/>
      <response>
        <status code="200"/>
        <content match="substr" label="Title check"><![CDATA[
    <title>Main Page</title>
    ]]></content>
        <content match="substr" label="Date Check"><![CDATA[
    <span class="date">{{today}}</span>
    ]]></content>
      </response>
    </test>
  </battery>

In this example, we have specified options that control logging in the first section. The second section contains parameter definitions to be used in tests in the last section. We define a Perl subroutine in a codelib element in the third section and execute that subroutine in another codelib. Finally, we define a test in the last section using the test element.

Creating Battery Files

To create a battery file, start an XML file with the root element of battery. You will add test elements in this container as needed. The initial (empty) file looks like this:

  <?xml version="1.0"?>
  <battery label="testpage tests">
  <!-- options -->

  <!-- parameters -->

  <!-- code libraries -->

  <!-- tests -->

  </battery>

The label can be any string that will help you identify this battery of tests. The label attribute is used on several elements, and well-chosen labels make a battery of tests much easier to understand. Choosing good labels also makes the tester's job much easier when a test fails. For example, a failure in the test User Login in the battery Shopping Cart tests is much more informative than a failure in Test 13 in the battery Tests.

By convention, I normally name each battery file with the extension .tests. So let's put the text above in the file testpage.tests. Once we have actual tests in the file, we can perform the tests with the command:

perl HTTPtest.pl testpage.tests

If you have any particular options you want to set, they would be entered at the top of the battery element. See the HTTPtest POD documentation for the available options.
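For example, reusing the logging options shown in the earlier example (the log file path here is just an illustration), the top of the battery might look like this:

  <?xml version="1.0"?>
  <battery label="testpage tests">
  <!-- options -->
    <option name="logging" value="yes"/>
    <option name="logfile" value="./log/testpage.log"/>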

Testing Static Content

If the page you are testing is static, building a test for it is very easy. We want to test the response code for the request to verify that the HTTP request completed successfully. We also want to test the content to verify that it matches what we expect. As an example, let's test the page http://www.anomaly.org/wade/HTTPtest/testpage.html. To begin building the test, create a test template in our battery file that looks like this:

  <test label="">
    <request method="" url=""/>
    <response>
    </response>
  </test>

First, we should label the test with a meaningful name. This name is printed when the test is run (unless we are running in quiet mode). A good label for this test is Usage Page. Second, we fill in the request element. Since this request has no parameters, it uses a method of GET. The url attribute holds the URL we are requesting; in this case, that is http://www.anomaly.org/wade/HTTPtest/testpage.html. The resulting partial test looks like this.

  <test label="Usage Page">
    <request method="GET" url="http://www.anomaly.org/wade/HTTPtest/testpage.html"/>
    <response>
    </response>
  </test>

Next, we need to decide how to validate the HTTP response. The response element contains a series of subtests that are used to validate the HTTP response. The simplest of these is status. Almost every successful HTTP response has a status code of 200, but because there are circumstances in which another status code might be appropriate, HTTPtest does not assume one. For this reason, almost every response should begin by testing for a status code of 200 (or whichever code is appropriate). If the page did not come back successfully, there is no need to do any further testing. Adding the status subtest yields the following:

  <test label="Usage Page">
    <request method="GET" url="http://www.anomaly.org/wade/HTTPtest/testpage.html"/>
    <response>
      <status code="200"/>
    </response>
  </test>

If all you care about is that the page comes back, you are now finished. However, in most cases, you want to verify that the returned page is the one you expect. The content subtest tests the content of the HTTP response. For a static page, you can call up the page in a browser and cut and paste the response directly into your content element. In order for this to be well-formed XML, you may need to surround this text with a CDATA section, as shown below. The completed test looks like this:

  <test label="Usage Page">
    <request method="GET" url="http://www.anomaly.org/wade/HTTPtest/testpage.html"/>
    <response>
      <status code="200"/>
      <content match="exact"><![CDATA[<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
  <title>Usage of testpage.cgi</title>
</head>
<body bgcolor="ivory">
<h1>Overview</h1>

<p>The <code>testpage.cgi</code> page is a small CGI script designed as an
interface to test with HTTPtest.pl. It serves no other function.</p>

<h1>Parameters</h1>
<p>The <code>testpage.cgi</code> script supports the following parameters:</p>

<dl>
  <dt><b>name</b></dt>
  <dd>Name of the user. Must be a valid user name.</dd>

  <dt><b>day</b></dt>
  <dd>Number of days in the future.</dd>

  <dt><b>date</b></dt>
  <dd>Date for difference calculation. Must be of the form <em>mm/dd/yyyy.</em>
  </dd>
</dl>

<p>The <em>name</em> parameter is required. One, and only one, of <em>day</em>
and <em>date</em> must be specified.</p>

</body>
</html>]]></content>
    </response>
  </test>

This test validates the output of the static page testpage.html. The match attribute of the content element tells HTTPtest.pl that the contained text must match exactly. Other possibilities include regexp, which uses the supplied text as a Perl regular expression, and substr, which succeeds if the text is found anywhere within the HTTP response.
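To make the difference concrete, here are two hedged variations on the title line of the usage page. The first would succeed if the title appears anywhere in the response; the second uses a regular expression to accept any page name in the title. (Both are sketches, not subtests you would necessarily want in this battery.)

      <content match="substr"><![CDATA[<title>Usage of testpage.cgi</title>]]></content>
      <content match="regexp"><![CDATA[<title>Usage of [\w.]+</title>]]></content>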

Testing a Static Response from a Dynamic Page

If the output of the request you are testing is static, building a test is similar to the case of an actual static page. As an example, let's build a test for the first call in our example above. In our battery file, create a test template that looks like this:

  <test label="">
    <request method="" url=""/>
    <response>
    </response>
  </test>

As always, we should label the test with a meaningful name. A good label for this test is Missing Parameters. We can also fill in the request element. Since this request has no parameters, it uses a method of GET. The url attribute holds the URL we are requesting; in this case, that is http://www.anomaly.org/cgi-bin/testpage.cgi. The resulting partial test looks like this.

  <test label="Missing Parameters">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi"/>
    <response>
    </response>
  </test>

Like the static page test above, we need to verify the status code for a successful response. We use the status element for this exactly as before. Adding the status subtest yields the following:

  <test label="Missing Parameters">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi"/>
    <response>
      <status code="200"/>
    </response>
  </test>

Once again, we use the content element to verify that we received the page content we expect. The completed test looks like this:

  <test label="Missing Parameters">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi"/>
    <response>
      <status code="200"/>
      <content match="exact"><![CDATA[<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Warning: No parameters supplied.</H1>
</BODY></HTML>]]></content>
    </response>
  </test>

This test validates the output of testpage.cgi for the case where no parameters are supplied. Again, we supply a value of exact for the match attribute of the content element.

This approach can be applied to each of the error responses described in the example above.
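For example, a test for the invalid-name case might look like the following sketch. Because this tutorial does not show the exact warning text testpage.cgi produces for an invalid name, the content subtest below only checks for the warning heading prefix with a substr match; with the real output in hand, you could paste it in and use an exact match instead.

  <test label="Invalid Name">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi?name=NoSuchUser&amp;day=2"/>
    <response>
      <status code="200"/>
      <content match="substr"><![CDATA[<H1>Warning: ]]></content>
    </response>
  </test>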

Testing Dynamic Content

When a page changes based on some stimulus such as parameters or internal data, the static approach described in the last section cannot be used. Instead, we must apply a more powerful solution to the problem. As an example, let's take the last response described in the example above. Once again, we begin with an empty test template.

  <test label="">
    <request method="" url=""/>
    <response>
    </response>
  </test>

Next, we label the test Success: name and date and supply the request method (GET) and url. In this case, the URL needs to be a little different. Because the battery file is built in XML, there are limits on the legal characters that may be used. Unfortunately, the & character used to separate parameters in a URL cannot be used directly in an XML file. Instead, it must be encoded as &amp;. Of course, we want to validate the response code with the status subtest. The resulting test now looks like this.

  <test label="Success: name and date">
    <request method="GET"
    url="http://www.anomaly.org/cgi-bin/testpage.cgi?name=Fred&amp;date=08/21/2001"/>
    <response>
      <status code="200"/>
    </response>
  </test>

So far, the test has been much like the previous example. But this is where things begin to change. This page is not static. Part of the content of the page depends on the input parameters. We still want to test the content, but if we use the exact match method as before, we would need to change the test every day and every time we changed the test parameters. So we must use a different match method. In this case, I would choose a value of regexp. This matching method allows for the full range of Perl regular expression features. See the Perl documentation for a full discussion of regular expressions.

  <test label="Success: name and date">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi?name=Fred&amp;date=08/21/2001"/>
    <response>
      <status code="200"/>
      <content match="regexp"><![CDATA[<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Test Page</H1>
<P>Hello, Fred</P>
<P>08/21/2001 was \d+ days ago</P>
</BODY></HTML>]]></content>
    </response>
  </test>

First of all, since this is not an exact match, we can ignore parts of the page we don't care about; in this case, we leave off the DOCTYPE declaration to illustrate this point. Next, we match most of the page exactly, with the exception of the number of days in the answer, which is matched with the regular expression \d+ (one or more digits). This test succeeds whenever the output is of the correct form, no matter how many days in the future we go. (As long as Perl's time operator continues to work, that is.)

This method can be applied to most dynamic pages to match the variable data on that page. It does not verify that the data is correct. This is only used to verify the form of the output. Other measures are needed to actually verify the data itself. (See below.)

Factoring out parameters

The test we wrote above is pretty good, but there is a distinct maintenance problem: the name and date occur in two different places, once in the request and once in the response. If you decide to test a different day, you would probably change the request. If you made this change six months or a year after you wrote the test file, you might forget to change the other spot, which is likely to cost you wasted debugging time.

In most programming languages, this problem was solved many years ago through the use of constants and/or variables. In HTTPtest, the solution is the param element. This element gives the name of a replaceable parameter and its default value. This value can be overridden on the command line, when the test is run. These parameters can be referenced anywhere you can put a string in the test. The parameter reference should be surrounded with double curly braces, like this {{name}}.

In this case, we probably want to make parameters for both the name and date parameter values. We define the parameters at the beginning of the battery as follows:

  <?xml version="1.0"?>
  <battery label="testpage tests">
    <!-- parameters -->
    <param name="name"   value="Fred"/>
    <param name="date"   value="08/21/2001"/>
     .
     .
     .
  
  </battery>

Now that we have these parameters defined, we can change the test to the following, more maintainable form.

  <test label="Success: name and date">
    <request method="GET" url="http://www.anomaly.org/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="regexp"><![CDATA[<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Test Page</H1>
<P>Hello, {{name}}</P>
<P>{{date}} was \d+ days ago</P>
</BODY></HTML>]]></content>
    </response>
  </test>

Although the time savings of this approach are obvious if we need the same parameter for dozens of tests, one use I've made of parameters stands out above the others. Define a param called server and give it a reasonable default, like the name of your production server. Then use this parameter in every URL you define. You now have the ability to point your entire battery at another server, such as your test server, with a single command-line option.

Let's see how that would work. First, we add the parameter.

  <?xml version="1.0"?>
  <battery label="testpage tests">
    <param name="server" value="www.anomaly.org"/>
    <param name="name"   value="Fred"/>
    <param name="date"   value="08/21/2001"/>
     .
     .
     .
  
  </battery>

Next, we modify the test.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="regexp"><![CDATA[<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Test Page</H1>
<P>Hello, {{name}}</P>
<P>{{date}} was \d+ days ago</P>
</BODY></HTML>]]></content>
    </response>
  </test>

Finally, we can run the battery against the production server with the following command line.

perl HTTPtest.pl testpage.tests

But if you want to run the tests against a local copy of the script, running on your local test server, you can use the following command line.

perl HTTPtest.pl server=localhost testpage.tests

Suddenly, your test file has become much more flexible and powerful. If you want to run the test on the test server test.somedomain.com and you want to change the test date to 01/01/2000, you can use the following command line.

perl HTTPtest.pl server=test.somedomain.com date=01/01/2000 testpage.tests

Testing Large Pages

The last two sections showed how to make powerful tests for dynamic content. But, if you are looking at a page you need to test that is 100 KB in size with dozens of dynamic elements, you're probably thinking "This guy is out of his mind if he thinks I'm going to write a regular expression to match this." That is definitely not the best way to solve the problem. Instead, HTTPtest allows you to use multiple subtest elements in the response element of a single test. You can define multiple regular expressions to match different parts of the response. You can also define content elements with a match type of substr to match static parts of the response.

As an example, let's take the test we just completed and convert it into a form that would work on a much larger page. The first subtest you might want to apply would be to test the title of the page (if it is HTML). There's no sense in testing the response if it does not appear to be the page you expect to get back.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="substr"><![CDATA[<TITLE>Test Page</TITLE>]]></content>
    </response>
  </test>

Next, let's test for the heading at the top of the page.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="substr"><![CDATA[<TITLE>Test Page</TITLE>]]></content>
      <content match="substr"><![CDATA[
<H1>Test Page</H1>
]]></content>
    </response>
  </test>

Now let's test for the easy part of the dynamic content, the greeting.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="substr"><![CDATA[<TITLE>Test Page</TITLE>]]></content>
      <content match="substr"><![CDATA[
<H1>Test Page</H1>
]]></content>
      <content match="substr"><![CDATA[
<P>Hello, {{name}}</P>
]]></content>
    </response>
  </test>

Last, we'll test for the rest of the dynamic content.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="substr"><![CDATA[<TITLE>Test Page</TITLE>]]></content>
      <content match="substr"><![CDATA[
<H1>Test Page</H1>
]]></content>
      <content match="substr"><![CDATA[
<P>Hello, {{name}}</P>
]]></content>
      <content match="regexp"><![CDATA[
<P>{{date}} was \d+ days ago</P>
]]></content>
    </response>
  </test>

The resulting test would actually match many pages that are not quite the same, since nothing verifies that the pieces we test for appear in the right order. If we wanted to, we could add more subtests to enforce that requirement, but in most cases I've not found it to be necessary. However, the current test is still not quite complete: if one of these subtests fails, HTTPtest cannot generate a useful report, because the individual content subtests are not labeled. A better version would be:

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="substr" label="Title"><![CDATA[<TITLE>Test Page</TITLE>]]></content>
      <content match="substr" label="Heading"><![CDATA[
<H1>Test Page</H1>
]]></content>
      <content match="substr" label="Greeting"><![CDATA[
<P>Hello, {{name}}</P>
]]></content>
      <content match="regexp" label="Answer"><![CDATA[
<P>{{date}} was \d+ days ago</P>
]]></content>
    </response>
  </test>

As you can see, adding more subtests can make the test much larger and more powerful. But, by testing critical portions of your response and ignoring unimportant content, you can make reasonable tests for arbitrarily complex pages.

Combining Tests Logically

All of the subtests in the response must pass in order for the test to pass; in other words, the result of the test is the logical and of the results of all of the subtests. This is a reasonable way to combine the subtests, but there are others. For instance, what if the page can have two different greetings, each of them valid? You could make a regular expression for that, but it would not be as clear as two separate substr tests. A more difficult problem is how to test that some piece of text does not appear.

To solve problems like these, HTTPtest supports a set of logical operations. The or element succeeds if any one of the subtests it contains succeeds; all subtests after the first success are skipped. This feature is called shortcut semantics. The and element succeeds if all of the subtests it contains succeed. The and element also implements shortcut semantics, so all subtests after the first failure are skipped. The not element succeeds if the one subtest it contains fails. (Unlike the other two, not can only contain one subtest.) In addition, the or-all and and-all elements act just like their counterparts above, with one important difference: shortcut semantics do not apply. All children of these two elements are tested regardless of the success or failure of any one child subtest.

Using these elements, we can make tests that are even more powerful. For example, imagine that the greeting in the test we just solved could be either Good morning, {{name}} or Good afternoon, {{name}}. We could test for this condition by replacing

      <content match="substr" label="Greeting"><![CDATA[
<P>Hello, {{name}}</P>
]]></content>

with

      <or label="Greeting">
        <content match="substr" label="Morning Greeting"><![CDATA[
<P>Good morning, {{name}}</P>
]]></content>
        <content match="substr" label="Afternoon Greeting"><![CDATA[
<P>Good afternoon, {{name}}</P>
]]></content>
      </or>
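The not element is just as straightforward. For example, to assert that an error message never appears in the response (the message text here is just an illustration), we could write:

      <not label="No error reported">
        <content match="substr">An error has occurred</content>
      </not>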

Handling Critical Errors

There are some times when you need to report more information to the tester than HTTPtest normally supplies. There are also times when the results of one test can determine whether or not it makes sense to perform any other test. HTTPtest provides the msg element to handle these issues.

The msg element performs two interrelated tasks. It displays a message to the standard output when it is encountered. It also determines whether we should fail the test, stop processing the current battery file, or exit the entire script. When combined with the logical operators described above, this element makes for even more powerful test scripts.

For example, suppose you have a battery of tests that modifies some persistent storage through an HTTP interface, such as a shopping cart application. Before you begin adding and removing things from the cart, you would want to verify its initial state. If you have a feature that allows you to completely empty the cart, you could use that as your first test. Or you could test to see if the cart is empty at the beginning, and always make sure to leave it empty at the end of the battery of tests. In either case, it would be very nice to know if something was in the cart prior to the start of these tests. Having the cart in a non-empty state could mean any of the following things:

  • The previous invocation of the script did not complete successfully.
  • Someone was testing by hand and did not clean up afterwards.
  • A real user is somehow using the account we test with.
  • A mistake in the battery file causes it to point to a real user's shopping cart.
  • A bug in the code is showing the wrong cart.
  • Something else unexpected has happened.

In any case, the tester should probably be notified and the tests halted. There is no reasonable way that HTTPtest could be configured to deal with all of these circumstances. Using a subtest that matches the state we expect and the msg element combined in an or element, we can bail out of the tests if things are not in a reasonable state and allow the tester to take corrective action.

This is where the shortcut semantics of the logical operators are especially useful. Let's say we wanted to verify that our shopping cart is empty and abort with a message if it is not. If the string Your cart is empty appears whenever the cart is empty, the appropriate subtests would be:

      <or label="Cart is empty">
        <content match="substr" label="Empty string">Your cart is empty</content>
        <msg print="always" action="stop"><![CDATA[
    Shopping cart was not empty!
    Stopping battery...
]]></msg>
      </or>

Now, if the string Your cart is empty is in the text, the msg element is never reached. If the string is missing, the message is printed and the battery is abandoned.

Another valid use of the msg element is to report a warning condition. Perhaps some condition is technically a success but might bear further investigation. In that case, a subtest that matches the warning condition and a msg element with an action of success, wrapped in an and element, gives the tester the appropriate warning and continues with the other tests.
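One way to sketch a warning like this uses an or with a not, so that the absence of the warning string also counts as success and only the warning case reaches the msg element. The low-stock string here is a hypothetical example:

```xml
      <or label="Low stock check">
        <not label="No low-stock notice">
          <content match="substr" label="Low stock string">Only a few left in stock</content>
        </not>
        <msg print="always" action="success"><![CDATA[
    Warning: the low-stock notice appeared.
    Continuing with the other tests...
]]></msg>
      </or>
```

If the notice is absent, the not succeeds and the or shortcuts past the msg. If the notice is present, the not fails, the msg prints its warning, and the success action lets the battery continue.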

Testing with Code

No matter how complete the test scripting environment, there is always some test that requires more than the environment can provide. To help overcome this problem, HTTPtest provides the code subtest and the codelib element. The contents of the code element are evaluated as Perl source code. The return value of this piece of code is either a true/false value or a list containing the true/false value, a short message for display, and a longer display message.

This Perl code can do anything. It can check the current date and time. It can make a call across the Internet to retrieve data to compare against. It can check a local database to compare against what is sent in the response. It can do anything that Perl is capable of.

The codelib element is used either to define subroutines or variables for use in one or more tests, or to execute code that is not part of a test. A good example of the second case is Perl code that initializes the value of a param element to something that is not static but is easily calculated, like today's date.
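As a sketch of that second use (assuming that codelib code can set parameters through the %Params hash that code subtests use):

```xml
  <codelib label="Set today's date"><![CDATA[
      # Illustrative sketch: initialize the date parameter to today's
      # date in MM/DD/YYYY form. Assumes codelib code can set
      # parameters through the %Params hash.
      my ($d, $m, $y) = (localtime)[3..5];
      $Params{date} = sprintf '%02d/%02d/%04d', $m+1, $d, $y+1900;
]]></codelib>
```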

There are a few pieces of information that make building code subtests easier. First of all, the returned response is available to you in the Perl variable $resp. This variable is an HTTP::Response object, so consult the documentation for that module. It is the real response: if you make any changes to it, those changes are visible in any subtests that follow the code.

In the Perl code, you also have access to the value of the param elements. They are stored in a global hash named %Params. One good use for a code subtest is extracting data from a response and placing it in a parameter for use in another test.
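A sketch of that extraction pattern follows. The session field is hypothetical, used only to illustrate capturing a value for later tests:

```xml
      <code label="Capture session id"><![CDATA[
          # Pull a (hypothetical) session id out of the page and save
          # it so later requests can refer to it as {{session}}.
          if ( $resp->content =~ /session=(\w+)/ ) {
              $Params{session} = $1;
              1;
          }
          else {
              ( 0, 'no session id',
                'Could not find a session id in the response' );
          }
]]></code>
```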

To show how this element works, we'll add a codelib that defines a subroutine and a code element that verifies the answer in the response. Let's start with the simple version of the dynamic test from a few sections ago.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="regexp"><![CDATA[<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Test Page</H1>
<P>Hello, {{name}}</P>
<P>{{date}} was \d+ days ago</P>
</BODY></HTML>]]></content>
    </response>
  </test>

To this, we add a code element that verifies the answer in the response.

  <test label="Success: name and date">
    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;date={{date}}"/>
    <response>
      <status code="200"/>
      <content match="regexp"><![CDATA[<HTML><HEAD><TITLE>Test Page</TITLE>
</HEAD><BODY>
<H1>Test Page</H1>
<P>Hello, {{name}}</P>
<P>{{date}} was \d+ days ago</P>
</BODY></HTML>]]></content>
      <code label="Verify Answer"><![CDATA[
          $resp->content =~ /(\d+) days/;
          my $days = $1;

          compare_day_offset( $Params{date}, -$days );
]]></code>
    </response>
  </test>

To make this work, we need the subroutine compare_day_offset() which does the actual comparison. This subroutine is defined in a codelib element that we define at the top of the battery. That element should look like this.

  <codelib label="Compare day offset" id="compare_day_offset"><![CDATA[
      sub date_from_today
      {
          my $days        = shift;
          my $secs_offset = $days * 3600 * 24; # convert days to seconds

          # get date of $days offset
          my ($d, $m, $y) = (localtime( time + $secs_offset ))[3..5];
          sprintf '%02d/%02d/%04d', $m+1, $d, $y+1900;
      }

      sub compare_day_offset
      {
          my $date = shift;
          my $days = shift;
          my $then = date_from_today( $days );

          # compare calculated date with supplied date
          $then eq $date;
      }
]]></codelib>

There are several other ways to perform this test. Some could have been accomplished with a single code element and no codelib elements. Others could have been made with just one codelib element and no code elements. In any case, this example is probably as good as any other.
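For instance, a single self-contained code element, with no codelib, might inline the date arithmetic directly. A sketch of one such alternative:

```xml
      <code label="Verify Answer (self-contained)"><![CDATA[
          # Same check as the code/codelib pair above, but with the
          # date arithmetic inlined into one code element.
          $resp->content =~ /(\d+) days/ or return 0;
          my $days = $1;

          my ($d, $m, $y) = (localtime( time - $days * 3600 * 24 ))[3..5];
          sprintf( '%02d/%02d/%04d', $m+1, $d, $y+1900 ) eq $Params{date};
]]></code>
```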

Parameter Expressions

Short Perl expressions can also be used with the parameter reference syntax. This is a good way to dynamically generate parameters if needed. If the text of a parameter reference is not an identifier, it is evaluated as a Perl expression. The value of the expression then replaces the reference in the text. For example, in this request

    <request method="GET" url="http://{{server}}/cgi-bin/testpage.cgi?name={{name}}&amp;day={{3 + 4}}"/>

the value of the day parameter would be 7.

Filtering Output

In one of the early drafts of the HTTPtest specification, a feature was considered that would allow for various kinds of filtering of the response. An attribute of the response element would specify one of several transformations on the response content. These would have included collapsing whitespace, dealing with line endings, case-folding HTML tags, and others. After some thought, this feature was dropped for two reasons:

  • There is no way to list all of the transformations that might be needed.
  • The code element could handle anything that might be needed by modifying the $resp variable directly.

After using the code element this way in a few battery files, I discovered an annoyance and a fundamental flaw in this approach. The annoying part was that I always had to remember to add a 1; at the end of the filter to make it succeed. The fundamental flaw involved maintenance and understanding: these filtering code elements look just like testing code elements, and that is a source of confusion.

So filtering came back in a new form. In order to allow for any kind of transformation, the filter element contains Perl source like a code element. However, a filter element always succeeds, and its name tells you that it is not a normal subtest.

There are two modes of operation for the filter element. The default mode is content, which is chosen either by specifying no class attribute or by giving a class attribute with the value content. In this mode, the content of the response is copied to the Perl variable $_. When the filter completes, the content of the response is replaced with the value of $_. For example, the following filter removes trailing whitespace from every line.

  <filter label="Remove trailing whitespace" class="content"><![CDATA[
       s/\s+$//smg;
]]></filter>

Notice that, by putting the content in $_, filter expressions can be quite small.

The other mode of operation is activated by a class attribute with the value response. In this mode, you have access to the $resp object and can do whatever you want to it. Not only can you modify the content, but you can also change the headers, status code, or any other part of the object. See the HTTP::Response documentation for further information.
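For example, a response-mode filter could strip headers that change on every request, so that subtests comparing the full response stay stable. This is a sketch; the header names are just common examples:

```xml
  <filter label="Drop volatile headers" class="response"><![CDATA[
       # Remove headers that differ on every request so that
       # subsequent comparisons are repeatable.
       $resp->remove_header( 'Date' );
       $resp->remove_header( 'Set-Cookie' );
]]></filter>
```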

Filters must be used with great care. In some cases, they can drastically simplify the work you need to do in testing. On the other hand, the response after the filter is different from what it was before the filter. This can cause the tests that follow the filter to fail if you do not account for the filter's effects.

Including Files

With any text-based script or program, a situation often arises where some of the text is repeated in multiple locations. Many languages support some form of include mechanism to reduce repeated copies of code. HTTPtest battery files support inclusion as well, through the include element. There are three ways that an HTTPtest battery of tests normally benefits from include elements.

  • Parameter reuse
  • Test reuse
  • Code library reuse

Parameter reuse

There are times when it is useful to separate a battery of tests into separate files so that different batteries of tests can be run at different times. In this case, you may want to put the parameters that are common between the two batteries into a single include file. That file is then included in each battery to maintain consistency between them. For example, if we had multiple battery files for a shopping cart application, we might want to store the user identity and password in a parameter include file like the following.

  <?xml version="1.0"?>
  <inclusion>
    <param name="user"     value="fred"/>
    <param name="password" value="sUperSEcRet"/>
  </inclusion>

If this were saved in the file identity.inc, all of the battery files could include it by adding the following line to the top of the battery content.

  <include file="identity.inc"/>

Test reuse

On occasion, I have had to write tests that I wished to run in two completely different ways, based on two different sets of parameters. In the cases in question, I could have used command line options to redefine the parameters, but that would have required setting around a dozen parameters on the command line. Instead, I factored all of the tests out of the battery file and put them in a single include file. Next, I built two separate battery files, one with each set of parameters defined. Finally, each battery included the file with the tests.

This resulted in battery files that were slightly harder to understand, but much easier to maintain. This is probably not a technique that will be needed very often. However, it is a good tool to have when you need it.

Code library reuse

Probably the best use for include files is to reuse codelib elements. These elements are designed for defining subroutines that are used in subsequent code and filter elements. It is likely that some of these subroutines will be useful in more than one battery of tests. By factoring these elements out of your main file into a separate library file, you can include these routines wherever you need them.

One important thing to remember when writing codelib include files is the id attribute. If a codelib has an id attribute, HTTPtest compares its value with that of all other id attributes it has encountered during this test run. If HTTPtest finds a match, it does not evaluate the codelib. This allows you to prevent redefinition of subroutines (with its attendant warning message).

Any codelib without an id attribute is always evaluated, even if it has been seen before. This allows us to define include files that define subroutines (if needed) and then execute them in the context of the current battery (always).
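Putting those two rules together, a library include file might define its subroutines in a codelib with an id (evaluated once per run) and do per-battery setup in a codelib without one (evaluated every time). A sketch, reusing the date_from_today subroutine from earlier:

```xml
  <?xml version="1.0"?>
  <inclusion>
    <codelib label="Date helpers" id="date_helpers"><![CDATA[
        # Evaluated only once per test run because of the id attribute.
        sub date_from_today
        {
            my $days = shift;
            my ($d, $m, $y) = (localtime( time + $days * 3600 * 24 ))[3..5];
            sprintf '%02d/%02d/%04d', $m+1, $d, $y+1900;
        }
]]></codelib>
    <codelib label="Reset date param"><![CDATA[
        # No id, so this runs in every battery that includes the file.
        $Params{date} = date_from_today( -10 );
]]></codelib>
  </inclusion>
```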

Bringing it all together

With every battery I've written for HTTPtest so far, as I wrote tests I discovered other tests that I missed. You will probably find that the same thing happens on any battery files you build. It is a good idea to add these tests if they contribute to the overall quality of the test. Here are the other tests I thought of as I wrote this tutorial and the battery file that goes with it.

  • Call with an invalid day value
  • Call with a negative day value
  • Call with a zero day value
  • Call with an invalid date value
  • Call with a date value having a 2 digit year
  • Call with a date value in the future
  • Call with a date value of today

A completed battery file for this application is available to give you an idea of the results of this explanation. It covers almost every test described in this tutorial. Where multiple versions of a test were described, I chose the one that made the most sense to me. Only the tests for testpage.cgi are included; none of the references to the fictitious shopping cart application appear in this file.

Glossary

battery
A set of tests for a related set of functionality.
battery file
An XML file that contains a battery of tests.
regression test
A test or set of tests intended to verify that functionality of a system has not changed.
subtest
One of the set of test criteria to apply to a response in order to determine if the response is valid.
test
A request and a set of test criteria to validate a response.
test suite
A set of one or more battery files that verify the functionality of a system.

Further information

For further information, check the HTTPtest Documentation, HTTPtest DTD, or the HTTPtest.pl source code itself. If you have any further questions, feel free to email me at the address below.
