Adding new test cases

When adding a test case, follow these guidelines, then refer to the more specific examples below for a single module test case and a multiple module test case. All test cases should reside under the testsuite/tests/ directory. From now on we assume that directory as our root.

  1. Find the appropriate place for the test case. The GHC regression suite is generally organised in a "white-box" manner: a regression test which originally illustrated a bug in a particular part of the compiler is placed in the directory for that part. For example, typechecker regression test cases go in the typecheck/ directory, parser test cases go in parser/, and so on.

It's not always possible to find a single best place for a test case; in those cases just pick one which seems reasonable.

Under each main directory there are usually up to three subdirectories:

should_compile: test cases which need to compile only
should_fail: test cases which should fail to compile and generate a particular error message
should_run: test cases which should compile, run with some specific input, and generate a particular output.

We don't always divide the test cases up like this, and it's not essential to do so. The directory names have no meaning as far as the test driver is concerned; they are simply a convention.

  2. Having found a suitable place for the test case, give the test case a name. For regression test cases, we often just name the test case after the bug number (e.g. T2047). Alternatively, follow the convention for the directory in which you place the test case: for example, in typecheck/should_compile, test cases are named tc001, tc002, and so on. Suppose you name your test case T; you will then have the following files:


T.hs

The source file(s) containing the test case. Details on how to handle single vs. multiple source test cases are explained below.

T.stdin (for test cases that run, and optional)

A file to feed the test case as standard input when it runs.

T.stdout (for test cases that run, and optional)

For test cases that run, this file is compared against the standard output generated by the program. If T.stdout does not exist, then the program must not generate anything on stdout.

T.stderr (optional)

For test cases that run, this file is compared against the standard error generated by the program.

For test cases that compile only, this file is compared against the standard error output of the compiler, which is normalised to eliminate bogus differences (e.g. absolute pathnames are removed, whitespace differences are ignored, etc.).
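To make the idea of normalisation concrete, here is a minimal sketch of the kind of clean-up described above. It is not the test driver's actual code: the function name normalise, the regular expression, and the sample messages are assumptions chosen purely for illustration.

```python
import re

def normalise(text):
    """Roughly normalise compiler output before comparison (illustrative only)."""
    # Strip the leading directories from anything that looks like an absolute path.
    text = re.sub(r'(?:/[^\s/:]+)+/([^\s/:]+)', r'\1', text)
    # Collapse runs of whitespace so that spacing differences are ignored.
    return ' '.join(text.split())

# Hypothetical expected and actual error output for a test case named T.
expected = "T.hs:3:5: Not in scope: `foo'"
actual = "/home/user/ghc/testsuite/tests/typecheck/should_fail/T.hs:3:5:   Not in scope: `foo'\n"

print(normalise(actual) == normalise(expected))
```

With this kind of clean-up, the two messages compare equal even though one carries an absolute path and extra spacing.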

  3. Edit all.T in the relevant directory and add a line for the test case. The line is always of the form
    test(<name>, <setup>, <test-fn>, <args...>)
    The format of this line is explained in more detail below, as it differs between test case types. It lets you say whether the test case should fail to compile, run fine, run but terminate with a certain exit code, etc. The <args...> argument is a list whose length and format depend on the <test-fn> you use. The choice of <test-fn> largely depends on how complex it is to build your test case: the <test-fn> specifies a build method more than anything else.

Note also that the all.T file is simply a Python source file that gets executed by the test framework, so any valid Python code may appear in it.

Below we will look at some of the more common test case setups.

A single module test case

A single module test case is very easy. Simply give the Haskell source file the same name as your test (so T.hs in our running example).

Then for a test case that should compile and run fine we would put this line in all.T:

test('cgrun001', normal, compile_and_run, [''])

For a test case that should compile but that you don't want to run, we would put this line in all.T:

test('cg002', normal, compile, [''])

For a test case that should fail during compilation we would put this line in all.T:

test('drvfail001', normal, compile_fail, [''])

For more detailed control of a test case, see the section on the format of the test entries in all.T below.

A multiple module test case

A multiple module test case is slightly more complex than a single module one. First we are concerned with the simplest form of a multiple module test case, that is, one where the whole test case can be built in one go using GHC's --make mode. If you have more complex needs (such as compiling source files that --make can't handle, or compiling different modules with different GHC arguments), then see the advanced multiple module test case below.

Then for a test case that should compile and run fine we would put this line in all.T:

test('multimod001', normal, multimod_compile_and_run,
              ['Main', '-fglasgow-exts'])

This example would compile a multiple module test case where the top module is Main.hs and -fglasgow-exts is passed to GHC when compiling.

For a test case that should compile but that you don't want to run, we would put this line in all.T:

test('T3286', extra_clean(['T3286b.o','T3286b.hi']), 
              multimod_compile, ['T3286', '-v0'])

This example would compile a multiple module test case where the top module is T3286, and the files T3286b.o and T3286b.hi are cleaned up after the test has completed.

For a test case that should fail during compilation we would put this line in all.T:

test(<name>,
     extra_clean(['OverA.hi', 'OverA.o',
                  'OverB.hi', 'OverB.o',
                  'OverC.hi', 'OverC.o']),
     multimod_compile_fail,
     ['OverD', '-no-hs-main -c -v0'])

Advanced multiple module test case

If you have a test case that can't be built with the two simpler methods described above, then you should try the method described below. It allows you to explicitly provide a list of (source file, GHC flags) tuples, which GHC then builds in the order you specify. This is useful, for example, for test cases that use a .cmm or .c source file: GHC can build these files, but --make doesn't pick them up.

Then for a test case that should compile and run fine we would put this line in all.T:

test('cgrun069', omit_ways(['ghci']), multi_compile_and_run,
                 ['cgrun069', [('cgrun069_cmm.cmm', '')], ''])

This test case relies on a .cmm file, hence it can't use the simpler multimod_compile_and_run <test-fn>. We also see here how to stop a test case from running in a certain way (here, ghci).

For a test case that should compile but that you don't want to run, we would put this line in all.T:

test('Check02', normal, multi_compile, ['Check02', [
                                       ('Check02_A.hs', ''),
                                       ('Check02_B.hs', '')
                                       ], '-trust base'])

For a test case that should fail during compilation we would put this line in all.T:

test('Check01', normal, multi_compile_fail, ['Check01', [
                                            ('Check01_A.hs', ''),
                                            ('Check01_B.hs', '-trust base')
                                            ], ''])

This test case must use the multi_compile_fail method as it relies on being able to compile the file Check01_B.hs with the argument '-trust base' but not compile any of the other files with this flag.

Format of the test entries in all.T

Each test in an all.T file is specified by a line of the form

test(<name>, <setup>, <test-fn>, <args...>)

Where <args...> is a list of arguments.

The <name> field

<name> is the name of the test, in quotes (' or ").

The <setup> field

<setup> is a function (i.e. any callable object in Python) which allows the options for this test to be changed. There are many pre-defined functions which can be used in this field:

  • normal don't change any options from the defaults
  • skip skip this test
  • omit_ways(ways) skip this test for certain ways
  • only_ways(ways) do this test certain ways only
  • extra_ways(ways) add some ways which would normally be disabled
  • expect_broken(bug) this test is expected not to work due to the indicated Trac bug number
  • expect_broken_for(bug, ways) as expect_broken, but only for the indicated ways
  • set_stdin(file) use a different file for stdin
  • no_stdin use no stdin at all (otherwise use /dev/null)
  • exit_code(n) expect an exit code of 'n' from the prog
  • extra_run_opts(opts) pass some extra opts to the prog
  • extra_clean(files) extra files to clean after the test has completed
  • reqlib(P) requires package P
  • req_profiling requires profiling
  • ignore_output don't try to compare output
  • compile_timeout_multiplier(n) and run_timeout_multiplier(n) modify the default timeout (usually 300s, displayed at the beginning of the testsuite) by a given factor for either the compile or the run part of your test. Note that the timeout program returns with exit code 99 when it kills your test. So if you want a timeout to mean success instead of failure, add exit_code(99) as a setup function.
  • high_memory_usage this test uses a lot of memory (allows the testsuite driver to be intelligent about what it runs in parallel)
  • literate look for a .lhs file instead of a .hs file
  • c_src look for a .c file
  • objc_src look for a .m file
  • objcpp_src look for a .mm file
  • pre_cmd(string) run this command before running the test (this is preferred over the following 3 where it is possible to use it)
  • cmd_prefix(string) prefix this string to the execution command when run
  • cmd_wrapper(f) applies f to the execution command and runs the result instead
  • normalise_slashes convert backslashes to forward slashes before comparing the output
  • when(predicate, f) Do f, but only if predicate is True
  • unless(predicate, f) Do f, but only if predicate is False

There are a number of predicates which can be used:

  • doing_ghci() GHCi is available
  • ghc_dynamic() GHC is compiled with -dynamic (usually via DYNAMIC_GHC_PROGRAMS=YES)
  • fast() the testsuite is running in "fast" mode
  • platform(plat) the testsuite is running on platform plat (which could be 'x86_64-unknown-mingw32' etc)
  • opsys(os) the testsuite is running on operating system os (which could be 'mingw32' etc)
  • arch(a) the testsuite is running on architecture a (which could be 'x86_64' etc)
  • wordsize(w) the testsuite is running on a platform with word size w bits (which could be 32 or 64)
  • msys() the testsuite is running on msys
  • cygwin() the testsuite is running on cygwin
  • have_vanilla() the compiler has built vanilla libraries
  • have_dynamic() the compiler has built dynamic libraries
  • have_profiling() the compiler has built profiling libraries
  • in_tree_compiler() the compiler being tested is in a source tree, as opposed to installed
  • compiler_type(ct) a compiler of type ct (which could be 'ghc', 'hug', etc) is being tested
  • compiler_lt(ct, v) compiler type is ct, and the version is less than v
  • compiler_le(ct, v) compiler type is ct, and the version is less than or equal to v
  • compiler_gt(ct, v) compiler type is ct, and the version is greater than v
  • compiler_ge(ct, v) compiler type is ct, and the version is greater than or equal to v
  • unregisterised() the compiler is unregisterised
  • compiler_profiled() the compiler is built with a profiling RTS
  • compiler_debugged() the compiler is built with -DDEBUG
  • tag(t) the compiler has tag t

The following two setup functions should normally not be used; instead, use the expect_broken* functions above so that the problem or unfinished feature doesn't get forgotten about.

  • expect_fail this test is an expected failure, i.e. the compiler, testdriver, OS or platform is missing a certain feature, and we don't plan to or can't fix it now or in the future. When used, it should usually be in combination with a specific OS or platform type (e.g. when(opsys('mingw32'), expect_fail) or when(platform('i386-unknown-mingw32'), expect_fail)). Otherwise, mark it as expect_broken.
  • expect_fail_for(ways) expect failure for certain ways

There are a number of predefined lists of the ways meeting various criteria:

  • prof_ways ways in which the program is built with profiling enabled
  • threaded_ways ways in which the program is linked with the threaded runtime (or run in ghci)
  • opt_ways ways in which the program is built with optimization enabled
  • llvm_ways ways in which the program is built with the LLVM backend

To use more than one modifier on a test, just put them in a list. For example, to expect an exit code of 3 and omit way 'opt', we could use

[ omit_ways(['opt']), exit_code(3) ]

as the <setup> argument.
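As a rough mental model of how a list of setup functions combines, consider the sketch below. It is a simplified illustration and not the driver's implementation: the Opts class, the apply_setup helper, and the exact signatures of the setup functions are assumptions made so the example is self-contained.

```python
class Opts:
    """Hypothetical per-test options record with two of the possible settings."""
    def __init__(self):
        self.omit_ways = []
        self.exit_code = 0

def normal(opts):
    """Leave the options at their defaults."""

def omit_ways(ways):
    def setup(opts):
        opts.omit_ways = ways
    return setup

def exit_code(n):
    def setup(opts):
        opts.exit_code = n
    return setup

def when(predicate, f):
    # Do f, but only if predicate is True; otherwise change nothing.
    return f if predicate else normal

def apply_setup(setup, opts):
    """Apply a single setup function, or each element of a list in order."""
    if isinstance(setup, list):
        for s in setup:
            apply_setup(s, opts)
    else:
        setup(opts)

opts = Opts()
apply_setup([omit_ways(['opt']), exit_code(3)], opts)
print(opts.omit_ways, opts.exit_code)

other = Opts()
apply_setup(when(False, exit_code(99)), other)  # predicate is False: no change
print(other.exit_code)
```

Each modifier only touches the option it is responsible for, which is why they can be listed together freely.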

Performance tests

Performance tests can specify ranges for certain statistics in the <setup> field. Here's an example test:

test('perf001',
     [ compiler_stats_num_field('bytes allocated', [(wordsize(32), 40000000, 10), (wordsize(64), 79110184, 10)]) ],
     compile, [''])

This is testing the performance of GHC itself, and requiring that the statistic 'bytes allocated' for the compiler when compiling the module perf001.hs is within 10% of 40000000 bytes (on a 32-bit machine; there is a different baseline for 64-bit machines).

The kinds of constraint that can be used are:

  • compiler_stats_num_field(stat, expecteds) tests the performance of GHC, and should be used with compile or compile_fail tests. stat is one of the following: 'bytes allocated', 'peak_megabytes_allocated', or 'max_bytes_used'; expecteds is a list of triples. Each triple has the form: (predicate, baseline, deviation). predicate is a boolean value indicating which triple to use. In the above example, if the machine word size is 32 bits, the first triple's baseline and deviation values will be used. If the word size is 64 bits, the second triple's values will be used. baseline is the baseline value obtained by running the benchmark, and deviation is the percentage deviation from the baseline that the framework will allow for the test to pass. Setting this constraint will skip the test if -DDEBUG is on (i.e. compiler_debugged() is true), as the numbers are worthless then.
  • stats_num_field(stat, expecteds) is the same, but tests the performance of the program, not the compiler. It should be used in conjunction with a compile_and_run test.
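The arithmetic behind the (predicate, baseline, deviation) triples can be sketched as follows. This is an illustration of the check as described above, not the framework's own code: the helper within_deviation, the stand-in wordsize with its assumed 64-bit default, and the measured values are all hypothetical.

```python
def within_deviation(measured, expecteds):
    """Check measured against the first triple whose predicate is true.

    Each triple is (predicate, baseline, deviation), where deviation is the
    allowed percentage difference from the baseline.
    """
    for predicate, baseline, deviation in expecteds:
        if predicate:
            lower = baseline * (100 - deviation) / 100
            upper = baseline * (100 + deviation) / 100
            return lower <= measured <= upper
    raise ValueError('no triple applies to this configuration')

def wordsize(w, actual=64):
    # Stand-in predicate; we assume a 64-bit machine for this example.
    return w == actual

expecteds = [(wordsize(32), 40000000, 10), (wordsize(64), 79110184, 10)]

print(within_deviation(80000000, expecteds))  # inside the 64-bit range
print(within_deviation(90000000, expecteds))  # more than 10% above the baseline
```

On the assumed 64-bit machine the second triple applies, so the accepted range is roughly 71.2 to 87.0 million bytes.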

The <test-fn> field

<test-fn> is a function which describes how the test should be built and maybe run. It also determines the number of arguments for <args...>. Each function comes in three forms:

  • test-fn: Compiles the program, expecting compilation to succeed.
  • test-fn_fail: Compiles the program, expecting compilation to fail.
  • test-fn_and_run: Compiles the program, expecting it to succeed, and then runs the program.

The test functions mostly differ in how they compile the test case. The simplest test functions can only compile single file test cases, while the most complex test function can compile a multi file test case with different flags for each file. The possible test functions are:

  • compile, compile_fail, compile_and_run:
    This is the simplest test function and can only handle compiling a single module test case. The source file to compile must correspond to the <name> of the test.

<args...> = [<extra_hc_opts>]


<extra_hc_opts>: arguments to pass to GHC when it compiles your test case.

  • multimod_compile, multimod_compile_fail, multimod_compile_and_run:
    Compile a multi-module program using the GHC --make build system.

<args...> = [<topmod>, <extra_hc_opts>]


<topmod>: The top level source file for your test case.
<extra_hc_opts>: arguments to pass to GHC when it compiles your test case.

  • multi_compile, multi_compile_fail, multi_compile_and_run:
    Compile a multi source test case. This is for cases where the GHC --make build system is not enough, such as when you first need to compile a .c or .cmm file before compiling the Haskell top level module.

<args...> = [<topmod>, [(<extra_mod>, <hc_opts>)], <extra_hc_opts>]


<topmod>: The top level source file for your test case.
[(<extra_mod>, <hc_opts>)]: A list of tuples where the first element is a source file for GHC to compile and the second element is the arguments GHC should use to compile that particular source file.
<extra_hc_opts>: arguments to pass to GHC when it compiles your test case (applied to all source files).

  • run_command Just run an arbitrary command. The output is checked against T.stdout and T.stderr (unless ignore_output is used). The expected exit code can be changed using exit_code(N). NB: run_command only works in the normal way, so don't use only_ways with it.
  • ghci_script Runs the current compiler, passing --interactive and using the specified script as standard input.
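The relationship between the three families of compile functions and the shape of <args...> can be summarised in a small table. The sketch below is a reading aid only: the ARG_COUNT table and the args_match helper are hypothetical and do not exist in the test driver, and the function names are represented as strings so the snippet runs on its own.

```python
# Number of elements expected in <args...> for each family of test functions,
# taken from the descriptions above.
ARG_COUNT = {}
for base, count in [('compile', 1), ('multimod_compile', 2), ('multi_compile', 3)]:
    # Each function comes in three forms: plain, _fail and _and_run.
    for suffix in ('', '_fail', '_and_run'):
        ARG_COUNT[base + suffix] = count

def args_match(test_fn_name, args):
    """Hypothetical helper: does the list have the length this function expects?"""
    return len(args) == ARG_COUNT[test_fn_name]

print(args_match('compile_and_run', ['']))
print(args_match('multimod_compile', ['T3286', '-v0']))
print(args_match('multi_compile_fail',
                 ['Check01', [('Check01_A.hs', ''), ('Check01_B.hs', '-trust base')], '']))
print(args_match('compile', ['T', '-O']))  # too many elements for compile
```

The first three calls match the documented shapes; the last one does not, because compile takes only <extra_hc_opts>.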

Sample output files

Normally, the sample stdout and stderr for a test T go in the files T.stdout and T.stderr respectively. However, sometimes a test may generate different output depending on the platform or word-size. For this reason the test driver looks for sample output files using this pattern:


Any combination of the optional extensions may be given, but they must be in the order specified. The most specific output file that matches the current configuration will be selected; for example if the platform is i386-unknown-mingw32 then T.stderr-i386-unknown-mingw32 will be picked in preference to T.stderr.
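The "most specific file wins" rule can be sketched for the platform case from the example above. This is a simplified illustration that only considers a platform suffix; the helper find_expected and the second platform name are assumptions, and the real driver may consider further extensions.

```python
import os
import tempfile

def find_expected(basename, platform, directory):
    """Return the most specific sample file that exists, or None.

    Only two candidates are tried here: the platform-specific file first,
    then the generic one.
    """
    for candidate in (basename + '-' + platform, basename):
        if os.path.exists(os.path.join(directory, candidate)):
            return candidate
    return None

with tempfile.TemporaryDirectory() as d:
    # Create a generic sample file and one specific to a single platform.
    for fname in ('T.stderr', 'T.stderr-i386-unknown-mingw32'):
        open(os.path.join(d, fname), 'w').close()

    picked_mingw = find_expected('T.stderr', 'i386-unknown-mingw32', d)
    picked_other = find_expected('T.stderr', 'x86_64-unknown-linux', d)

print(picked_mingw)
print(picked_other)
```

On the platform that has its own sample file that file is chosen; every other platform falls back to T.stderr.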

Threaded Considerations

The testsuite has fairly good support for running tests in parallel using a thread pool whose size is specified by THREADS=<value>. This does mean you need to be careful when writing test cases to keep them independent of each other. You are usually not able to share files between test cases, as they can run in arbitrary order and will easily conflict with each other. If you must write test cases that are dependent on each other, be sure to use the high_memory_usage setup function, which ensures that a test case runs by itself in the main testsuite thread. All dependent test cases should use the high_memory_usage setup function. Try not to do this extensively though, as it means we can't easily speed up the testsuite by throwing cores at it.

Last modified on Feb 10, 2016 8:05:55 PM