Changes between Version 33 and Version 34 of Building/RunningTests

Jul 8, 2011 10:23:58 PM (6 years ago)

Rewrite of testsuite page


  • Building/RunningTests

    v33 v34  
    1 [[PageOutline]]
    21= GHC Test framework =
    4 NOTE: you need GNU make and Python (any version >= 1.5 will probably do) in order
    5 to use the testsuite. If you want to run the testsuite in parallel then you need Python 2.5.2 or later.
    6 (Avoid Python 2.6.1 as the testsuite tickles a bug in one of the included libraries)
     3GHC includes a comprehensive testsuite for catching any regressions.
    8 If you have not checked out the test suite, first run:
     5The testsuite relies primarily on '''GNU Make''' and '''Python'''. Any Python version >= 2.5.2 will do, but avoid Python 2.6.1, as the testsuite tickles a bug in one of the included libraries.
     7If you have not checked out the testsuite, first run:
    10 ./sync-all --testsuite get
     9$ ./sync-all --testsuite get
    13 If you just want to run the whole test suite, then in the root of the tree running
     12If you just want to run the whole testsuite, then in the root of the GHC tree, typing:
    15 make test
     14$ make test
    1716will do a run in "fast" mode (which gives an idea whether there are major problems), or
    19 make fulltest
     18$ make fulltest
    21 will do a full testsuite run (more thorough, but takes a lot longer).
     20will do a full testsuite run (more thorough, but takes a lot longer). You should expect no test case failures in "fast" mode, as that is a quality level that all GHC developers are expected to maintain when they check in code. There will usually be some test case failures for the full testsuite run though.
    23 Below we will explain how to get finer control of the test suite.
     22== Using the Testsuite ==
    25 == Detail ==
     24 * [wiki:Building/RunningTests/Running Running the testsuite]
     25 * [wiki:Building/RunningTests/Settings Testsuite Settings and WAYS]
     26 * [wiki:Building/RunningTests/Updating Updating test case results]
     27 * [wiki:Building/RunningTests/Adding Adding new test cases]
    27 To run the test suite against a GHC build in the same source tree:
    28 {{{
    29         cd testsuite/tests/ghc-regress
    30         make
    31 }}}
    32 (from now on, we'll assume that you're in the tests/ghc-regress
    33 directory).
     29== Problems running the testsuite ==
    35 To run a fast version of the testsuite, which should complete in under
    36 5 minutes on a fast machine with an optimised GHC build:
    37 {{{
    38         make fast
    39 }}}
    40 By default the testsuite uses the stage2 compiler. If you want to use another stage
    41 (e.g. because your stage2 compiler doesn't work) then:
    42 {{{
    43         make stage=1
    44 }}}
    45 To run the test suite against a different GHC, say ghc-5.04:
    46 {{{
    47         make TEST_HC=ghc-5.04
    48 }}}
    49 To run an individual test or tests (eg. tc054):
    50 {{{
    51         make TEST=tc054
    52 }}}
    53 (you can also go straight to the directory containing the test and say
    54 'make TEST=tc054' from there, which will save some time).
     31 1. If the testsuite fails mysteriously, make sure that the {{{timeout}}} utility is working properly. This Haskell utility is compiled with the stage 1 compiler and invoked by the Python driver, which does not print a nice error report if the utility fails. This can happen if, for example, the compiler produces bogus binaries. A workaround is to compile {{{timeout}}} with a stable {{{ghc}}}.
    56 To run several tests, you just space separate them:
    57 {{{
    58         make TEST="tc054 tc053"
    59 }}}
    61 To run the tests one particular way only (eg. GHCi):
    62 {{{
    63         make WAY=ghci
    64 }}}
    65 To add specific options to the compiler:
    66 {{{
    67         make EXTRA_HC_OPTS='+RTS -K32M -RTS'
    68 }}}
    70 To save disk space you can have temporary files deleted after each test:
    71 {{{
    72         make CLEANUP=1
    73 }}}
    75 If you have python 2.5.2 or later then you can run the testsuite in parallel:
    76 {{{
    77         make THREADS=2
    78 }}}
    80 For more details, see below.
    82 = Running the testsuite with a compiler other than GHC =
    84 This doesn't work at the moment, but if it did then it would probably involve something like:
    85 {{{
    86         cd testsuite
    87         make TEST_HC=nhc98 COMPILER=nhc98
    88 }}}
    90 = Running individual tests or subdirectories of the testsuite =
    92 Most of the subdirectories in the testsuite have a Makefile.  In these
    93 subdirectories you can use 'make' to run the test driver in two
    94 ways:
    95 {{{
    96         make            -- run all the tests in the current directory
    97         make accept     -- run the tests, accepting the current output
    98 }}}
    99 The following variables may be set on the make command line:
    100 {{{
    101         TESTS                   -- specific tests to run
    102         TEST_HC                 -- compiler to use
    103         EXTRA_HC_OPTS           -- extra flags to send to the Haskell compiler
    104         EXTRA_RUNTEST_OPTS      -- extra flags to give the test driver
    105         CONFIG                  -- use a different configuration file
    106         COMPILER                -- stem of a different configuration file
    107                                 -- from the config directory [default: ghc]
    108         WAY                     -- just this way
    109 }}}
    110 The following ways are defined (for GHC, see the file config/ghc for the complete list):
    111 {{{
    112         normal                  -- no special options
    113         llvm                    -- -fllvm
    114         optc                    -- -O -fvia-C
    115         optasm                  -- -O -fasm
    116         optllvm                 -- -O -fllvm
    117         profc                   -- -O -prof -auto-all -fvia-C
    118         profasm                 -- -O -prof -auto-all -fasm
    119         ghci                    -- (run only, not compile) run test under GHCi
    120         extcore                 -- -fext-core
    121         optextcore              -- -O -fext-core
    122         threaded1               -- -threaded -debug
    123         threaded2               -- -threaded -O, and +RTS -N2 at run-time
    124         hpc                     -- -fhpc
    125         dyn                     -- -O -dynamic
    126 }}}
    127 Certain ways are enabled automatically if the GHC build in the local
    128 tree supports them.  Ways that are enabled this way are optasm, profc,
    129 profasm, threaded1, threaded2, and ghci.
    131 = Updating tests when the output changes =
    133 If the output of a test has changed, but the new output is still
    134 correct, you can automatically update the sample output to match the
    135 new output like so:
    136 {{{
    137         make accept TEST=<test-name>
    138 }}}
    139 where <test-name> is the name of the test.  In a directory which
    140 contains a single test, or if you want to update *all* the tests in
    141 the current directory, just omit the 'TEST=<test-name>' part.
    143 = Adding a new test =
    145 For a test which can be encapsulated in a single source file, follow
    146 these steps:
    148  1. Find the appropriate place for the test.  The GHC regression suite
    149     is generally organised in a "white-box" manner: a regression which
    150     originally illustrated a bug in a particular part of the compiler
    151     is placed in the directory for that part.  For example, typechecker
    152     regression tests go in the typechecker/ directory, parser tests
    153     go in parser/, and so on. 
    155  It's not always possible to find a single best place for a test;
    156  in those cases just pick one which seems reasonable.
    158  Under each main directory may be up to three subdirectories:
    159        '''should_compile''':   
    160            tests which need to compile only
    161        '''should_fail''':   
    162            tests which should fail to compile and generate a particular error message
    163        '''should_run''':
    164            tests which should compile, run with some specific input, and generate a particular output.
    166  We don't always divide the tests up like this, and it's not
    167  essential to do so (the directory names have no meaning as
    168  far as the test driver is concerned).       
    171  2. Having found a suitable place for the test, give the test a name.
    172     For regression tests, we often just name the test after the bug number (e.g. T2047).
    173     Alternatively, follow the convention for the directory in which you place the
    174     test: for example, in typecheck/should_compile, tests are named
    175     tc001, tc002, and so on.  Suppose you name your test T, then
    176     you'll have the following files:
    178       T.hs
    179         The source file containing the test
    181       T.stdin   (for tests that run, and optional)
    182         A file to feed the test as standard input when it
    183         runs.
    185       T.stdout  (for tests that run, and optional)
    186         For tests that run, this file is compared against
    187         the standard output generated by the program.  If
    188         T.stdout does not exist, then the program must not
    189         generate anything on stdout.
    191       T.stderr  (optional)
    192         For tests that run, this file is compared
    193         against the standard error generated by the program.
    195         For tests that compile only, this file is compared
    196         against the standard error output of the compiler,
    197         which is normalised to eliminate bogus differences
    198         (eg. absolute pathnames are removed, whitespace
    199         differences are ignored, etc.)
    202  3. Edit all.T in the relevant directory and add a line for the test.  The line is always of the form
    203 {{{
    204       test(<name>, <setup>, <test-fn>, <args>)
    205 }}}
    206  The format of these fields is described in the [wiki:Building/RunningTests#Formatofthetestentries next section].
    210 A multi-module test is straightforward.  It usually goes in a
    211 directory of its own (although this isn't essential), and the source
    212 files can be named anything you like.  The test must have a name, in
    213 the same way as a single-module test; and the stdin/stdout/stderr
    214 files follow the name of the test as before.  In the same directory,
    215 place a file 'test.T' containing a line like
    216 {{{
    217    test('multimod001', normal, multimod_compile_and_run, \
    218                  [ 'Main', '-fglasgow-exts', '', 0 ])
    219 }}}
    220 as described above.
    222 For some examples, take a look in tests/ghc-regress/programs.
    224 = Format of the test entries =
    226 Each test in a `test.T` file is specified by a line of the form
    227 {{{
    228       test(<name>, <setup>, <test-fn>, <args>)
    229 }}}
    231 == The <name> field ==
    233 ''<name>'' is the name of the test, in quotes (' or ").
    235 == The <setup> field ==
    237 ''<setup>''  is a function (i.e. any callable object in Python)
    238 which allows the options for this test to be changed.
    239 There are many pre-defined functions which can be
    240 used in this field:
    242  * '''normal'''                don't change any options from the defaults
    243  * '''skip'''                  skip this test
    244  * '''skip_if_no_ghci'''       skip unless GHCi is available
    246  * '''skip_if_fast'''          skip if "fast" is enabled
    248  * '''omit_ways(ways)'''       skip this test for certain ways
    250  * '''only_ways(ways)'''       do this test certain ways only
    252  * '''extra_ways(ways)'''      add some ways which would normally be disabled
    254  * '''omit_compiler_types(compilers)'''                           skip this test for certain compilers
    256  * '''only_compiler_types(compilers)'''       do this test for certain compilers only
    258  * '''expect_broken(bug)''' this test is expected not to work because of the indicated Trac bug number
    260  * '''expect_broken_for(bug, ways)''' as expect_broken, but only for the indicated ways
    262  * '''if_compiler_type(compiler_type, f)''' Do `f`, but only for the given compiler type
    264  * '''if_platform(plat, f)'''  Do `f`, but only if we are on the specific platform given
    266  * '''if_tag(tag, f)'''        do `f` if the compiler has a given tag
    268  * '''unless_tag(tag, f)'''    do `f` unless the compiler has a given tag
    270  * '''set_stdin(file)'''       use a different file for stdin
    272  * '''no_stdin'''              use no stdin at all (otherwise use `/dev/null`)
    274  * '''exit_code(n)'''          expect an exit code of 'n' from the prog
    276  * '''extra_run_opts(opts)'''  pass some extra opts to the prog
    278  * '''no_clean'''              don't clean up after this test
    280  * '''extra_clean(files)'''    extra files to clean after the test has completed
    282  * '''reqlib(P)'''             requires package P
    284  * '''req_profiling'''         requires profiling
    286  * '''ignore_output'''         don't try to compare output
    288  * '''alone'''                 don't run this test in parallel with anything else
    290  * '''literate'''              look for a `.lhs` file instead of a `.hs` file
    292  * '''c_src'''                 look for a `.c` file
    294  * '''cmd_prefix(string)'''    prefix this string to the command when run
    296  * '''normalise_slashes'''     convert backslashes to forward slashes before comparing the output
    298 The following should normally not be used; instead, use the `expect_broken*`
    299 functions above so that the problem doesn't get forgotten about, and when we
    300 come back to look at the test later we know whether current behaviour is why
    301 we marked it as expected to fail:
    303  * '''expect_fail'''           this test is an expected failure, i.e. there is a known bug in the compiler, but we don't want to fix it.
    305  * '''expect_fail_for(ways)''' expect failure for certain ways
    307 To use more than one modifier on a test, just put them in a list.
    308 For example, to expect an exit code of 3 and omit way 'opt', we could use
    309 {{{
    310       [ omit_ways(['opt']), exit_code(3) ]
    311 }}}
    312 as the `<setup>` argument.
    314 == The <test-fn> field ==
    316 ''<test-fn>''
    317 is a function which describes how the test should be
    318 run, and determines the form of <args>.  The possible
    319 values are:
    321  * '''compile'''  Just compile the program; the compilation should succeed.
    323  * '''compile_fail'''
    324    Just compile the program, the
    325    compilation should fail (error
    326    messages will be in T.stderr).
    327    This kind of failure is mandated by the language definition - it does '''not''' indicate any bug in the compiler.
    329  * '''compile_and_run'''
    330    Compile the program and run it,
    331    comparing the output against the
    332    relevant files.
    334  * '''multimod_compile'''
    335    Compile a multi-module program
    336    (more about multi-module programs
    337    below).
    339  * '''multimod_compile_fail'''
    340    Compile a multi-module program,
    341    and expect the compilation to fail
    342    with error messages in T.stderr.  This kind of failure does '''not''' indicate a bug in the compiler.
    344  * '''multimod_compile_and_run'''
    345    Compile and run a multi-module
    346    program.
    348  * '''compile_and_run_with_prefix'''
    349    Same as compile_and_run, but with a command to use when running the resulting binary.
    351  * '''multimod_compile_and_run_with_prefix'''
    352    Same as multimod_compile_and_run, but with a command to use when running the resulting binary.
    354  * '''run_command'''
    355    Just run an arbitrary command.  The output is checked
    356    against `T.stdout` and `T.stderr` (unless `ignore_output`
    357    is used), and the stdin and expected exit code can be
    358    changed in the same way as for compile_and_run.  NB: run_command only works
    359    in the '''normal''' way, so don't use '''only_ways''' with it.
    361  * '''ghci_script'''
    362    Runs the current compiler, passing
    363    --interactive and using the specified
    364    script as standard input.
    366 == The <args> field ==
    368 ''<args>'' is a list of arguments to be passed to <test-fn>.
    370 For compile, compile_fail and compile_and_run, <args>
    371 is a list with a single string which contains extra
    372 compiler options with which to run the test.  eg.
    373 {{{               
    374     test('tc001', normal, compile, ['-fglasgow-exts'])
    375 }}}
    376 would pass the flag -fglasgow-exts to the compiler
    377 when compiling tc001.
    379 The multimod_ versions of compile and compile_and_run
    380 expect an extra argument on the front of the list: the
    381 name of the top module in the program to be compiled
    382 (usually this will be 'Main').
    386 = Sample output files =
    388 Normally, the sample `stdout` and `stderr` for a test T go in the
    389 files `T.stdout` and `T.stderr` respectively.  However, sometimes a
    390 test may generate different output depending on the platform,
    391 compiler, compiler version, or word-size.  For this reason the test
    392 driver looks for sample output files using this pattern:
    394 {{{
    395  T.stdout[-<compiler>][-<version>][-ws-<wordsize>][-<platform>]
    396 }}}
    398 Any combination of the optional extensions may be given, but they must
    399 be in the order specified.  The most specific output file that matches
    400 the current configuration will be selected; for example if the
    401 platform is `i386-unknown-mingw32` then `T.stderr-i386-unknown-mingw32`
    402 will be picked in preference to `T.stderr`.
    404 Another common example is to give different sample output for an older
    405 compiler version.  For example, the sample `stderr` for GHC 6.8.x would go in the file
    406 `T.stderr-ghc-6.8`.
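The lookup described above can be sketched roughly as follows. This is not the driver's actual code: the function names are hypothetical, and "most specific" is assumed here to mean "uses the most optional extensions", with ties broken in favour of the earlier fields of the pattern.

```python
import os

def candidate_suffixes(compiler, version, wordsize, platform):
    """Yield suffixes from most to least specific, in the documented field order."""
    parts = ["-" + compiler, "-" + version, "-ws-" + str(wordsize), "-" + platform]
    n = len(parts)
    # Each bitmask selects a subset of the optional parts; part 0 is the top bit.
    # Assumption: more parts means more specific; ties prefer earlier fields.
    masks = sorted(range(2 ** n), key=lambda m: (-bin(m).count("1"), -m))
    for m in masks:
        yield "".join(p for i, p in enumerate(parts) if m & (1 << (n - 1 - i)))

def find_sample(base, compiler, version, wordsize, platform, exists=os.path.exists):
    """Return the first existing sample file for base (e.g. 'T.stderr'), or None."""
    for suffix in candidate_suffixes(compiler, version, wordsize, platform):
        if exists(base + suffix):
            return base + suffix
    return None
```

With only `T.stderr` and `T.stderr-i386-unknown-mingw32` present, `find_sample('T.stderr', 'ghc', '6.8', 32, 'i386-unknown-mingw32')` would return the platform-specific file, matching the example in the text.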
    408 = The details =
    410 The test suite driver is just a set of Python scripts, as are all of
    411 the .T files in the test suite.  The driver (driver/ first
    412 searches for all the .T files it can find, and then proceeds to
    413 execute each one, keeping track of the number of tests run, and
    414 which ones succeeded and failed.
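As a rough illustration of the search-and-execute step, a minimal sketch might look like this. The names `find_test_files` and `run_test_files` are hypothetical and are not taken from the real driver; the sketch only assumes that .T files are plain Python executed with some shared set of globals.

```python
import os

def find_test_files(rootdir):
    """Return every file below rootdir whose name ends in .T, in sorted order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for name in filenames:
            if name.endswith(".T"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)

def run_test_files(paths, namespace):
    """Execute each .T file as Python code, using namespace as its globals."""
    for path in paths:
        with open(path) as f:
            source = f.read()
        # Whatever the namespace binds (e.g. a test() function) is visible to the file.
        exec(compile(source, path, "exec"), namespace)
```

Passing a namespace that binds `test` to a recording function is one way to see which tests a .T file declares without running them.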
    416 The script takes several options:
    418   --config <file>
    420        <file> is just a file containing Python code which is
    421        executed.   The purpose of this option is so that a file
    422        containing settings for the configuration options can
    423        be specified on the command line.  Multiple --config options
    424        may be given.
    426   --rootdir <dir>
    428        <dir> is the directory below which to search for .T files
    429        to run.
    431   --output-summary <file>
    433        In addition to dumping the test summary to stdout, also
    434        put it in <file>.  (stdout also gets a lot of other output
    435        when running a series of tests, so redirecting it isn't 
    436        always the right thing).
    438   --only <test>
    440        Only run tests named <test> (multiple --only options can
    441        be given).  Useful for running a single test from a .T file
    442        containing multiple tests.
    444   -e <stmt>
    446        executes the Python statement <stmt> before running any tests.
    447        The main purpose of this option is to allow certain
    448        configuration options to be tweaked from the command line; for
    449        example, the build system adds '-e config.accept=1' to the
    450        command line when 'make accept' is invoked.
    452 Most of the code for running tests is located in driver/
    453 Take a look.
    455 There is a single Python class (TestConfig) containing the global
    456 configuration for the test suite.  It contains information such as the
    457 kind of compiler being used, which flags to give it, which platform
    458 we're running on, and so on.  The idea is that each platform and
    459 compiler would have its own file containing assignments for elements
    460 of the configuration, which are sourced by passing the appropriate
    461 --config options to the test driver.  For example, the GHC
    462 configuration is contained in the file config/ghc.
    464 A .T file can obviously contain arbitrary Python code, but the general
    465 idea is that it contains a sequence of calls to the function test(),
    466 which resides in  As described above, test() takes four
    467 arguments:
    469       test(<name>, <opt-fn>, <test-fn>, <args>)
    471 The function <opt-fn> is allowed to be any Python callable object,
    472 which takes a single argument of type TestOptions.  TestOptions is a
    473 class containing options which affect the way that the current test is
    474 run: whether to skip it, whether to expect failure, extra options to
    475 pass to the compiler, etc. (see for the definition of the
    476 TestOptions class).  The idea is that the <opt-fn> function modifies
    477 the TestOptions object that it is passed.  For example, to expect
    478 failure for a test, we might do this in the .T file:
    479 {{{
    480    def fn(opts):
    481       opts.expect = 'fail'
    483    test('test001', fn, compile, [''])
    484 }}}
    485 so when fn is called, it sets the instance variable "expect" in the
    486 instance of TestOptions passed as an argument, to the value 'fail'.
    487 This indicates to the test driver that the current test is expected to
    488 fail.
    490 Some of these functions, such as the one above, are common, so rather
    491 than forcing every .T file to redefine them, we provide canned
    492 versions.  For example, the provided function expect_fail does the
    493 same as fn in the example above.  See for all the canned
    494 functions we provide for <opt-fn>.
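To make the mechanism concrete, here is a small sketch of how canned `<opt-fn>` functions and lists of them could work. The `TestOptions` class below is a hypothetical stand-in: only the `expect` attribute comes from the text above, while the `omit_ways` and `exit_code` attributes and the `apply_setup` helper are assumptions made for illustration.

```python
class TestOptions:
    """Hypothetical stand-in for the driver's per-test options class."""
    def __init__(self):
        self.expect = 'pass'   # attribute named in the text above
        self.omit_ways = []    # assumed attribute name
        self.exit_code = 0     # assumed attribute name

def normal(opts):
    """Leave every option at its default."""
    pass

def expect_fail(opts):
    """Mark the test as an expected failure."""
    opts.expect = 'fail'

def omit_ways(ways):
    """Build an opt-fn that records the ways to skip."""
    def setup(opts):
        opts.omit_ways = ways
    return setup

def exit_code(n):
    """Build an opt-fn that records the expected exit code."""
    def setup(opts):
        opts.exit_code = n
    return setup

def apply_setup(setup, opts):
    """Apply one opt-fn, or each opt-fn in a list, to opts and return opts."""
    if isinstance(setup, list):
        for fn in setup:
            apply_setup(fn, opts)
    else:
        setup(opts)
    return opts
```

For instance, `apply_setup([omit_ways(['opt']), exit_code(3)], TestOptions())` would yield an options object that skips way 'opt' and expects exit code 3.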
    496 The argument <test-fn> is a function which performs the test.  It
    497 takes three or more arguments:
    499       <test-fn>( <name>, <way>, ... )
    501 where <name> is the name of the test, <way> is the way in which it is
    502 to be run (eg. opt, optasm, prof, etc.), and the rest of the arguments
    503 are constructed from the list <args> in the original call to test().
    504 The following <test-fn>s are provided at the moment:
    506            compile
    507            compile_fail
    508            compile_and_run
    509            multimod_compile
    510            multimod_compile_fail
    511            multimod_compile_and_run
    512            run_command
    513            run_command_ignore_output
    514            ghci_script
    516 and obviously others can be defined.  The function should return
    517 either 'pass' or 'fail' indicating that the test passed or failed
    518 respectively.
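The calling convention for `<test-fn>` can be sketched as below. The `run_test` helper and the `always_pass` function are hypothetical; the sketch only assumes what the text states, namely that a test function receives the name, the way, and the unpacked `<args>`, and returns 'pass' or 'fail'.

```python
def always_pass(name, way, extra_hc_opts):
    """Hypothetical test function that ignores its inputs and passes."""
    return 'pass'

def run_test(name, ways, test_fn, args, expect='pass'):
    """Call test_fn once per way; map each way to whether it behaved as expected."""
    results = {}
    for way in ways:
        outcome = test_fn(name, way, *args)
        if outcome not in ('pass', 'fail'):
            raise ValueError("test function returned %r" % (outcome,))
        results[way] = (outcome == expect)
    return results
```

Calling `run_test('tc001', ['normal', 'optasm'], always_pass, [''])` would report both ways as behaving as expected, while passing `expect='fail'` would flag both as unexpected passes.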
    520 = Problems running the testsuite =
    522  1. If the test suite fails mysteriously, make sure that the {{{timeout}}} utility is working properly. This Haskell utility is compiled with the stage 1 compiler and invoked by the Python driver, which does not print a nice error report if the utility fails. This can happen if, for example, the compiler produces bogus binaries. A workaround is to compile {{{timeout}}} with a stable {{{ghc}}}.
    524 = The testsuite and branches =
     33== The testsuite and version control branches ==
    52635It is not clear what to do with the testsuite when branching a compiler; should the testsuite also be branched?
    54756test(tc5, namebase_if_compiler_lt('ghc','6.9', 'tc5-6.8'), ...)