Opened 6 years ago

Last modified 4 months ago

#8096 new task

Add fudge-factor for performance tests run on non-validate builds

Reported by: ezyang
Owned by:
Priority: normal
Milestone:
Component: Build System (make)
Version: 7.7
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #9315
Differential Rev(s):
Wiki Page:

Description

Since I'm not going to get around to this immediately, Trac'ifying for posterity:

These tests have been doing better than expected in the nightlies for some while.

 Unexpected failures:
    perf/compiler  T3064 [stat too good] (normal)
    perf/compiler  T3294 [stat too good] (normal)
    perf/compiler  T5642 [stat too good] (normal)
    perf/haddock   haddock.Cabal [stat too good] (normal)
    perf/haddock   haddock.base [stat too good] (normal)

Unfortunately, fixing them is not a simple matter of shifting the ranges up, since the tests only exceed expectations on a perf build; on a normal build such as 'quick', they all pass normally.

I could bump up the upper bounds so that the builder stops bleating about them; perhaps we could do something more complicated where the expected performance depends on what level of optimization GHC was built with (but I don't know how to implement this).


The problem with just widening the bounds to cover two different types of build is that it increases the chance that performance changes won't actually be noticed by the person responsible.

Having different bounds for different build configurations is a pain, because (a) the testsuite has to work out which set of bounds to use, and (b) you now have even more wobbly values to keep up-to-date.

I think perhaps the best thing would be to add some sort of (per-test?) fudge factor for non-validate builds. That way validate will still find performance regressions, like it does today, but other builds are less likely to give false positives. (Igloo)
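
To make the fudge-factor idea concrete, here is a minimal sketch in Python of how such a check might look. It is not the testsuite driver's actual code: the names `StatExpectation`, `effective_tolerance`, `check_stat`, and the `is_validate_build` flag are all hypothetical, and the default factor of 2.0 is an arbitrary placeholder. The only thing taken from the discussion above is the behaviour: validate builds keep the strict window, other builds get a wider one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatExpectation:
    """Expected value of one metric plus the allowed deviation in percent.

    Hypothetical type, used only for this illustration.
    """
    expected: float
    tolerance_pct: float


def effective_tolerance(exp, is_validate_build, fudge_factor=2.0):
    """Return the tolerance to apply for this build.

    Validate builds keep the strict per-test tolerance; any other build
    has it multiplied by fudge_factor (an assumed, per-test tunable).
    """
    if is_validate_build:
        return exp.tolerance_pct
    return exp.tolerance_pct * fudge_factor


def check_stat(actual, exp, is_validate_build, fudge_factor=2.0):
    """Classify a measured value against the (possibly widened) window."""
    tol = effective_tolerance(exp, is_validate_build, fudge_factor)
    lower = exp.expected * (1 - tol / 100)
    upper = exp.expected * (1 + tol / 100)
    if actual < lower:
        return "stat too good"
    if actual > upper:
        return "stat not good enough"
    return "pass"


if __name__ == "__main__":
    exp = StatExpectation(expected=1000, tolerance_pct=5)
    # The same measurement, 7% below the expected value:
    print(check_stat(930, exp, is_validate_build=True))   # stat too good
    print(check_stat(930, exp, is_validate_build=False))  # pass
```

With a single set of expected values, the strict window still applies wherever `is_validate_build` is true, so only one set of "wobbly values" has to be maintained; the cost is that non-validate builds become less sensitive to real changes.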

Change History (2)

comment:1 Changed 3 years ago by thomie

See ticket:9315#comment:24 for a proposal to only run performance tests when validating with release settings (-O2).

comment:2 Changed 4 months ago by bgamari

Component: Build System → Build System (make)

The new Hadrian build system has been merged. Relabeling the tickets concerning the legacy make build system to prevent confusion.
