Opened 11 months ago

Closed 7 months ago

Last modified 7 months ago

#14890 closed task (fixed)

Make Linux slow validate green

Reported by: bgamari Owned by: alpmestan
Priority: normal Milestone:
Component: Compiler Version: 8.2.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): D4546, D4636, D4712
Wiki Page:

Description

Now that we will soon have a nightly slow validation, let's finally get it passing.

Change History (17)

comment:1 Changed 11 months ago by bgamari

Owner: set to alpmestan

comment:2 Changed 10 months ago by alpmestan

I have a preliminary summary in a gist here. I've started taking action on the failures I've looked at so far (unexpected passes and failures; I haven't looked at the stats failures yet). I'll update the gist with links to tickets etc. as I address each unexpected test result.

comment:3 Changed 10 months ago by alpmestan

I went ahead and took action for _all_ the unexpected passes and failures. See this commit in my GHC fork on github. I'm just going to ask for Simon PJ's input on 2-3 tests and will then proceed by pushing a diff on phabricator. I can then look into the stats failures if we so desire.

comment:4 Changed 10 months ago by alpmestan

Differential Rev(s): D4546

First batch of test expectation changes at D4546.

comment:5 Changed 9 months ago by Ben Gamari <ben@…>

In d9d80151/ghc:

testsuite: Fix `./validate --slow`

This fixes all unexpected passes and unexpected failures from a
`./validate --slow` run I did last week. I commented on many
tickets and created a few more as I was going through the failing
tests. A summary of the entire process is available at:

  https://gist.github.com/alpmestan/c371840968f086c8dc5b56af8325f0a9

This is part of an attempt to have `./validate --slow` pass,
tracked in #14890. Another patch will be necessary for the unexpected
stats failures.

Test Plan: ./validate --slow (not green yet)

Reviewers: bgamari, simonmar

Subscribers: thomie, carter

Differential Revision: https://phabricator.haskell.org/D4546

comment:6 Changed 8 months ago by Ben Gamari <ben@…>

In ca3d3039/ghc:

Fix another batch of `./validate --slow` failures

A rather detailed summary can be found at:

    https://gist.github.com/alpmestan/be82b47bb88b7dc9ff84105af9b1bb82

This doesn't fix all expectation mismatches yet, but we're down to about
20 mismatches with my previous patch and this one, as opposed to ~150
when I got started.

Test Plan: ./validate --slow

Reviewers: bgamari, erikd, simonmar

Reviewed By: simonmar

Subscribers: thomie, carter

GHC Trac Issues: #14890

Differential Revision: https://phabricator.haskell.org/D4636

comment:7 Changed 8 months ago by alpmestan

Differential Rev(s): D4546 → D4546, D4636, D4712
Status: new → patch

I managed to get a green validate on circleci when running all the tests that were failing after my last patch (D4636). The patch that gets us there is now up on phab as D4712.

comment:8 Changed 8 months ago by Ben Gamari <ben@…>

In c4219d9/ghc:

Another batch of './validation --slow' tweaks

This finally gets us to a green ./validate --slow on linux for a ghc
checkout from the beginning of this week, see

  https://circleci.com/gh/ghc/ghc/4739

This is hopefully the final (or second to final) patch to
address #14890.

Test Plan: ./validate --slow

Reviewers: bgamari, hvr, simonmar

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #14890

Differential Revision: https://phabricator.haskell.org/D4712

comment:9 Changed 8 months ago by alpmestan

Ben, I suppose we can now try enabling the slow validate scenario in our Circle CI config? We want to run it on a nightly basis right? Or per-commit?

For reference, a full slow validate appears to take a little over 2 hours. We have to weigh that against the usefulness of having per-commit (or per-merged-patch, I suppose) slow validate status. If I remember correctly, we were leaning towards nightly.

comment:10 Changed 8 months ago by alpmestan

PR for enabling slow validate on linux in the Circle CI script on github.

comment:11 Changed 8 months ago by bgamari

> We want to run it on a nightly basis right? Or per-commit?

Right, I think nightly is all we have the capacity for at the moment.

comment:12 Changed 7 months ago by alpmestan

The commit has hit the master branch on github (link), so we should get slow validate reports for master starting tomorrow morning. If we have new failures, I'll address them ASAP and then close this ticket (woohoo).

comment:13 Changed 7 months ago by alpmestan

Resolution: fixed
Status: patch → closed

We've seen a green slow validate! We will be tracking/reporting/fixing any failure as appropriate but I'm closing this ticket as we will want to track individual problems separately.

comment:14 Changed 7 months ago by osa1

What settings/command are you using to test this? I just validated on my x86_64 Linux laptop, and got 38 failures. Here's the summary file:

Unexpected results from:
TEST="CPUTime001 T10962 T12087 T12733 T13168 T13350 T14052 T14304 T14904b T14936 T2851 T3007 T4334 T7175 T7919 bkpcabal01 bkpcabal02 bkpcabal03 bkpcabal04 bkpcabal05 bkpcabal06 bkpcabal07 bug1465 cabal01 cabal03 cabal04 cabal05 cabal06 cabal08 cabal09 deriving-via-compile gadt11 haddock.Cabal haddock.compiler hpc_fork recomp007 safePkg01 space_leak_001 tcfail155 tcfail176"

SUMMARY for test run started at Sun Jun 17 19:56:38 2018 +03
 0:56:57 spent to go through
    6441 total tests, which gave rise to
   28995 test cases, of which
    5703 were skipped

     117 had missing libraries
   22862 expected passes
     262 expected failures

       0 caused framework failures
       0 caused framework warnings
       0 unexpected passes
      38 unexpected failures
      13 unexpected stat failures

Unexpected failures:
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal02/bkpcabal02.run          bkpcabal02 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/T14304/T14304.run                  T14304 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal03/bkpcabal03.run          bkpcabal03 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal05/bkpcabal05.run          bkpcabal05 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal04/bkpcabal04.run          bkpcabal04 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal01/bkpcabal01.run          bkpcabal01 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal07/bkpcabal07.run          bkpcabal07 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./backpack/cabal/bkpcabal06/bkpcabal06.run          bkpcabal06 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal04/cabal04.run                         cabal04 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/T12733/T12733.run                           T12733 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal03/cabal03.run                         cabal03 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal01/cabal01.run                         cabal01 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal09/cabal09.run                         cabal09 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal08/cabal08.run                         cabal08 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal05/cabal05.run                         cabal05 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal06/cabal06.run                         cabal06 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_compile/deriving-via-compile.run  deriving-via-compile [exit code non-0] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_compile/deriving-via-compile.run  deriving-via-compile [exit code non-0] (hpc)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_compile/deriving-via-compile.run  deriving-via-compile [exit code non-0] (optasm)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_compile/deriving-via-compile.run  deriving-via-compile [exit code non-0] (profasm)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_compile/deriving-via-compile.run  deriving-via-compile [exit code non-0] (optllvm)
   /tmp/ghctest-y99057f2/test   spaces/./deriving/should_fail/T2851.run                    T2851 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./driver/T3007/T3007.run                            T3007 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./driver/recomp007/recomp007.run                    recomp007 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./gadt/gadt11.run                                   gadt11 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./gadt/T12087.run                                   T12087 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./numeric/should_run/T10962.run                     T10962 [bad stdout] (optllvm)
   /tmp/ghctest-y99057f2/test   spaces/./patsyn/should_compile/T13350/T13350.run           T13350 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./safeHaskell/check/pkg01/safePkg01.run             safePkg01 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/T13168/T13168.run                       T13168 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/bug1465/bug1465.run                     bug1465 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./rts/T7919.run                                     T7919 [bad exit code] (ghci)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/should_fail/tcfail155.run               tcfail155 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/should_fail/tcfail176.run               tcfail176 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/should_fail/T7175.run                   T7175 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./typecheck/should_fail/T14904b.run                 T14904b [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/../../libraries/base/tests/CPUTime001.run           CPUTime001 [bad stdout] (threaded2)
   /tmp/ghctest-y99057f2/test   spaces/../../libraries/hpc/tests/fork/hpc_fork.run         hpc_fork [bad heap profile] (profasm)

Unexpected stat failures:
   /tmp/ghctest-y99057f2/test   spaces/./perf/haddock/haddock.Cabal.run       haddock.Cabal [stat not good enough] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./perf/haddock/haddock.compiler.run    haddock.compiler [stat not good enough] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14936.run           T14936 [stat not good enough] (hpc)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14052.run           T14052 [stat not good enough] (ghci)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14936.run           T14936 [stat not good enough] (profasm)
   /tmp/ghctest-y99057f2/test   spaces/./perf/space_leaks/space_leak_001.run  space_leak_001 [stat too good] (hpc)
   /tmp/ghctest-y99057f2/test   spaces/./perf/space_leaks/space_leak_001.run  space_leak_001 [stat too good] (optasm)
   /tmp/ghctest-y99057f2/test   spaces/./perf/space_leaks/T4334.run           T4334 [stat not good enough] (threaded2)
   /tmp/ghctest-y99057f2/test   spaces/./perf/space_leaks/space_leak_001.run  space_leak_001 [stat too good] (dyn)
   /tmp/ghctest-y99057f2/test   spaces/./perf/space_leaks/space_leak_001.run  space_leak_001 [stat too good] (optllvm)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14936.run           T14936 [stat not good enough] (threaded1)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14936.run           T14936 [stat not good enough] (threaded2)
   /tmp/ghctest-y99057f2/test   spaces/./perf/should_run/T14936.run           T14936 [stat not good enough] (profthreaded)
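
Editorial aside: a summary this long is easier to triage once it is grouped by failure reason and by way. The following is a minimal Python sketch, not part of the GHC testsuite driver. The function name `parse_failures` and the regular expression are assumptions made for illustration; the four sample lines are taken from the summary above with the column padding trimmed.

```python
import re
from collections import Counter

# Tail of a summary line: "<test name> [<reason>] (<way>)".
# The leading path is ignored because it can contain spaces.
LINE_RE = re.compile(r"(\S+)\s+\[([^\]]+)\]\s+\((\w+)\)\s*$")


def parse_failures(text):
    """Return (test, reason, way) triples for every line that matches."""
    triples = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            triples.append(m.groups())
    return triples


# Four lines from the summary above (padding trimmed).
sample = """\
   /tmp/ghctest-y99057f2/test   spaces/./cabal/cabal04/cabal04.run   cabal04 [bad stderr] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./gadt/gadt11.run   gadt11 [stderr mismatch] (normal)
   /tmp/ghctest-y99057f2/test   spaces/./rts/T7919.run   T7919 [bad exit code] (ghci)
   /tmp/ghctest-y99057f2/test   spaces/./numeric/should_run/T10962.run   T10962 [bad stdout] (optllvm)
"""

failures = parse_failures(sample)
by_reason = Counter(reason for _, reason, _ in failures)
by_way = Counter(way for _, _, way in failures)

print(by_reason.most_common())
print(by_way.most_common())
```

Feeding the whole "Unexpected failures" section into `parse_failures` instead of the sample should show how many failures fall under each reason and each way, which may help separate environment-related failures from the rest.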

comment:15 Changed 7 months ago by requisitebits

I also see failures on my x86_64 Linux desktop, but maybe my verification strategy is missing something. See https://gist.github.com/jmitchell/97ec81a114cb92a3323abb4ec3e020b6 for the script I ran. Note that the git revision I chose comes from comment:12.

Here are the results:

Unexpected results from:
TEST="T10294 T10294a T10420 T11462 T11525 T12567a annrun01 frontend01 ghci024 plugin-recomp-flags plugin-recomp-impure plugin-recomp-pure plugins01 plugins05 plugins06 plugins07 plugins08 plugins09 plugins11 plugins12 plugins13 plugins14 plugins15 process001 process002"

SUMMARY for test run started at Mon Jun 25 03:50:20 2018 UTC
 1:30:34 spent to go through
    6419 total tests, which gave rise to
   19948 test cases, of which
   13291 were skipped

      20 had missing libraries
    6458 expected passes
     154 expected failures

       0 caused framework failures
       0 caused framework warnings
       0 unexpected passes
      25 unexpected failures
       0 unexpected stat failures

Unexpected failures:
   annotations/should_run/annrun01.run           annrun01 [exit code non-0] (normal)
   ghci/scripts/ghci024.run                      ghci024 [bad stderr] (normal)
   plugins/plugins01.run                         plugins01 [bad exit code] (normal)
   plugins/plugins05.run                         plugins05 [exit code non-0] (dyn)
   plugins/plugins06.run                         plugins06 [exit code non-0] (dyn)
   plugins/plugins07.run                         plugins07 [bad exit code] (normal)
   plugins/plugins08.run                         plugins08 [bad exit code] (normal)
   plugins/plugins09.run                         plugins09 [bad exit code] (normal)
   plugins/plugins11.run                         plugins11 [bad exit code] (normal)
   plugins/plugins12.run                         plugins12 [bad exit code] (normal)
   plugins/plugins13.run                         plugins13 [bad exit code] (normal)
   plugins/plugins14.run                         plugins14 [bad exit code] (normal)
   plugins/plugins15.run                         plugins15 [bad exit code] (normal)
   plugins/T10420.run                            T10420 [bad exit code] (normal)
   plugins/T10294.run                            T10294 [bad exit code] (normal)
   plugins/T10294a.run                           T10294a [bad exit code] (normal)
   plugins/frontend01.run                        frontend01 [bad exit code] (normal)
   plugins/T12567a.run                           T12567a [bad exit code] (normal)
   plugins/plugin-recomp-pure.run                plugin-recomp-pure [bad exit code] (normal)
   plugins/plugin-recomp-impure.run              plugin-recomp-impure [bad exit code] (normal)
   plugins/plugin-recomp-flags.run               plugin-recomp-flags [bad exit code] (normal)
   typecheck/should_compile/T11462.run           T11462 [exit code non-0] (normal)
   typecheck/should_compile/T11525.run           T11525 [exit code non-0] (normal)
   ../../libraries/process/tests/process001.run  process001 [bad exit code] (normal)
   ../../libraries/process/tests/process002.run  process002 [bad exit code] (normal)

make[1]: *** [../mk/test.mk:329: test] Error 1
make[1]: Leaving directory '/home/jake/src/ghc/testsuite/tests'
make: *** [Makefile:223: test] Error 2

I'm running the same script again using REV=1c2c2d3dfd4c36884b22163872feb87122b4528d (recent master commit) instead.
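
Editorial aside: one quick sanity check on a pasted summary is whether the per-category counts add up to the reported number of test cases. The sketch below is an illustration only; `parse_summary` is a hypothetical helper, and treating the listed categories as a partition of the test cases is an assumption that happens to hold for the numbers in this run.

```python
import re


def parse_summary(text):
    """Map each summary label to its count, e.g. 'were skipped' -> 13291."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


# Counts copied from the summary in this comment.
summary = """\
    6419 total tests, which gave rise to
   19948 test cases, of which
   13291 were skipped

      20 had missing libraries
    6458 expected passes
     154 expected failures

       0 caused framework failures
       0 caused framework warnings
       0 unexpected passes
      25 unexpected failures
       0 unexpected stat failures
"""

counts = parse_summary(summary)
total_cases = counts.pop("test cases, of which")
counts.pop("total tests, which gave rise to")  # not a test-case category
accounted = sum(counts.values())

print(total_cases, accounted, total_cases == accounted)
```

For this run the remaining categories sum to 19948, matching the reported number of test cases, so no lines appear to have been lost when the summary was pasted.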

comment:16 Changed 7 months ago by requisitebits

Results when using the same script from comment:15, except with REV=1c2c2d3dfd4c36884b22163872feb87122b4528d:

Unexpected results from:
TEST="T10294 T10294a T10420 T11462 T11525 T12567a annrun01 frontend01 ghci024 plugin-recomp-flags plugin-recomp-impure plugin-recomp-pure plugins01 plugins05 plugins06 plugins07 plugins08 plugins09 plugins11 plugins12 plugins13 plugins14 plugins15"

SUMMARY for test run started at Mon Jun 25 19:59:07 2018 UTC
 1:30:40 spent to go through
    6454 total tests, which gave rise to
   20029 test cases, of which
   13338 were skipped

      20 had missing libraries
    6495 expected passes
     153 expected failures

       0 caused framework failures
       0 caused framework warnings
       0 unexpected passes
      23 unexpected failures
       0 unexpected stat failures

Unexpected failures:
   annotations/should_run/annrun01.run  annrun01 [exit code non-0] (normal)
   ghci/scripts/ghci024.run             ghci024 [bad stderr] (normal)
   plugins/plugins01.run                plugins01 [bad exit code] (normal)
   plugins/plugins05.run                plugins05 [exit code non-0] (dyn)
   plugins/plugins06.run                plugins06 [exit code non-0] (dyn)
   plugins/plugins07.run                plugins07 [bad exit code] (normal)
   plugins/plugins08.run                plugins08 [bad exit code] (normal)
   plugins/plugins09.run                plugins09 [bad exit code] (normal)
   plugins/plugins11.run                plugins11 [bad exit code] (normal)
   plugins/plugins12.run                plugins12 [bad exit code] (normal)
   plugins/plugins13.run                plugins13 [bad exit code] (normal)
   plugins/plugins14.run                plugins14 [bad exit code] (normal)
   plugins/plugins15.run                plugins15 [bad exit code] (normal)
   plugins/T10420.run                   T10420 [bad exit code] (normal)
   plugins/T10294.run                   T10294 [bad exit code] (normal)
   plugins/T10294a.run                  T10294a [bad exit code] (normal)
   plugins/frontend01.run               frontend01 [bad exit code] (normal)
   plugins/T12567a.run                  T12567a [bad exit code] (normal)
   plugins/plugin-recomp-pure.run       plugin-recomp-pure [bad exit code] (normal)
   plugins/plugin-recomp-impure.run     plugin-recomp-impure [bad exit code] (normal)
   plugins/plugin-recomp-flags.run      plugin-recomp-flags [bad exit code] (normal)
   typecheck/should_compile/T11462.run  T11462 [exit code non-0] (normal)
   typecheck/should_compile/T11525.run  T11525 [exit code non-0] (normal)

make[1]: *** [../mk/test.mk:329: test] Error 1
make[1]: Leaving directory '/home/jake/src/ghc/testsuite/tests'
make: *** [Makefile:223: test] Error 2

comment:17 Changed 7 months ago by bgamari

It's entirely possible that there is some environment dependence here. It certainly passes somewhat reliably on CircleCI (save a pesky intermittent failure of posix002 which I haven't been able to reproduce locally). See the CircleCI configuration in the tree (.circleci/config.yml) for the test configuration.