Opened 4 years ago

Closed 4 years ago

#9961 closed bug (fixed)

compile-time performance regression compiling genprimcode

Reported by: hvr Owned by: simonpj
Priority: highest Milestone: 7.10.1
Component: Compiler Version: 7.10.1-rc1
Keywords: Cc: scpmw@…, simonmar, thoughtpolice
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #10007 Differential Rev(s):
Wiki Page:

Description (last modified by hvr)

compiling utils/genprimopcode (where compiling the Alex-generated Lexer takes up the most amount of time) with GHC 7.8.4:

$ time /opt/ghc/7.8.4/bin/ghc  -Rghc-timing --make -O Main.hs 
[1 of 5] Compiling ParserM          ( ParserM.hs, ParserM.o )
[2 of 5] Compiling Lexer            ( Lexer.hs, Lexer.o )
[3 of 5] Compiling Syntax           ( Syntax.hs, Syntax.o )
[4 of 5] Compiling Parser           ( Parser.hs, Parser.o )
[5 of 5] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
<<ghc: 20742311240 bytes, 2092 GCs, 60174814/140258264 avg/max bytes residency (33 samples), 332M in use, 0.00 INIT (0.00 elapsed), 7.84 MUT (8.35 elapsed), 6.93 GC (6.93 elapsed) :ghc>>

real	0m15.331s
user	0m15.173s
sys	0m0.172s

(btw, is it ok that GC takes up so much?)

the same with a GHC 7.10 snapshot takes *alot* longer:

$ time /opt/ghc/7.10.1/bin/ghc  -Rghc-timing --make -O Main.hs 
[1 of 5] Compiling ParserM          ( ParserM.hs, ParserM.o )
[2 of 5] Compiling Lexer            ( Lexer.hs, Lexer.o )
[3 of 5] Compiling Syntax           ( Syntax.hs, Syntax.o )
[4 of 5] Compiling Parser           ( Parser.hs, Parser.o )
[5 of 5] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
<<ghc: 147681534568 bytes, 37824 GCs, 203338614/375314176 avg/max bytes residency (538 samples), 876M in use, 0.001 INIT (0.001 elapsed), 81.374 MUT (81.888 elapsed), 235.240 GC (235.136 elapsed) :ghc>>

real	5m17.061s
user	5m16.702s
sys	0m0.499s

...and here the GC/MUT ratio is even more off.

Attachments (2)

ghc-stage2.prof (41.8 KB) - added by hvr 4 years ago.
+RTS -p output
fix-9961.patch (11.2 KB) - added by scpmw 4 years ago.
Proposed fix

Download all attachments as: .zip

Change History (23)

comment:1 Changed 4 years ago by hvr

Description: modified (diff)

comment:2 Changed 4 years ago by simonpj

Can anyone build a profiled GHC and see where the time is going?

comment:3 Changed 4 years ago by simonpj

I can't reproduce this with HEAD. I used this command line, from the root directory, which I took from a build log

ghc -hisuf hi -osuf  o -hcsuf hc -static  -H32m -O -Werror -Wall -H64m  -i -iutils/genprimopcode/. -iutils/genprimopcode/dist/build -iutils/genprimopcode/dist/build/autogen -Iutils/genprimopcode/dist/build -Iutils/genprimopcode/dist/build/autogen       -package array  -no-user-package-db -rtsopts      -odir utils/genprimopcode/dist/build -hidir utils/genprimopcode/dist/build -stubdir utils/genprimopcode/dist/build   --make utils/genprimopcode/Main.hs -fforce-recomp +RTS -s

With HEAD I get

   4,189,727,416 bytes allocated in the heap
     525,871,120 bytes copied during GC
      43,253,816 bytes maximum residency (12 sample(s))
       1,758,368 bytes maximum slop
             116 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0       600 colls,     0 par    1.824s   1.823s     0.0030s    0.0271s
  Gen  1        12 colls,     0 par    0.811s   0.811s     0.0676s    0.1351s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.002s  (  0.001s elapsed)
  MUT     time    6.997s  (  7.946s elapsed)
  GC      time    2.635s  (  2.635s elapsed)
  EXIT    time    0.038s  (  0.037s elapsed)
  Total   time    9.711s  ( 10.619s elapsed)

With ghc 7.8 I get

   3,874,936,096 bytes allocated in the heap
     416,943,272 bytes copied during GC
      44,518,152 bytes maximum residency (10 sample(s))
       1,549,168 bytes maximum slop
             115 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       396 colls,     0 par    1.21s    1.25s     0.0032s    0.0207s
  Gen  1        10 colls,     0 par    0.57s    0.65s     0.0654s    0.1421s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    5.63s  (  6.59s elapsed)
  GC      time    1.78s  (  1.91s elapsed)
  EXIT    time    0.03s  (  0.04s elapsed)
  Total   time    7.46s  (  8.54s elapsed)

They may not be quite comparable (e.g. my HEAD libraries are built with -O, but I think my installed GHC probably has libraries built with -O2). But HEAD is not way out of line.

I suppose I can now try building the 7.10 branch.

Meanwhile, would someone try with -dshow-passes (which displays intermediate sizes)?

Simon

comment:4 Changed 4 years ago by simonpj

I tried with the 7.10 branch, and got

   4,187,629,104 bytes allocated in the heap
     476,423,296 bytes copied during GC
      42,531,768 bytes maximum residency (11 sample(s))
       1,561,840 bytes maximum slop
              96 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0       619 colls,     0 par    1.937s   1.936s     0.0031s    0.0290s
  Gen  1        11 colls,     0 par    0.710s   0.710s     0.0645s    0.1338s

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.002s  (  0.001s elapsed)
  MUT     time    6.782s  (  7.641s elapsed)
  GC      time    2.647s  (  2.646s elapsed)
  EXIT    time    0.035s  (  0.034s elapsed)
  Total   time    9.497s  ( 10.323s elapsed)

So I can't reproduce the slow compile time. Alas.

comment:5 Changed 4 years ago by hvr

That's weird... I just tried with today's GHC HEAD (at 696f2cfdea4ab87cc457b141323bb25ab4afe795), and a BuildFlavour="quick" build. The exact steps I took now were:

cd utils/genprimopcode/

rm *.o *.hi

alex Lexer.x

../inplace/bin/ghc-stage2 -Rghc-timing -O -c ParserM.hs
# <<ghc: 906859816 bytes, 319 GCs, 4611760/14141808 avg/max bytes residency (8 samples), 31M in use, 0.001 INIT (0.001 elapsed), 0.822 MUT (0.833 elapsed), 0.225 GC (0.225 elapsed) :ghc>>

../inplace/bin/ghc-stage2 -Rghc-timing -O -c Lexer.hs 
# ...runs for over 15 minutes...

So I can actually reproduce this with GHC HEAD

comment:6 Changed 4 years ago by hvr

Here's the output of

time /opt/ghc/7.10.1/bin/ghc -O -dshow-passes -c Lexer.hs |& ts -s "%.s"

Each output line is prepended by a timestamp expressed in seconds:

0.000012 Glasgow Haskell Compiler, Version 7.10.0.20141227, stage 2 booted by GHC version 7.8.4
0.000080 Using binary package database: /opt/ghc/7.10.1/lib/ghc-7.10.0.20141227/package.conf.d/package.cache
0.001052 Using binary package database: /home/hvr/.ghc/x86_64-linux-7.10.0.20141227/package.conf.d/package.cache
0.020922 wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-000208e3187fe823e04e0c88352da107
0.020997 wired-in package integer-gmp mapped to integer-gmp-1.0.0.0-bb665344376cc98555dad190d849a337
0.021545 wired-in package base mapped to base-4.8.0.0-7f94b08ebb75c6cfab2660fb04137780
0.021578 wired-in package rts mapped to builtin_rts
0.021650 wired-in package template-haskell mapped to template-haskell-2.10.0.0-ab6ca4770f90f6f44262138da6877395
0.021676 wired-in package ghc mapped to ghc-7.10.0.20141227-217cbe8ba9c57a128065806ea73c39b4
0.021689 wired-in package dph-seq not found.
0.021758 wired-in package dph-par not found.
0.022632 Created temporary directory: /tmp/ghc16466_0
0.022703 *** C pre-processor:
0.028039 *** Checking old interface for Lexer:
0.028103 *** Parser:
0.214401 *** Renamer/typechecker:
1.362793 *** Desugar:
1.841327 Result size of Desugar (after optimization)
1.841402   = {terms: 97,708, types: 6,414, coercions: 7}
1.841418 *** Simplifier:
2.113916 Result size of Simplifier iteration=1
2.113977   = {terms: 92,764, types: 1,443, coercions: 16}
2.369659 Result size of Simplifier
2.369721   = {terms: 92,764, types: 1,443, coercions: 16}
2.369732 *** Specialise:
2.432081 Result size of Specialise
2.432143   = {terms: 92,928, types: 1,649, coercions: 34}
2.432154 *** Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}):
2.432163 Result size of Float out(FOS {Lam = Just 0,
2.432174                               Consts = True,
2.761993                               OverSatApps = False})
2.762059   = {terms: 161,836, types: 37,472, coercions: 34}
2.762070 *** Simplifier:
7.394822 Result size of Simplifier iteration=1
7.394887   = {terms: 233,709, types: 119,483, coercions: 965}
8.202722 Result size of Simplifier iteration=2
8.202785   = {terms: 233,373, types: 119,022, coercions: 774}
8.873386 Result size of Simplifier iteration=3
8.873447   = {terms: 233,369, types: 119,019, coercions: 771}
9.562065 Result size of Simplifier
9.562127   = {terms: 233,369, types: 119,019, coercions: 771}
9.562142 *** Simplifier:
10.462517 Result size of Simplifier iteration=1
10.462582   = {terms: 233,348, types: 118,990, coercions: 771}
11.290064 Result size of Simplifier
11.290130   = {terms: 233,340, types: 118,978, coercions: 771}
11.290146 *** Simplifier:
12.081775 Result size of Simplifier iteration=1
12.081839   = {terms: 233,381, types: 119,015, coercions: 771}
13.042842 Result size of Simplifier iteration=2
13.042906   = {terms: 233,356, types: 118,973, coercions: 771}
13.916546 Result size of Simplifier
13.916613   = {terms: 233,356, types: 118,973, coercions: 771}
13.916629 *** Float inwards:
14.237678 Result size of Float inwards
14.237740   = {terms: 233,356, types: 118,973, coercions: 771}
14.237755 *** Called arity analysis:
14.602005 Result size of Called arity analysis
14.602069   = {terms: 233,356, types: 118,973, coercions: 771}
14.602085 *** Simplifier:
15.475958 Result size of Simplifier iteration=1
15.476022   = {terms: 233,350, types: 118,967, coercions: 771}
16.353738 Result size of Simplifier
16.353807   = {terms: 233,350, types: 118,967, coercions: 771}
16.353824 *** Demand analysis:
17.733968 Result size of Demand analysis
17.734033   = {terms: 233,350, types: 118,967, coercions: 771}
17.734050 *** Worker Wrapper binds:
17.862780 Result size of Worker Wrapper binds
17.862858   = {terms: 233,515, types: 119,216, coercions: 771}
17.862873 *** Simplifier:
18.758890 Result size of Simplifier iteration=1
18.758955   = {terms: 233,463, types: 119,172, coercions: 771}
19.651759 Result size of Simplifier iteration=2
19.651822   = {terms: 233,394, types: 119,034, coercions: 771}
20.558257 Result size of Simplifier
20.558318   = {terms: 233,394, types: 119,034, coercions: 771}
20.558329 *** Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = True}):
20.558337 Result size of Float out(FOS {Lam = Just 0,
20.558347                               Consts = True,
21.488077                               OverSatApps = True})
21.488144   = {terms: 292,556, types: 150,261, coercions: 771}
21.488160 *** Common sub-expression:
314.182413 Result size of Common sub-expression
314.182475   = {terms: 263,668, types: 149,641, coercions: 219}
314.182488 *** Float inwards:
314.655439 Result size of Float inwards
314.655503   = {terms: 263,668, types: 149,641, coercions: 219}
314.655518 *** Simplifier:
315.557057 Result size of Simplifier iteration=1
315.557125   = {terms: 206,235, types: 120,272, coercions: 219}
316.231673 Result size of Simplifier iteration=2
316.231735   = {terms: 206,236, types: 120,271, coercions: 219}
317.062088 Result size of Simplifier
317.062153   = {terms: 206,236, types: 120,271, coercions: 219}
317.062169 *** Tidy Core:
317.468495 Result size of Tidy Core
317.468565   = {terms: 206,104, types: 120,130, coercions: 219}
317.473975 *** CorePrep:
318.294956 Result size of CorePrep
318.295026   = {terms: 206,396, types: 120,491, coercions: 219}
318.295038 *** Stg2Stg:
318.295050 *** CodeGen:
323.746799 *** Assembler:
323.934329 *** Deleting temp files:
323.934396 Warning: deleting non-existent /tmp/ghc16466_0/ghc16466_3.c
323.934873 *** Deleting temp dirs:

real	5m24.015s
user	5m23.707s
sys	0m0.489s

comment:7 Changed 4 years ago by simonpj

great. Do you have the equivalent output with 7.8.

(I'm still puzzled why my compilation is fast and yours is slow.)

comment:8 Changed 4 years ago by hvr

Have you tried my exact same steps?

Btw, tomorrow I can throw a prof-compiled GHC at Lexer.hs, if you tell me exactly what kind of output you're interested in...

Here's the equivalent output to comment:6 for GHC 7.8.4:

0.003879 Glasgow Haskell Compiler, Version 7.8.4, stage 2 booted by GHC version 7.6.3
0.003990 Using binary package database: /opt/ghc/7.8.4/lib/ghc-7.8.4/package.conf.d/package.cache
0.013057 Using binary package database: /home/hvr/.ghc/x86_64-linux-7.8.4/package.conf.d/package.cache
0.022682 hiding package shelly-1.5.6 to avoid conflict with later version shelly-1.6.0
0.022748 wired-in package ghc-prim mapped to ghc-prim-0.3.1.0-a24f9c14c632d75b683d0f93283aea37
0.022836 wired-in package integer-gmp mapped to integer-gmp-0.5.1.0-26579559b3647acf4f01d5edd9491a46
0.022874 wired-in package base mapped to base-4.7.0.2-bfd89587617e381ae01b8dd7b6c7f1c1
0.022938 wired-in package rts mapped to builtin_rts
0.022964 wired-in package template-haskell mapped to template-haskell-2.9.0.0-6d27c2b362b15abb1822f2f34b9ae7f9
0.022982 wired-in package dph-seq not found.
0.023020 wired-in package dph-par not found.
0.023869 Created temporary directory: /tmp/ghc12579_0
0.023918 *** C pre-processor:
0.029089 *** Checking old interface for main:Lexer:
0.029165 *** Parser:
0.103668 *** Renamer/typechecker:
0.901522 *** Desugar:
1.507213 Result size of Desugar (after optimization)
1.507273   = {terms: 97,707, types: 6,407, coercions: 4}
1.507283 *** Simplifier:
1.722230 Result size of Simplifier iteration=1
1.722290   = {terms: 92,782, types: 1,457, coercions: 31}
2.014619 Result size of Simplifier iteration=2
2.014680   = {terms: 92,784, types: 1,469, coercions: 40}
2.251739 Result size of Simplifier
2.251799   = {terms: 92,784, types: 1,469, coercions: 40}
2.251808 *** Specialise:
2.308621 Result size of Specialise
2.308682   = {terms: 92,784, types: 1,469, coercions: 40}
2.308692 *** Float out(FOS {Lam = Just 0, Consts = True, PAPs = False}):
2.308699 Result size of Float out(FOS {Lam = Just 0,
2.308715                               Consts = True,
2.721925                               PAPs = False})
2.721984   = {terms: 161,685, types: 37,280, coercions: 40}
2.721993 *** Float inwards:
2.806740 Result size of Float inwards
2.806800   = {terms: 161,685, types: 37,280, coercions: 40}
2.806810 *** Simplifier:
4.854193 Result size of Simplifier iteration=1
4.854254   = {terms: 89,091, types: 2,930, coercions: 1,043}
5.037276 Result size of Simplifier iteration=2
5.037336   = {terms: 88,743, types: 2,545, coercions: 770}
5.299382 Result size of Simplifier iteration=3
5.299442   = {terms: 88,739, types: 2,542, coercions: 767}
5.557920 Result size of Simplifier
5.557980   = {terms: 88,739, types: 2,542, coercions: 767}
5.557989 *** Simplifier:
5.828540 Result size of Simplifier iteration=1
5.828599   = {terms: 88,694, types: 32,129, coercions: 767}
6.148168 Result size of Simplifier
6.148227   = {terms: 88,686, types: 32,117, coercions: 767}
6.148237 *** Simplifier:
6.428055 Result size of Simplifier iteration=1
6.428115   = {terms: 88,707, types: 32,132, coercions: 781}
6.708043 Result size of Simplifier iteration=2
6.708103   = {terms: 88,707, types: 32,132, coercions: 767}
6.996411 Result size of Simplifier
6.996470   = {terms: 88,707, types: 32,132, coercions: 767}
6.996480 *** Demand analysis:
7.259522 Result size of Demand analysis
7.259587   = {terms: 88,707, types: 32,132, coercions: 767}
7.259599 *** Worker Wrapper binds:
7.309554 Result size of Worker Wrapper binds
7.309618   = {terms: 88,872, types: 32,381, coercions: 767}
7.309631 *** Simplifier:
7.583439 Result size of Simplifier iteration=1
7.583499   = {terms: 88,820, types: 32,337, coercions: 767}
7.867272 Result size of Simplifier iteration=2
7.867333   = {terms: 88,751, types: 32,199, coercions: 767}
8.146394 Result size of Simplifier
8.146453   = {terms: 88,751, types: 32,199, coercions: 767}
8.146462 *** Float out(FOS {Lam = Just 0, Consts = True, PAPs = True}):
8.146469 Result size of Float out(FOS {Lam = Just 0,
8.146486                               Consts = True,
8.732691                               PAPs = True})
8.732752   = {terms: 205,891, types: 122,523, coercions: 767}
8.732761 *** Common sub-expression:
9.101395 Result size of Common sub-expression
9.101457   = {terms: 177,032, types: 121,927, coercions: 217}
9.101466 *** Float inwards:
9.207068 Result size of Float inwards
9.207128   = {terms: 177,032, types: 121,927, coercions: 217}
9.207137 *** Simplifier:
9.811384 Result size of Simplifier iteration=1
9.811444   = {terms: 119,608, types: 92,568, coercions: 217}
10.273151 Result size of Simplifier iteration=2
10.273205   = {terms: 119,609, types: 92,567, coercions: 217}
10.829669 Result size of Simplifier
10.829729   = {terms: 119,609, types: 92,567, coercions: 217}
10.829739 *** Tidy Core:
11.177167 Result size of Tidy Core
11.177231   = {terms: 119,673, types: 92,627, coercions: 217}
11.207078 *** CorePrep:
11.549034 Result size of CorePrep
11.549100   = {terms: 119,974, types: 92,991, coercions: 217}
11.549129 *** Stg2Stg:
11.549139 *** CodeOutput:
11.549306 *** New CodeGen:
11.549336 *** CPSZ:
11.549480 *** CPSZ:
11.550487 *** CPSZ:

(...almost 30000 lines of uninteresting "*** CPSZ:" removed...)

13.468961 *** CPSZ:
13.469193 *** CPSZ:
13.469423 *** CPSZ:
13.473052 *** Assembler:
13.611920 *** Deleting temp files:
13.612006 Warning: deleting non-existent /tmp/ghc12579_0/ghc12579_3.c
13.612314 *** Deleting temp dirs:

Changed 4 years ago by hvr

Attachment: ghc-stage2.prof added

+RTS -p output

comment:9 Changed 4 years ago by hvr

I just realised I had already build a prof GHC earlier, so I just ran it with

$ time ../inplace/bin/ghc-stage2 +RTS -p -RTS -O -fforce-recomp  -c Lexer.hs 
       
real	6m23.129s
user	6m22.575s
sys	0m0.723s

and attached the resulting attachment:ghc-stage2.prof

comment:10 Changed 4 years ago by simonpj

Cc: scpmw@… added

Peter, could you look at this please? lintAnnots (whatever that is, but it's yours) is taking more than 80% of GHC's allocation!!!

One thought might be to make it flag-enabled. (And not -dcore-lint, because we have that on a lot.) But another question is: does it have to be so expensive?

Simon

comment:11 Changed 4 years ago by scpmw

Sorry for late reaction, somehow missed the mail. lintAnnots should only do something with -dannot-lint enabled. It will then be *very* expensive, because it runs every Core pass twice, and diff'ing Core can be almost arbitrarily costly.

That being said, without -dannot-lint we should really have lintAnnots pname pass guts = pass guts. Not quite sure though how we get that much individual allocation on it, I'll investigate.

comment:12 Changed 4 years ago by scpmw

lintAnnots is definitely not the villain here: It doesn't change performance if I simply short-cut it out. All cost then lands in Core2Core. Reason is that FloatOut floats out the bodies of doCorePass past the cost-centre annotations. This is because the bodies are partially applied functions (so exprIsHNF is set), which makes floating treat it like a constructor application. I'm pretty sure that's wrong, and ezyang's tick semantics seem to agree.

At this point I'm still trying to understand how this got introduced (most likely my fault in some way). If my diagnosis is right, we'll probably need a new version of exprIsHNF or mkTick. In any case, while this definitely needs to be fixed, it's not the cause for this slow-down.

comment:13 Changed 4 years ago by thoughtpolice

Priority: normalhighest

Changed 4 years ago by scpmw

Attachment: fix-9961.patch added

Proposed fix

comment:14 Changed 4 years ago by scpmw

Turns out that this might have been my fault anyway. Background here is that CSE needs to strip ticks from expressions fairly often, and stripTicks was not particularly efficient at it.

The attached patch should restore acceptable performance. I also added a performance test - this is fairly easy to reproduce.

(For the cost-centre issue mentioned above, see Phabricator D616).

comment:16 Changed 4 years ago by hvr

Status: newpatch

comment:17 Changed 4 years ago by scpmw

Phab:D616 fixes that cost-centre profiling erroneously attributed all cost to lintAnnots instead of CSE.

comment:18 Changed 4 years ago by Austin Seipp <austin@…>

In 55199a97c020761ff4bfdc06da0042e43bede697/ghc:

Split stripTicks into expression editing and tick collection

As with stripTicksTop, this is because we often need the stripped
expression but not the ticks (at least not right away). This makes a big
difference for CSE, see #9961.

Signed-off-by: Austin Seipp <austin@well-typed.com>

comment:19 Changed 4 years ago by thoughtpolice

I've split off a separate ticket for Phab:D616 - #10007.

comment:20 Changed 4 years ago by thoughtpolice

Merged to ghc-7.10 (via 174082ffeb69b2f9df19e7af7b63a331dd074145).

comment:21 Changed 4 years ago by thoughtpolice

Resolution: fixed
Status: patchclosed

Forgot to close this; tested as fixed in the GHC 7.10 branch:

$ ../../inplace/bin/ghc-stage2 -Rghc-timing -O -c ParserM.hs
<<ghc: 322519440 bytes, 126 GCs, 5041166/14526288 avg/max bytes residency (7 samples), 31M in use, 0.002 INIT (0.001 elapsed), 0.279 MUT (0.304 elapsed), 0.199 GC (0.198 elapsed) :ghc>>

$ ../../inplace/bin/ghc-stage2 -Rghc-timing -O -c Lexer.hs
<<ghc: 583952016 bytes, 293 GCs, 8581591/22327768 avg/max bytes residency (9 samples), 58M in use, 0.001 INIT (0.001 elapsed), 0.304 MUT (0.374 elapsed), 0.362 GC (0.362 elapsed) :ghc>>

The lintAnnots stuff is taken care of in #10007 like I commented earlier.

Note: See TracTickets for help on using tickets.