Opened 2 years ago
Last modified 13 months ago
#5642 new bug
Deriving Generic of a big type takes a long time and lots of space
Reported by: | basvandijk | Owned by: | dimitris |
---|---|---|---|
Priority: | high | Milestone: | 7.6.2 |
Component: | Compiler | Version: | 7.3 |
Keywords: | Cc: | v.dijk.bas@…, dimitris@…, hackage.haskell.org@…, carter.schonwald@… | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | Compile-time performance bug | Difficulty: | Unknown |
Test Case: | T5642 | Blocked By: | |
Blocking: | Related Tickets: |
Description
If I load the following module into ghci my system will run out of memory after about 15 minutes:
{-# LANGUAGE DeriveGeneric #-} import GHC.Generics data BigSum = C0 | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | C15 | C16 | C17 | C18 | C19 | C20 | C21 | C22 | C23 | C24 | C25 | C26 | C27 | C28 | C29 | C30 | C31 | C32 | C33 | C34 | C35 | C36 | C37 | C38 | C39 | C40 | C41 | C42 | C43 | C44 | C45 | C46 | C47 | C48 | C49 | C50 | C51 | C52 | C53 | C54 | C55 | C56 | C57 | C58 | C59 | C60 | C61 | C62 | C63 | C64 | C65 | C66 | C67 | C68 | C69 | C70 | C71 | C72 | C73 | C74 | C75 | C76 | C77 | C78 | C79 | C80 | C81 | C82 | C83 | C84 | C85 | C86 | C87 | C88 | C89 | C90 | C91 | C92 | C93 | C94 | C95 | C96 | C97 | C98 | C99 | C100 | C101 | C102 | C103 | C104 | C105 | C106 | C107 | C108 | C109 | C110 | C111 | C112 | C113 | C114 | C115 | C116 | C117 | C118 | C119 | C120 | C121 | C122 | C123 | C124 | C125 | C126 | C127 | C128 | C129 | C130 | C131 | C132 | C133 | C134 | C135 | C136 | C137 | C138 | C139 | C140 | C141 | C142 | C143 | C144 | C145 | C146 | C147 | C148 | C149 | C150 | C151 | C152 | C153 | C154 | C155 | C156 | C157 | C158 | C159 | C160 | C161 | C162 | C163 | C164 | C165 | C166 | C167 | C168 | C169 | C170 | C171 | C172 | C173 | C174 | C175 | C176 | C177 | C178 | C179 | C180 | C181 | C182 | C183 | C184 | C185 | C186 | C187 | C188 | C189 | C190 | C191 | C192 | C193 | C194 | C195 | C196 | C197 | C198 | C199 | C200 | C201 | C202 | C203 | C204 | C205 | C206 | C207 | C208 | C209 | C210 | C211 | C212 | C213 | C214 | C215 | C216 | C217 | C218 | C219 | C220 | C221 | C222 | C223 | C224 | C225 | C226 | C227 | C228 | C229 | C230 | C231 | C232 | C233 | C234 | C235 | C236 | C237 | C238 | C239 | C240 | C241 | C242 | C243 | C244 | C245 | C246 | C247 | C248 | C249 | C250 | C251 | C252 | C253 | C254 | C255 | C256 | C257 | C258 | C259 | C260 | C261 | C262 | C263 | C264 | C265 | C266 | C267 | C268 | C269 | C270 | C271 | C272 | C273 | C274 | C275 | C276 | C277 | C278 | C279 | C280 | C281 | C282 | C283 | C284 | C285 | C286 | C287 | C288 | C289 | C290 | C291 | C292 | C293 | C294 | C295 | C296 | C297 | C298 | C299 deriving Generic
Big products have the same problem:
data BigProduct = C () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () deriving Generic
Change History (17)
comment:1 Changed 2 years ago by basvandijk
- Cc v.dijk.bas@… added
comment:2 follow-ups: ↓ 3 ↓ 5 Changed 2 years ago by clintm
comment:3 in reply to: ↑ 2 Changed 2 years ago by clintm
Replying to clintm:
Came to report this sort of thing too. I can't actually get this code to completely run ...
Forgot to mention, this is ghc 7.2.2
comment:4 Changed 2 years ago by dreixel
- Cc dreixel removed
- Owner set to dreixel
Thanks for reporting, I will look into it.
comment:5 in reply to: ↑ 2 Changed 2 years ago by basvandijk
Replying to clintm:
Came to report this sort of thing too. I can't actually get this code to completely run as my machine runs out of memory (12g) almost immediately.
...
This doesn't have anything to do with generics. You just forgot to give a definition for either show or showsPrec. Since these methods have default implementations which are defined in terms of each other you get an infinite loop which apparently overflows the stack.
Also note that the Show type class does not use DefaultSignatures and so does not have default generic implementations.
comment:6 Changed 2 years ago by dreixel
- Owner changed from dreixel to dimitris
- Version changed from 7.2.1 to 7.3
I thought the problem was the generation of Generic instances, but it isn't. If you write the instance by hand, it still takes long. So the bottleneck is the performance of the constraint solver TcSimplify. Things have improved with the new coercion solver; see the following figures for compiling derive Generic on a datatype with 100/200/300 constructors:
ghc-7.2.1 100 0m30.079s 200 3m44.419s 300 12m52.306s HEAD 100 0m8.302s 200 1m1.631s 300 3m32.507s
But they could probably be much better, since the coercions introduced are not that complicated. I've notified Dimitrios, he'll look further into it.
comment:7 Changed 2 years ago by igloo
- Milestone set to 7.4.1
- Priority changed from normal to high
comment:8 follow-up: ↓ 10 Changed 2 years ago by simonpj
- Difficulty set to Unknown
I think this is going to be a difficult one to solve. The underlying problem is that the types grow non-linearly with the program size. Why? Look at Section 2.3 of Scrap your type applications.
This is a fundamental problem with System F, so it's not easy for GHC to get around it. It shows up especially with deeply-nested sums and products, which is exactly what is generated by the generic stuff. I'm not sure what to do here.
Simon
comment:9 Changed 2 years ago by dreixel
I had the impression Dimitrios had already made some progress on this, and we were just waiting to turn it into a perf test for the testsuite before closing this ticket. I think the resulting performance is not great, but it's good enough for most cases, including ASTs with 100 or so constructors.
comment:10 in reply to: ↑ 8 ; follow-up: ↓ 11 Changed 2 years ago by basvandijk
Replying to simonpj:
I think this is going to be a difficult one to solve. The underlying problem is that the types grow non-linearly with the program size. Why? Look at Section 2.3 of Scrap your type applications.
Got it: N-ary data constructors like (C e1 e2 e3 e4 e5) are translated to nested pairs (Pair : ∀a,b. a → b → (a, b)). This causes a quadratic blow-up in size:
Pair σ1 (σ2,(σ3,(σ4,σ5))) e1 (Pair σ2 (σ3,(σ4,σ5)) e2 (Pair σ3 (σ4,σ5) e3 (Pair σ4 σ5 e4 e5)))
This is a fundamental problem with System F, so it's not easy for GHC to get around it. It shows up especially with deeply-nested sums and products, which is exactly what is generated by the generic stuff. I'm not sure what to do here.
So if I understand the paper correctly System IF would solve this by removing the redundant type applications using:
Pair ψ τ s t → Pair ψ s τ t by (ξ2) → Pair s τ t by (ξ1) → Pair s t by (ξ1)
Of course the question is: is it worth implementing System IF for only solving this problem? (I understand System IF did not significantly improve compile times for the base library).
comment:11 in reply to: ↑ 10 Changed 2 years ago by dreixel
Replying to basvandijk:
Got it: N-ary data constructors like (C e1 e2 e3 e4 e5) are translated to nested pairs (Pair : ∀a,b. a → b → (a, b)). This causes a quadratic blow-up in size:
Pair σ1 (σ2,(σ3,(σ4,σ5))) e1 (Pair σ2 (σ3,(σ4,σ5)) e2 (Pair σ3 (σ4,σ5) e3 (Pair σ4 σ5 e4 e5)))
No; we balance the sums and the products, so it grows with logarithmic complexity. See the second column of the 4th page of the original paper.
comment:12 Changed 2 years ago by igloo
- Test Case set to T5642
comment:13 Changed 2 years ago by igloo
- Milestone changed from 7.4.1 to 7.4.2
comment:14 Changed 2 years ago by simonpj
- Cc dimitris@… added
- Milestone changed from 7.4.2 to 7.6.1
There is still something odd here. Here's the output for HEAD, doing ghc T5642.hs -dshow-passes +RTS -s, for various sizes of input data type. If you do it yourself you'll see that 99% of the time is spent in the typechecker. The rest of compilation is fast.
So not only is it non-linear but the constant factor is terrible. It just can't take that long to typecheck this stuff!
I'm punting for 7.4, but we need to look again for 7.6.
Simon
100 constructors
simonpj@cam-05-unx:~/tmp$ ~/5builds/HEAD/inplace/bin/ghc-stage2 -c T5642.hs -dshow-passes +RTS -s Glasgow Haskell Compiler, Version 7.5, stage 2 booted by GHC version 7.4.1 *** Checking old interface for main:T5642: *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 3,498, types: 160,413, coercions: 1,305} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 3,298, types: 110,113, coercions: 51,906} Result size of Simplifier = {terms: 3,298, types: 110,113, coercions: 51,805} *** Tidy Core: Result size of Tidy Core = {terms: 3,298, types: 110,113, coercions: 51,805} *** CorePrep: Result size of CorePrep = {terms: 4,642, types: 159,413, coercions: 51,805} *** Stg2Stg: *** CodeGen: *** CodeOutput: *** Assembler: *** Deleting temp files: Warning: deleting non-existent /tmp/ghc46808_0/ghc46808_0.c *** Deleting temp dirs: 3,189,810,752 bytes allocated in the heap 1,176,054,736 bytes copied during GC 70,837,720 bytes maximum residency (19 sample(s)) 1,225,552 bytes maximum slop 158 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 6039 colls, 0 par 1.99s 1.99s 0.0003s 0.0038s Gen 1 19 colls, 0 par 1.45s 1.45s 0.0761s 0.2469s TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 5.86s ( 5.95s elapsed) GC time 3.44s ( 3.43s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 9.29s ( 9.38s elapsed) Alloc rate 544,723,338 bytes per MUT second Productivity 63.0% of total user, 62.4% of total elapsed
150 constructors
simonpj@cam-05-unx:~/tmp$ ~/5builds/HEAD/inplace/bin/ghc-stage2 -c T5642.hs -dshow-passes +RTS -s Glasgow Haskell Compiler, Version 7.5, stage 2 booted by GHC version 7.4.1 *** Checking old interface for main:T5642: *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 5,690, types: 401,601, coercions: 2,85} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 5,370, types: 273,121, coercions: 131,46} Result size of Simplifier = {terms: 5,370, types: 273,121, coercions: 130,885} *** Tidy Core: Result size of Tidy Core = {terms: 5,370, types: 273,121, coercions: 130,885} *** CorePrep: Result size of CorePrep = {terms: 7,738, types: 400,1, coercions: 130,885} *** Stg2Stg: *** CodeGen: *** CodeOutput: *** Assembler: *** Deleting temp files: Warning: deleting non-existent /tmp/ghc47478_0/ghc47478_0.c *** Deleting temp dirs: 8,336,602,152 bytes allocated in the heap 4,066,619,088 bytes copied during GC 218,323,536 bytes maximum residency (33 sample(s)) 3,689,368 bytes maximum slop 479 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 15893 colls, 0 par 6.45s 6.45s 0.0004s 0.0033s Gen 1 33 colls, 0 par 4.78s 4.78s 0.1448s 0.7436s TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 24.45s ( 24.60s elapsed) GC time 11.23s ( 11.23s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 35.68s ( 35.82s elapsed) Alloc rate 340,990,013 bytes per MUT second Productivity 68.5% of total user, 68.3% of total elapsed
200 constructors
*** Desugar: Result size of Desugar (after optimization) = {terms: 7,170, types: 622,561, coercions: 2,605} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 6,770, types: 421,961, coercions: 203,806} Result size of Simplifier = {terms: 6,770, types: 421,961, coercions: 203,605} *** Tidy Core: Result size of Tidy Core = {terms: 6,770, types: 421,961, coercions: 203,605} *** CorePrep: Result size of CorePrep = {terms: 9,858, types: 620,561, coercions: 203,605} *** Stg2Stg: *** CodeGen: *** CodeOutput: *** Assembler: *** Deleting temp files: Warning: deleting non-existent /tmp/ghc47501_0/ghc47501_0.c *** Deleting temp dirs: 13,262,925,720 bytes allocated in the heap 7,286,910,712 bytes copied during GC 298,657,992 bytes maximum residency (45 sample(s)) 5,548,736 bytes maximum slop 640 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 25322 colls, 0 par 11.98s 11.97s 0.0005s 0.0034s Gen 1 45 colls, 0 par 8.58s 8.59s 0.1908s 0.9460s TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 50.07s ( 50.22s elapsed) GC time 20.56s ( 20.55s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 70.63s ( 70.78s elapsed) Alloc rate 264,903,443 bytes per MUT second Productivity 70.9% of total user, 70.7% of total elapsed
comment:15 Changed 19 months ago by igloo
- Milestone changed from 7.6.1 to 7.6.2
comment:16 Changed 13 months ago by liyang
- Cc hackage.haskell.org@… added
comment:17 Changed 13 months ago by carter
- Cc carter.schonwald@… added
Came to report this sort of thing too. I can't actually get this code to completely run as my machine runs out of memory (12g) almost immediately.