Pattern matching against sets of strings sharing a prefix blows up pattern checker
hvr stumbled upon this issue while attempting to bootstrap GHC with GHC HEAD. In so doing he found that GHC HEAD required more than 10 GB of memory while compiling genprimopcode
(and never completed).
It appears that this blow-up is due to the new pattern checker. In particular it appears that the pattern checker is affected quite adversely by sets of patterns sharing a prefix. For instance, this example,
import System.Environment
main :: IO ()
main = do
args <- getArgs
print $ case head args of
"--primop-primop-info" -> "turtle"
"--primop-tag" -> "asdf"
"--primop-list" -> "casdhf"
"--primop-vector-uniques" -> "this"
"--primop-vector-tys" -> "is"
"--primop-vector-tys-exports" -> "silly"
"--primop-vector-tycons" -> "hmmm"
"--make-haskell-wrappers" -> "123512"
"--make-haskell-source" -> "as;dg"
"--make-latex-doc" -> "adghiw"
_ -> error "Should not happen, known_args out of sync?"
As written GHC requires over ten gigabytes of heap and several minutes to compile the example. If one perform s/--primop-//
to this example it takes 500ms to compile. Alternatively, if on replace the first -
in each of the --primop
strings with a unique character (thus breaking the shared prefixes) compilation time is a bit shy of a second.
Trac metadata
Trac field | Value |
---|---|
Version | 7.11 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |