Loopification does not trigger for IO even if it could
The loopification optimization, as I understand it, allows a self-recursive function to jump to a local label instead of the beginning of the function, thus skipping a potential stack check. However, I only observe it triggering for pure functions while IO functions do not get that benefit, even when it would be possible.
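To make the pattern concrete, here is a minimal sketch of the two shapes of loop being compared. It is not the attached IOLoop example; `pureLoop` and `ioLoop` are hypothetical names chosen for illustration. Both are saturated self-recursive tail calls, and the only difference is that the second one is typed in IO.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical pure loop: sums n, n-1, ..., 1 into an accumulator.
-- This is the shape for which loopification is reported to fire.
pureLoop :: Int -> Int -> Int
pureLoop !acc n
  | n <= 0    = acc
  | otherwise = pureLoop (acc + n) (n - 1)

-- The same loop typed in IO. It performs no effects, but per this
-- report the self-recursive call is not turned into a local jump.
ioLoop :: Int -> Int -> IO Int
ioLoop !acc n
  | n <= 0    = return acc
  | otherwise = ioLoop (acc + n) (n - 1)

main :: IO ()
main = do
  let n = 1000000
  print (pureLoop 0 n)
  r <- ioLoop 0 n
  print r
```

Compiling this with `-O2 -ddump-cmm` and comparing the Cmm for the two functions should show whether the recursive call became a jump to a local label or a call back to the function entry; the exact output will depend on the GHC version.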
I discovered this in #8793 (closed) while looking into an unexpected speedup from removing the IO context from an otherwise pure loop. Such a loop can simply be changed to a pure version (the IOLoop examples), but other functions cannot, even though the optimization could still be applied to them (see MapM.hs).
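As an illustration of a function that cannot simply be rewritten as a pure loop, consider a `mapM_`-style traversal that runs a real effect per element. This is a sketch under my own assumptions and not the contents of the attached MapM.hs; `mapIO_` is a hypothetical name. The recursive call to `go` is still a self-recursive tail call, so it looks like a candidate for the same optimization.

```haskell
module Main (main) where

import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical mapM_-style loop: runs an IO action on every element.
-- The action has observable effects, so the loop must stay in IO,
-- yet `go` only ever calls itself in tail position.
mapIO_ :: (a -> IO ()) -> [a] -> IO ()
mapIO_ f = go
  where
    go []       = return ()
    go (x : xs) = f x >> go xs

main :: IO ()
main = do
  -- Accumulate into a mutable reference so the effect is observable.
  ref <- newIORef (0 :: Int)
  mapIO_ (\x -> modifyIORef' ref (+ x)) [1 .. 100]
  readIORef ref >>= print
```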
I benchmarked a naive loop in IO against some horrible inlinePerformIO hacks that get the loopification to fire, and the "optimized" version performs 3-5% faster on my machine.
See Commentary/Compiler/Loopification for details.