Use appropriately sized comparison instructions for small values.
GHC currently defaults all comparisons originating from Cmm switch statements to 64 bits on x64.
This incurs a small overhead in instruction size. Fixing this manually gave a speedup of ~1.5% in microbenchmarks.
In detail, we generate Cmm of the form:
_s8Dg::P64 = R1;
_c8EF::P64 = _s8Dg::P64 & 7;
switch [1 .. 2] _c8EF::P64 {
case 1 : goto c8Ey;
case 2 : goto c8EC;
}
This results in assembly like:
andl $7,%ebx
cmpq $1,%rbx
It's obvious that _c8EF fits into a byte, but it is sized up to 64 bits. Changing this would let us use cmpl instead of cmpq, saving a byte on each comparison.
While this isn't major, in my microbenchmarks it resulted in a speedup of ~1.5% for such constructs in inner loops.
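To make the size argument concrete, here is a minimal Python sketch. It is not GHC's code generator, and the names `cmp_width` and `bytes_saved` are hypothetical. It assumes the scrutinee is an unsigned value with a known upper bound (7 here, from the `& 7` mask) and conservatively picks a 32-bit compare only when the bound is below 2^31, so that signed and unsigned comparisons agree. The byte sequences are the standard x86-64 encodings of the two instructions above. The 64-bit form differs only by the extra REX.W prefix byte.

```python
# Illustrative sketch only; helper names are hypothetical, not GHC internals.

def cmp_width(max_value: int) -> int:
    """Pick a comparison width in bits for an unsigned scrutinee in 0..max_value.

    Conservative rule (an assumption of this sketch): use a 32-bit compare
    only when the bound is below 2**31, otherwise stay at 64 bits.
    """
    if max_value < 0:
        raise ValueError("max_value must be non-negative")
    return 32 if max_value < 2**31 else 64


# Machine-code encodings of the two comparisons (AT&T syntax).
ENCODINGS = {
    "cmpq $1,%rbx": bytes([0x48, 0x83, 0xFB, 0x01]),  # REX.W + 83 /7 ib
    "cmpl $1,%ebx": bytes([0x83, 0xFB, 0x01]),        # 83 /7 ib
}


def bytes_saved() -> int:
    """Bytes saved per comparison by using cmpl instead of cmpq."""
    return len(ENCODINGS["cmpq $1,%rbx"]) - len(ENCODINGS["cmpl $1,%ebx"])


if __name__ == "__main__":
    mask = 7  # from `_c8EF::P64 = _s8Dg::P64 & 7`
    print(cmp_width(mask), bytes_saved())
```

The narrower compare is safe in the example because `andl $7,%ebx` writes a 32-bit register, which zeroes the upper 32 bits of `%rbx` on x86-64, so both widths see the same value.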
Trac metadata
Trac field | Value |
---|---|
Version | |
Type | Task |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler (CodeGen) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |