Improve assembly for dense jump tables.
When generating assembly for cases with dense alternatives GHC currently often generates somewhat repetitive code:
Main.$wf_test_info:
... # Check range
_u48M: #Jump into jump table
jmp *_n48P(,%r14,8)
_c48H:
movl $11008,%ebx
jmp *(%rbp)
#####################
# Repeat 6 more times
#####################
_c48A:
movl $11001,%ebx
jmp *(%rbp)
_c48z:
movq $-1,%rbx
jmp *(%rbp)
_n48P: #Jump table
From what I've seen it should often be possible to replace the indirect jmp with a indirect mov instead.
... #Check Range
.Lu48M: #Get value out of table and jump to continuation.
movl .Ln48P(,%r14,8), %ebx
jmp *(%rbp)
.Lc48z:
movq $-1,%rbx
jmp *(%rbp)
.Ln48P: #Jump table
Depending on the number of cases this has the potential to do a lot for code size.
I did the transformation manually for one case in a inner loop.
It improved speed but not much and depending on the codelayout I did it was possible to get worse performance in specific cases.
It's also not a simply patch as Cmm currently generates switches containing gotos. So this would require a change in the Cmm stage as well as in the code generator.
But it should be at least worth investigating and seems to be what gcc does for similar cases.
Trac metadata
Trac field | Value |
---|---|
Version | 8.2.2 |
Type | Task |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler (CodeGen) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |