Poor performance of generated code on x86.
When implementing a hash function for ByteStrings, I found the Haskell implementation to be 2.5 times slower than the equivalent C implementation.
The code of both functions is attached in [attachment:Hash.hs] and [attachment:c_hash.c]. You can compile using `ghc -O --make Hash.hs c_hash.c`, then run the C implementation as `./Hash c bstr_len` and the Haskell implementation as `./Hash h bstr_len`.
There is no apparent problem in the Haskell implementation: both the `foldl'` and the `addWord8` are inlined, and everything in the main loop is unboxed.
I believe the performance loss is caused by bad register allocation. On x86_64, the Haskell implementation is only ~1.2 times slower.
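For readers without access to the attachments, here is a minimal sketch of the kind of loop described above: a strict left fold over a ByteString with a small inlined step function. The real `addWord8` is in Hash.hs and is not reproduced here; the mixing step below (multiply by 31 and add the byte) and the name `hashBS` are assumptions made purely for illustration.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8, Word32)

-- Hypothetical mixing step: the actual addWord8 from Hash.hs is not shown
-- in this ticket, so a simple multiply-and-add is assumed here.
addWord8 :: Word32 -> Word8 -> Word32
addWord8 h w = h * 31 + fromIntegral w
{-# INLINE addWord8 #-}

-- Hash a ByteString with a strict left fold. With -O, GHC is expected to
-- inline both the fold and addWord8 and keep the accumulator unboxed.
hashBS :: B.ByteString -> Word32
hashBS = B.foldl' addWord8 0
```

Under the assumed mixing step, `hashBS (B.pack [1, 2, 3])` evaluates to 1026. Compiling with `-ddump-simpl` is one way to check that the accumulator in the resulting loop is indeed unboxed.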
The comparison on Intel Xeon E5520 32-bit, Windows 7, GHC 6.12.1 is in [attachment:res-32bit.txt]. The C and Haskell implementations are each run three times on strings of length 10, 50 and 100. All times are in seconds. The file also contains the assembly code of the relevant functions.
The comparison on Intel Xeon E5320 64-bit, Fedora, GHC 6.12.1 is in [attachment:res-64bit.txt].
Trac metadata
Trac field | Value |
---|---|
Version | 6.12.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |