Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#2156 closed bug (wontfix)

compilation math/truncate bug with optimization enabled

Reported by: trevorm
Owned by:
Priority: normal
Milestone:
Component: Compiler
Version: 6.8.2
Keywords:
Cc:
Operating System: Linux
Architecture: x86
Type of failure:
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

The following code produces different results depending on whether it is compiled with optimization (-O1/-O2) or not. When run, the unoptimized program outputs the correct value (3), while the optimized one outputs an incorrect value (2).

module Main where

import System.IO

-- log base 2 of 8, computed in floating point and truncated to an Int
lg8base2 :: Int
lg8base2 = truncate (log 8 / log 2)

main :: IO ()
main = do
        hPutStrLn stdout $ show lg8base2



Output of gcc -v

Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-targets=all --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.2.3 20071123 (prerelease) (Debian 4.2.2-4)

Sequence of compiles / runs

trevor@tmlinux:~/haskell$ ghc --make test
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...
trevor@tmlinux:~/haskell$ ./test
3
trevor@tmlinux:~/haskell$ rm test.o
trevor@tmlinux:~/haskell$ ghc -O2 --make test
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...
trevor@tmlinux:~/haskell$ ./test
2

Output of compilation phase with -v is attached (test.unoptimized.output & test.optimized.output)

Attachments (3)

test.unoptimized.output (6.8 KB) - added by trevorm 7 years ago.
Output of ghc -v with optimizations off
test.unoptimized.2.output (6.8 KB) - added by trevorm 7 years ago.
Output of ghc -v with optimizations off
test.optimized.output (8.5 KB) - added by trevorm 7 years ago.
Output of ghc -v with optimizations on


Change History (8)

Changed 7 years ago by trevorm

Output of ghc -v with optimizations off

Changed 7 years ago by trevorm

Output of ghc -v with optimizations off

Changed 7 years ago by trevorm

Output of ghc -v with optimizations on

comment:1 Changed 7 years ago by dons

On a Core 2 Duo, I'm unable to reproduce this:

$ ghc --make A.hs -o A -no-recomp
$ ./A
3

$ ghc --make -O2 A.hs -o A -no-recomp 
$ ./A                                
3

$ ghc --make -O2 -fvia-C A.hs -o A -no-recomp 
$ ./A                                        
3

$ ghc --make -O2 -fvia-C -optc-O9 A.hs -o A -no-recomp
$ ./A                                                 
3

comment:2 Changed 7 years ago by taruti

i386 Debian system with GHC 6.8.2 (Athlon XP, 2 GHz, if relevant)

taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp && ./a [1 of 1] Compiling Main ( a.hs, a.o ) Linking a ... 3 taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp -O2 -fasm && ./a [1 of 1] Compiling Main ( a.hs, a.o ) Linking a ... 2 taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp -O2 -fvia-c && ./a [1 of 1] Compiling Main ( a.hs, a.o ) Linking a ... 3

comment:3 Changed 7 years ago by taruti

The same, but with proper formatting:

i386 Debian system with GHC 6.8.2 (Athlon XP, 2 GHz, if relevant)

taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp && ./a
[1 of 1] Compiling Main             ( a.hs, a.o )
Linking a ...
3
taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp -O2 -fasm && ./a
[1 of 1] Compiling Main             ( a.hs, a.o )
Linking a ...
2
taruti@oz:/tmp$ ghc --make a.hs -o a -no-recomp -O2 -fvia-c && ./a
[1 of 1] Compiling Main             ( a.hs, a.o )
Linking a ...
3

comment:4 Changed 7 years ago by trevorm

On investigating further, I see that the optimized code generated is:

subl $8, %esp ; fnstcw 4(%esp)
ffree %st(7) ; fld %st(0)
movzwl 4(%esp), %eax ; orl $0xC00, %eax
movl %eax, 0(%esp) ; fldcw 0(%esp)
fistpl 0(%esp)
fldcw 4(%esp) ; movl 0(%esp), %eax

This code sets the x87 rounding mode to chop (round toward zero) and then converts the value to an integer. So the value 2 is correct according to the definition of truncate, provided the value being converted is below 3.

So the optimized code behaves as if the values have Double precision, while the unoptimized code behaves as if they have better (infinite?) precision.
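
To illustrate why truncate is so sensitive here, the following minimal sketch feeds it a value one step below 3. The constant `justBelowThree` is a hypothetical stand-in chosen for illustration (the largest Double strictly less than 3); it is not claimed to be the exact value the optimized program produced.

```haskell
module Main where

-- Hypothetical stand-in for a quotient that picked up a tiny rounding error:
-- the largest Double strictly below 3.
justBelowThree :: Double
justBelowThree = 2.9999999999999996

-- truncate rounds toward zero, so anything below 3 becomes 2.
truncated :: Int
truncated = truncate justBelowThree

-- round goes to the nearest integer, so the same value becomes 3.
rounded :: Int
rounded = round justBelowThree

main :: IO ()
main = do
  print truncated  -- prints 2
  print rounded    -- prints 3
```

An error of a single unit in the last place is enough to change the truncated result, which is why differences in intermediate precision show up as 2 versus 3.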

comment:5 Changed 7 years ago by simonmar

  • difficulty set to Unknown
  • Resolution set to wontfix
  • Status changed from new to closed

So I take it this is an instance of the unpredictability in floating-point results that we get on x86 because the floating-point registers (80 bits) are wider than a Double (64 bits)?

When compiling via C we use gcc's -ffloat-store which forces the precision to be 64 bits everywhere (albeit expensively). The native code generator does not do this, although #594 will fix it eventually.

See also #1916, where the previously buggy truncate implementation was fixed in the x86 native code gen in 6.8.2.
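
For a case like the reporter's, where the result is an integer logarithm, one possible workaround is to avoid floating point altogether. The sketch below is only an illustration: `ilog2` is a hypothetical helper, not part of the ticket or of GHC's libraries, and it assumes a positive Int argument.

```haskell
module Main where

-- | Floor of the base-2 logarithm, using only integer arithmetic.
-- Hypothetical helper for illustration; assumes a positive argument.
ilog2 :: Int -> Int
ilog2 n
  | n < 1     = error "ilog2: argument must be positive"
  | otherwise = go 0 n
  where
    -- Halve the value until it drops below 2, counting the steps.
    go :: Int -> Int -> Int
    go acc m
      | m < 2     = acc
      | otherwise = go (acc + 1) (m `div` 2)

main :: IO ()
main = print (ilog2 8)  -- prints 3
```

Because no floating-point values are involved, the result does not depend on register width, on the code generator, or on the optimization level.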
