Opened 4 years ago

Closed 4 years ago

#5639 closed bug (worksforme)

Computing the sum of all the primes below two million leads to a runtime system crash

Reported by: manzyuk
Owned by:
Priority: normal
Milestone:
Component: Runtime System
Version: 7.0.3
Keywords:
Cc: manzyuk@…
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

The following program is meant to be a solution to Problem 10 from Project Euler (http://projecteuler.net/problem=10):

manzyuk@pandora:~/tmp$ cat euler-10.hs
sumOfPrimes n = sieve [2..n] 0
    where
      sieve []     s = s
      sieve (x:xs) s = sieve [y | y <- xs, y `mod` x /= 0] (x+s)

main = print (sumOfPrimes 2000000)

Compiling and running this program on my 64-bit Linux machine results either in a segmentation fault:

manzyuk@pandora:~/tmp$ uname -a
Linux pandora 2.6.32-5-amd64 #1 SMP Fri Sep 9 20:23:16 UTC 2011 x86_64 GNU/Linux
manzyuk@pandora:~/tmp$ ghc --make -O euler-10.hs
[1 of 1] Compiling Main             ( euler-10.hs, euler-10.o )
Linking euler-10 ...
manzyuk@pandora:~/tmp$ time ./euler-10
Segmentation fault

real	0m44.073s
user	0m44.031s
sys	0m0.028s

or in the following error message:

manzyuk@pandora:~/tmp$ time ./euler-10
euler-10: internal error: scavenge_stack: weird activation record found on stack: -1958211512
    (GHC version 7.0.3 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted

real	0m54.780s
user	0m54.723s
sys	0m0.040s

The expected behavior is to output a number (142913828922, if I am not mistaken).

The version of gcc I am running:

manzyuk@pandora:~/tmp$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8) 

I am attaching the output of GHC when compiling the program with the -v flag. Compiling the program with the -dcore-lint flag didn't reveal anything interesting:

manzyuk@pandora:~/tmp$ ghc --make -O euler-10.hs -dcore-lint -fforce-recomp
[1 of 1] Compiling Main             ( euler-10.hs, euler-10.o )
Linking euler-10 ...

Attachments (1)

euler-10.log (12.3 KB) - added by manzyuk 4 years ago.
GHC output when compiling the program with the -v flag

Change History (6)

Changed 4 years ago by manzyuk

GHC output when compiling the program with the -v flag

comment:1 Changed 4 years ago by daniel.is.fischer

I cannot reproduce it with 7.0.2, 7.0.4 or 7.2.2 (also 64-bit Linux, kernel 2.6.37; gcc 4.5.1).

In the compilation logs, after the final simplifier phase, 7.0.2 and 7.0.4 both print

*** Tidy Core:
    Result size = 105
writeBinIface: 24 Names
writeBinIface: 45 dict entries
*** CorePrep:
    Result size = 121
*** Stg2Stg:

The writeBinIface lines are absent from your log; whether that's relevant, I don't know. Apart from that, my compilation logs are identical to yours from the end of the package listing until gcc comes into play.

In the log, I see that you have a couple of broken packages. I don't expect they have anything to do with this, but you should still unregister them.

You are aware that that is an egregiously horrible way to compute primes, by the way?

comment:2 Changed 4 years ago by manzyuk

  • Cc manzyuk@… added

I tried to compile the program on my laptop

manzyuk@paddy:~/tmp$ uname -a
Linux paddy 2.6.32-35-generic #78-Ubuntu SMP Tue Oct 11 16:11:24 UTC 2011 x86_64 GNU/Linux

and it runs without visible problems. Of course, it is unacceptably slow, but it doesn't crash. My laptop runs GHC 7.0.3, and the only difference I've noticed so far compared to my desktop is gcc:

manzyuk@paddy:~/tmp$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) 

Can I provide any other useful information about my desktop? Or do you have suggestions on how I can debug the program?

Frankly, to my shame, I didn't know it was a horrible way to compute primes until yesterday. I was taught this incorrect implementation of the sieve of Eratosthenes but had never actually tried it before. I am pleased to have re-discovered a queue-based algorithm today.

I should fix the broken packages, thanks for pointing this out.
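
The queue-based algorithm mentioned above is not shown in this ticket. For readers wondering what such an approach might look like, here is a minimal sketch of one common variant: an incremental sieve that keeps upcoming composites in a map. The names primes and sumOfPrimesBelow are illustrative only, and the use of Data.Map.Strict assumes a reasonably recent containers package; this is not necessarily the algorithm the commenter had in mind.

```haskell
import           Data.List       (foldl')
import qualified Data.Map.Strict as Map

-- | Infinite list of primes from an incremental sieve (illustrative sketch).
-- The map sends each upcoming composite to the primes known to divide it,
-- so no number is ever tested by trial division.
primes :: [Integer]
primes = sieve [2 ..] Map.empty
  where
    sieve []       _     = []
    sieve (x : xs) table =
      case Map.lookup x table of
        -- x was never scheduled as a composite, so it is prime.
        -- Its first multiple not already covered by a smaller prime is x*x.
        Nothing      -> x : sieve xs (Map.insert (x * x) [x] table)
        -- x is composite: move each prime that hit x on to its next multiple.
        Just factors -> sieve xs (foldl' reschedule (Map.delete x table) factors)
      where
        reschedule t p = Map.insertWith (++) (x + p) [p] t

-- | Sum of all primes strictly below n (hypothetical helper name).
sumOfPrimesBelow :: Integer -> Integer
sumOfPrimesBelow n = sum (takeWhile (< n) primes)
```

Loading this in GHCi and evaluating sumOfPrimesBelow 2000000 should answer the original problem. Unlike the program in the description, which filters the whole remaining list once per prime, this version does a few map operations per candidate.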

comment:3 Changed 4 years ago by simonmar

Prime suspect is a hardware bug, I'm afraid. Try taking a binary created on the desktop machine to the laptop and see if it works there.

comment:4 Changed 4 years ago by manzyuk

The binary created on the desktop machine doesn't work on the laptop because of a missing shared library dependency:

manzyuk@paddy:~$ ./euler-10 
./euler-10: error while loading shared libraries: libgmp.so.10: cannot open shared object file: No such file or directory
manzyuk@paddy:~$ ldd euler-10
	linux-vdso.so.1 =>  (0x00007fffdb5ff000)
	libgmp.so.10 => not found
	libm.so.6 => /lib/libm.so.6 (0x00007f619dc07000)
	librt.so.1 => /lib/librt.so.1 (0x00007f619d9fe000)
	libdl.so.2 => /lib/libdl.so.2 (0x00007f619d7fa000)
	libc.so.6 => /lib/libc.so.6 (0x00007f619d477000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007f619d259000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f619deae000)

The binary built on the laptop is linked against libgmp.so.3, not libgmp.so.10:

manzyuk@paddy:~/tmp$ ldd euler-10
	linux-vdso.so.1 =>  (0x00007fff881ff000)
	libgmp.so.3 => /usr/lib/libgmp.so.3 (0x00007febe2833000)
	libm.so.6 => /lib/libm.so.6 (0x00007febe25b0000)
	librt.so.1 => /lib/librt.so.1 (0x00007febe23a7000)
	libdl.so.2 => /lib/libdl.so.2 (0x00007febe21a3000)
	libc.so.6 => /lib/libc.so.6 (0x00007febe1e20000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007febe1c02000)
	/lib64/ld-linux-x86-64.so.2 (0x00007febe2ab7000)

The desktop has both libgmp.so.10 and libgmp.so.3, so I copied the binary built on the laptop to the desktop and ran it:

manzyuk@pandora:~$ ./euler-10
euler-10: internal error: scavenge_stack: weird activation record found on stack: 2
    (GHC version 7.0.3 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted

Since this doesn't happen on the laptop, I am leaning towards accepting that this is a hardware bug and closing the ticket. Apologies for making noise.

comment:5 Changed 4 years ago by simonmar

  • Resolution set to worksforme
  • Status changed from new to closed

No problem. I'll close the ticket for now; please re-open it if you need to.
