Opened 7 years ago

Closed 7 years ago

Last modified 6 years ago

#1002 closed bug (invalid)

ghc-6.6 sometimes hangs under Solaris

Reported by: maeder@…
Owned by:
Priority: normal
Milestone: 6.6.1
Component: Compiler
Version: 6.6
Keywords:
Cc:
Operating System: Solaris
Architecture: Unknown/Multiple
Type of failure:
Difficulty: Unknown
Test Case: N/A
Blocked By:
Blocking:
Related Tickets:

Description

After compiling 643 modules (in 5 minutes), ghc-6.6 did not finish its batch job. (I killed the job the next morning.) This happened on a PC (i386-solaris2.10).

[643 of 643] Compiling Main             ( hets.hs, hets.o )
Linking hets ...
gmake: *** [hets] Killed
gmake: *** Deleting file `hets'

real    684m49.318s
user    5m7.309s
sys     0m22.634s

I think I've seen the same thing happen several times under sparc-solaris. I don't know how to reproduce this.
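
The gap between the wall-clock and CPU figures above can be made explicit with a little arithmetic. The following Python sketch is an illustration only and not part of the original report; the helper names parse_time and cpu_fraction are made up for this example. It converts the `time` fields to seconds and computes what fraction of the elapsed time the job actually spent on a CPU.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` field such as '684m49.318s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time field: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_fraction(real: str, user: str, sys: str) -> float:
    """Fraction of wall-clock time spent in user plus system CPU time."""
    return (parse_time(user) + parse_time(sys)) / parse_time(real)


# Figures copied from the killed run in the description.
hung = cpu_fraction("684m49.318s", "5m7.309s", "0m22.634s")
print(f"CPU share of elapsed time: {hung:.2%}")
```

For the killed run this comes to under 1% of the elapsed time, which suggests the process was idle or blocked for almost the whole night rather than computing.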

Change History (6)

comment:1 Changed 7 years ago by maeder@…

Mere import chasing and linking is fairly fast (regardless of whether Solaris ld or GNU ld is used):

Linking hets ...

real    0m46.984s
user    0m12.137s
sys     0m3.180s

But last night my ghc-6.6 job hung for about 2 hours and then continued.

[679 of 679] Compiling Main             ( hets.hs, hets.o )
Linking hets ...
linking ... done.
Loading package uni-htk-widgets ... linking ... done.
Loading package uni-htk-canvasitems ... linking ... done.
Loading package uni-htk-textitems ... linking ... done.
Loading package uni-htk-toplevel ... linking ... done.
Loading package uni-htk-toolkit ... linking ... done.
Loading package uni-htk ... linking ... done.
Loading package haskell98 ... linking ... done.
Loading package HaXml-1.13.2 ... linking ... done.
Loading package uni-server ... linking ... done.
Loading package uni-graphs ... linking ... done.
Loading package uni-davinci ... linking ... done.

real    122m20.462s
user    5m49.133s
sys     0m20.322s

comment:2 Changed 7 years ago by igloo

  • Milestone set to 6.6.1
  • Test Case set to N/A

Hi Christian,

Can you rule out swapping?

It might be worth seeing if strace/truss can shed any light.
I don't know about Solaris, but Linux's strace has a variety of (unfortunately mutually incompatible) timing options:

-c -- count time, calls, and errors for each syscall and report summary
-r -- print relative timestamp, -t -- absolute timestamp, -tt -- with usecs
-T -- print time spent in each syscall

Alternatively, attaching gdb to a process once it starts misbehaving might prove useful.

Thanks
Ian
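
As a rough illustration of the suggestion above, the timing options could be wrapped so that exactly one is chosen per run. This is only a sketch: the names TIMING_FLAGS, strace_argv and attach are hypothetical, it assumes a Linux machine with strace installed and permission to trace the target, and it does not cover Solaris truss, whose options differ. The only strace flags used are the ones quoted above plus -p, which attaches to a running process.

```python
import subprocess

# Timing modes quoted in the comment above, mapped to the strace flag for each.
TIMING_FLAGS = {
    "summary": "-c",    # count time, calls and errors per syscall
    "relative": "-r",   # relative timestamp on each line
    "absolute": "-tt",  # absolute timestamp with microseconds
    "per_call": "-T",   # time spent inside each syscall
}


def strace_argv(pid: int, timing: str) -> list[str]:
    """Build an strace command line that attaches to pid with one timing mode."""
    if timing not in TIMING_FLAGS:
        raise ValueError(f"unknown timing mode: {timing!r}")
    return ["strace", TIMING_FLAGS[timing], "-p", str(pid)]


def attach(pid: int, timing: str = "per_call") -> int:
    """Run strace against pid until it is interrupted; return its exit status."""
    return subprocess.run(strace_argv(pid, timing)).returncode
```

Calling attach with the process ID of the stuck ghc would then show which system call, if any, it is waiting in.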

comment:3 Changed 7 years ago by maeder@…

The timings above stem from compiling the sources (nicely) unoptimized at night. Maybe some backup or mirror jobs caused the two-hour gap. During the day it takes only about 7 minutes to build the binary from the shell.

When building with optimization, after about 400 modules ghc takes 855 MB (about 700 MB resident) of the 1 GB of real memory. The machine becomes slow: load and CPU usage go down, while disk activity, paging, context switches and interrupts go up (swap usage remains low, too). I think it's OK again when I kill the job and restart the compilation. Currently I'm testing whether raising the limit for open files (from 256 to 1024) helps.

comment:4 Changed 7 years ago by maeder@…

I can compile my 679 modules with optimization within 3 hours by restarting the compilation three times. ghc "only" needs around 600 MB then. However, unused memory should be swapped out.

Then again, my most recent jobs never really hung. Maybe all my previous observations apply to a compiler version without the threaded RTS, where ghc-6.6 simply slept for no noticeable reason (and woke up later or not at all).
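
The restart workaround described above could be automated along the following lines. This is a minimal sketch, not the setup actually used: the function name build_with_restarts, the attempt count and the timeout are assumptions made for illustration. It relies on the build command skipping modules that are already up to date, as ghc --make and make do, so each restart starts with a fresh, small process.

```python
import subprocess


def build_with_restarts(cmd, max_attempts=3, timeout_s=3600):
    """Run cmd, restarting it if it fails or exceeds timeout_s seconds.

    Returns the number of the attempt that succeeded, or None if all
    attempts failed. subprocess.run kills the direct child on timeout;
    any grandchildren it spawned are not handled here.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The job was stuck or too slow; it has been killed, try again.
            continue
        if result.returncode == 0:
            return attempt
    return None


# Hypothetical invocation (the real build used gmake with unknown flags):
# build_with_restarts(["ghc", "--make", "-O", "hets.hs", "-o", "hets"])
```

A timeout-based restart only approximates the manual procedure; restarting once memory use crosses a threshold would match the observation more closely but needs platform-specific monitoring.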


comment:5 Changed 7 years ago by maeder@…

  • Resolution set to invalid
  • Status changed from new to closed

Close this bug. I could never observe the hanging behaviour again with the threaded RTS. The slow compilation with optimization is gone when I use ghc -M and compile the files individually.
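
One way the ghc -M approach might look in practice: read the generated dependency rules, derive a compile order, and then compile each file in its own ghc process so that memory is released between modules. The sketch below covers only the ordering step. The function name compile_order is hypothetical, the rule format ("X.o : X.hs" and "X.o : Y.hi", one dependency per line) is an assumption about the ghc -M output, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def compile_order(dep_lines):
    """Return source files in an order that respects 'X.o : Y.hi' rules."""
    sources = {}  # object file -> its source file
    graph = {}    # object file -> object files that must be built first
    for line in dep_lines:
        line = line.strip()
        if not line or line.startswith("#") or " : " not in line:
            continue
        target, dep = (part.strip() for part in line.split(" : ", 1))
        graph.setdefault(target, set())
        if dep.endswith(".hi"):
            # Needing Y.hi means Y.o has to be compiled before the target.
            graph[target].add(dep[: -len(".hi")] + ".o")
        elif dep.endswith(".hs"):
            sources[target] = dep
    order = TopologicalSorter(graph).static_order()
    return [sources[obj] for obj in order if obj in sources]


# Made-up three-module example, not taken from the hets sources.
rules = [
    "# DO NOT DELETE: Beginning of Haskell dependencies",
    "Main.o : Main.hs",
    "Main.o : Logic.hi",
    "Logic.o : Logic.hs",
    "Logic.o : Common.hi",
    "Common.o : Common.hs",
]
print(compile_order(rules))
```

Each file in the resulting list would then be compiled separately, for example with ghc -c -O, followed by a final link step; in the original setup this was presumably left to gmake.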

comment:6 Changed 6 years ago by simonmar

  • Architecture changed from Multiple to Unknown/Multiple