Opened 5 months ago

Last modified 5 months ago

#8604 new bug

Some stack/vmem limits (ulimit) combinations causing GHC to fail

Reported by: clavin Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.6.3
Keywords: Cc:
Operating System: Other Architecture: x86_64 (amd64)
Type of failure: Runtime crash Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

I have encountered some strange GHC behaviour that was causing several GHC job failures on an SGE cluster. It turned out that other SGE users needed an absurdly large stack size limit (set via 'ulimit -s'), in the several-gigabyte range, so the default stack size limit had to be raised for the entire cluster.

On any machine where the virtual memory limit was less than or equal to 2X the stack size limit, GHC would crash with the following message:

ghc: failed to create OS thread: Cannot allocate memory

I am running GHC 7.6.3 on 64-bit Red Hat Enterprise Linux, version 5.5.

To reproduce the error, I was able to create the following test case:

[ ~]$ ulimit -v
unlimited
[ ~]$ ulimit -s
10240
[ ~]$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.6.3
[ ~]$ ulimit -v 200000
[ ~]$ ulimit -s 100000
[ ~]$ ghc --version
ghc: failed to create OS thread: Cannot allocate memory

Several other programs work fine under these settings, but GHC consistently has problems. (Note that in the transcript above the virtual memory limit, 200000 KB, is exactly 2X the stack size limit of 100000 KB, matching the failure condition described above.) Is this a fundamental issue with how GHC operates, or can this be addressed?

Change History (4)

comment:1 follow-ups: Changed 5 months ago by carter

ghc allocates the stack on the heap, so this may be tripping a corner case somehow...

could you try it out with ghc HEAD to see if the problem is still there? there have been some changes to how stacks work by default. (that said, definitely worth understanding)

comment:2 in reply to: ↑ 1 Changed 5 months ago by clavin

Replying to carter:

ghc allocates the stack on the heap, so this may be tripping a corner case somehow...

could you try it out with ghc HEAD to see if the problem is still there? there have been some changes to how stacks work by default. (that said, definitely worth understanding)

I compiled 7.6.3 from source; what would be the easiest way for me to do the same with GHC HEAD? (I'm new around here.)

comment:3 Changed 5 months ago by carter

pretty much the exact same process. Make sure you have the newest versions of happy and alex installed, then follow the directions here: https://ghc.haskell.org/trac/ghc/wiki/Building

basically:
clone the ghc repo
./sync-all get
perl boot
./configure
make -j # nb, this does an optimized build, so it takes a while; edit the build to be a fast one (a sketch follows below), docs for that are in the build guide
then test the ghc-stage2 binary in the ./inplace/bin folder
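As a sketch of the "fast build" tweak mentioned above (assuming the standard build.mk mechanism of the GHC build system; check mk/build.mk.sample in your checkout for the flavours actually available):

cp mk/build.mk.sample mk/build.mk
# then, in mk/build.mk, uncomment the quick flavour:
BuildFlavour = quick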

comment:4 in reply to: ↑ 1 Changed 5 months ago by int-e

Replying to carter:

ghc allocates the stack on the heap, so this may be tripping a corner case somehow...

There are two kinds of stacks. The STG stack (one per Haskell thread) is indeed allocated on the heap, grown dynamically, and its maximum size is governed by the '-K' RTS option (e.g. running a program with '+RTS -K64m'). But each OS thread also has its own C stack, and by default its size is determined by the stack size resource limit: createOSThread() in rts/posix/OSThreads.c uses pthread_create() with no specific thread attributes. Quoting man pthread_create:

On Linux/x86-32, the default stack size for a new thread is 2 megabytes. Under the NPTL threading implementation, if the RLIMIT_STACK soft resource limit at the time the program started has any value other than "unlimited", then it determines the default stack size of new threads.

Other implementations may differ.

I'm not sure whether or how ghc should deal with this. In some circumstances a large C stack is required (depending on the foreign libraries used), and currently the soft stack resource limit is the knob to turn in that case. A minimal C sketch of the mechanism follows.
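The sketch below illustrates the man-page behaviour quoted above; it is not GHC's actual code. With default attributes the new thread's stack size follows the RLIMIT_STACK soft limit (so a huge 'ulimit -s' makes every OS thread reserve that much address space, which is what collides with 'ulimit -v'), whereas pthread_attr_setstacksize() pins an explicit size regardless of the limit.

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

static void *worker(void *arg) { (void) arg; return NULL; }

int main(void)
{
    struct rlimit rl;
    getrlimit(RLIMIT_STACK, &rl);
    printf("RLIMIT_STACK soft limit: %ld bytes\n", (long) rl.rlim_cur);

    /* Default attributes: on NPTL the new thread's stack size follows
     * the RLIMIT_STACK soft limit, mirroring the no-attributes
     * pthread_create() call described above. */
    pthread_t t1;
    int rc = pthread_create(&t1, NULL, worker, NULL);
    if (rc != 0)
        fprintf(stderr, "pthread_create (default attrs): %s\n", strerror(rc));
    else
        pthread_join(t1, NULL);

    /* Explicit attributes: pthread_attr_setstacksize() pins the stack
     * size regardless of the resource limit. */
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 2 * 1024 * 1024);  /* 2 MiB */
    pthread_t t2;
    rc = pthread_create(&t2, &attr, worker, NULL);
    if (rc != 0)
        fprintf(stderr, "pthread_create (2 MiB stack): %s\n", strerror(rc));
    else
        pthread_join(t2, NULL);
    pthread_attr_destroy(&attr);

    return 0;
}

Compile with 'gcc -pthread demo.c -o demo' and run it under the 'ulimit -s' / 'ulimit -v' combination from the description; the default-attribute pthread_create may then fail with "Cannot allocate memory", the same error ghc reports.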
