Test codeGen/should_compile/massive_array failing on 32-bits
|Reported by:||simonmar||Owned by:||simonmar|
|Type of failure:||None/Unknown||Test Case:|
ezyang identified this problem with -fnew-codegen a while ago and made a test for it: codeGen/should_compile/massive_array. This test fails in the following way on 32-bit x86 platforms:
ghc-stage2: panic! (the 'impossible' happened)
  (GHC version 7.7.20120903 for i386-unknown-linux):
    RegAllocLinear.getStackSlotFor: out of stack slots
    If you are trying to compile SHA1.hs from the crypto library then this
    is a known limitation in the linear allocator.
    Try enabling the graph colouring allocator with -fregs-graph instead.
    You can still file a bug report if you like.
This happens because the code compiles to a long sequence of heap allocations like these:
    I32[Hp - 19996] = GHC.Integer.Type.S#_con_info;
    I32[Hp - 19992] = 499;
    _c2mY::I32 = Hp - 19995;
    I32[Hp - 19988] = GHC.Integer.Type.S#_con_info;
    I32[Hp - 19984] = 499;
    _c2mZ::I32 = Hp - 19987;
    I32[Hp - 19980] = (,)_con_info;
    I32[Hp - 19976] = _c2mZ::I32;
    I32[Hp - 19972] = _c2mY::I32;
    _c2n0::I32 = Hp - 19979;
    I32[Hp - 19968] = :_con_info;
    I32[Hp - 19964] = _c2n0::I32;
    I32[Hp - 19960] = GHC.Types._closure+1;
    _c2n1::I32 = Hp - 19966;
    I32[Hp - 19956] = GHC.Integer.Type.S#_con_info;
    I32[Hp - 19952] = 498;
    _c2n2::I32 = Hp - 19955;
    I32[Hp - 19948] = GHC.Integer.Type.S#_con_info;
    I32[Hp - 19944] = 498;
    _c2n3::I32 = Hp - 19947;
Each step in this sequence allocates a pair of S# constructors. With -fPIC we have one fewer register available, so this code spills something at each step in the sequence and eventually runs out of spill slots.
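To illustrate why spilling one temporary per step eventually hits the limit, here is a toy model in Haskell. It is only a sketch, not GHC's allocator: the names `SpillState`, `spill` and `spillSteps` are hypothetical, and the limit of 100 slots is an arbitrary number chosen for the example. The one property it models is that a spilled virtual register gets a fresh slot that is never handed to anyone else.

```haskell
import Control.Monad (foldM)
import qualified Data.Map.Strict as M

-- Hypothetical names throughout; this is a model, not GHC's code.
type VReg = Int
type Slot = Int

data SpillState = SpillState
  { nextSlot :: Slot             -- next never-used slot
  , slotOf   :: M.Map VReg Slot  -- slot assigned to each spilled vreg
  }

emptySpills :: SpillState
emptySpills = SpillState 0 M.empty

-- | Assign a slot to a spilled vreg. A vreg keeps its slot forever, and
-- the slot is never given to a second vreg, even once the first is dead.
spill :: Slot -> SpillState -> VReg -> Either String SpillState
spill limit st v
  | v `M.member` slotOf st = Right st
  | nextSlot st >= limit   = Left "out of stack slots"
  | otherwise              = Right st
      { nextSlot = nextSlot st + 1
      , slotOf   = M.insert v (nextSlot st) (slotOf st)
      }

-- | Spill one fresh temporary per step of the sequence.
spillSteps :: Slot -> Int -> Either String SpillState
spillSteps limit steps = foldM (spill limit) emptySpills [1 .. steps]

main :: IO ()
main = do
  -- Assumed limit of 100 slots; 50 steps fit, 500 steps do not.
  print (fmap nextSlot (spillSteps 100 50))   -- Right 50
  print (fmap nextSlot (spillSteps 100 500))  -- Left "out of stack slots"
```

In this model the number of slots used grows linearly with the number of steps, so any fixed limit is exceeded once the sequence is long enough.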
There are lots of ways to fix this. Workarounds:
- -fcmm-sink or just -O makes the temporaries go away, and so reduces register pressure
- -fregs-graph as suggested by the panic message
Note that this bug is the reason that we have to enable -fcmm-sink in compiler/parser/Parser.y.pp.
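As a sketch of how a workaround can be applied to a single module rather than on the command line, an OPTIONS_GHC pragma can carry either flag. The module below is a hypothetical stand-in: the `table` binding and its contents are made up for illustration and are far shorter than the list in the real test.

```haskell
{-# OPTIONS_GHC -fregs-graph #-}
-- Alternatively, to use the other workaround:
-- {-# OPTIONS_GHC -fcmm-sink #-}
module Main where

-- A short stand-in for a very long literal list of Integer pairs
-- (hypothetical; not the contents of massive_array).
table :: [(Integer, Integer)]
table = [(0, 0), (1, 1), (2, 2), (3, 3)]

main :: IO ()
main = print (length table)  -- prints 4
```

The pragma only affects the module it appears in, so the rest of the build keeps using the default allocator and optimisation settings.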
Possible fixes in the register allocator:
- re-use spill slots. This means tracking which spill slots are free in the register allocator, which is just annoying.
- my favourite: allow the register allocator to exceed the spill slot limit by inserting instructions that adjust %esp at the entry and exit points of the function. This isn't too hard; we already do something similar for the x87 stack.
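The last two ideas can be sketched in Haskell as follows. This is a toy model under stated assumptions, not a patch: `SlotPool`, `acquire`, `release`, `runSteps` and `extraFrameBytes` are hypothetical names, each temporary is assumed to die before the next one is spilled, and the 4-byte slot size and limit of 100 slots are example values only.

```haskell
import qualified Data.Set as S

type Slot = Int

-- Hypothetical pool: slots released when a spilled value dies are recycled.
data SlotPool = SlotPool
  { freeSlots :: S.Set Slot  -- released slots, available for re-use
  , highWater :: Slot        -- number of distinct slots ever handed out
  }

emptyPool :: SlotPool
emptyPool = SlotPool S.empty 0

-- | Take the lowest free slot, or a fresh one if none is free.
acquire :: SlotPool -> (Slot, SlotPool)
acquire pool = case S.minView (freeSlots pool) of
  Just (s, rest) -> (s, pool { freeSlots = rest })
  Nothing        -> (highWater pool, pool { highWater = highWater pool + 1 })

-- | Return a slot once the value spilled into it is dead.
release :: Slot -> SlotPool -> SlotPool
release s pool = pool { freeSlots = S.insert s (freeSlots pool) }

-- | Spill one temporary per step and release it before the next step
-- (an assumption of this model about the temporaries' live ranges).
runSteps :: Int -> SlotPool
runSteps n = iterate step emptyPool !! n
  where
    step p = let (s, p') = acquire p in release s p'

-- | Second idea: the number of bytes by which the stack pointer would be
-- moved at entry (and restored at exit) when a function needs more slots
-- than the fixed spill area holds.
extraFrameBytes :: Int -> Int -> Int -> Int
extraFrameBytes slotBytes limit used = max 0 (used - limit) * slotBytes

main :: IO ()
main = do
  print (highWater (runSteps 500))   -- 1: one slot is re-used throughout
  print (extraFrameBytes 4 100 500)  -- 1600
  print (extraFrameBytes 4 100 50)   -- 0: fits, no adjustment needed
```

With re-use, slot consumption tracks the number of simultaneously live spilled values instead of the length of the sequence. The frame-extension variant needs no liveness tracking but must apply the adjustment at every entry and exit point; details such as stack alignment are not modelled here.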