Function arguments are always spilled/reloaded even if scrutinee is already in WHNF

The code generator unnecessarily spills and reloads function arguments if the scrutinee turns out to be already evaluated (i.e. has non-zero tag bits).

Edit: #8905 (comment 408620) has a decent summary of how things are 7 years later on this issue.

Here's the beginning of a function body, taken from the insert function at https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Base.hs#L303:

c2wQ:  // stack check
    if ((Sp + -72) < SpLim) goto c2wR; else goto c2wS;
c2wR:  // stack check failure
    R1 = PicBaseReg + $wpoly_go_closure;
    I64[Sp - 40] = R2;
    I64[Sp - 32] = R3;
    P64[Sp - 24] = R4;
    I64[Sp - 16] = R5;
    P64[Sp - 8] = R6;
    Sp = Sp - 40;
    call (I64[BaseReg - 8])(R1) args: 48, res: 0, upd: 8;
c2wS:  // stack check success
    I64[Sp - 40] = PicBaseReg + block_c2my_info;  // return addr for eval
    R1 = R6;  // t
    I64[Sp - 32] = R2;  // spill: h
    I64[Sp - 24] = R3;  // spill: k
    P64[Sp - 16] = R4;  // spill: x
    I64[Sp - 8] = R5;  // spill: s
    Sp = Sp - 40;
    if (R1 & 7 != 0) goto c2my; else goto c2mz;  // eval check of t
c2mz:  // eval check failed
    call (I64[R1])(R1) returns to c2my, args: 8, res: 8, upd: 8;  // eval
c2my:  // eval check succeeded
    _s2b1::I64 = I64[Sp + 8];  // reload: h
    _s2b2::I64 = I64[Sp + 16];  // reload: k
    _s2b3::P64 = P64[Sp + 24];  // reload: x
    _s2b4::I64 = I64[Sp + 32];  // reload: s
    switch [0 .. 4] (R1 & 7 - 1) {
        case 0 : goto c2wK;
        case 1 : goto c2wL;
        case 2 : goto c2wM;
        case 3 : goto c2wN;
        case 4 : goto c2wO;
    }

It seems to me that all the spills/reloads could be pushed into the c2mz block.

The c2my block, in its current form, is also reused for the heap check failure case, so the heap check will most likely have to do its own spilling/reloading. However, since the scrutinee lacking tag bits, and hence the eval check failing, is not the common case, the spills/reloads it requires should be kept off the common path.
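
As a rough illustration of the layout suggested above, here is a hand-edited version of the same blocks. This is only a sketch, not compiler output. The label c2fast is made up, the argument names follow the reload comments in the dump, and the slow path is assumed to pop its frame so that both paths rejoin with the entry value of Sp. The blocks after the switch, and the heap check re-entry, would have to be regenerated to match.

```cmm
c2wS:  // stack check success
    R1 = R6;  // t
    if (R1 & 7 != 0) goto c2fast; else goto c2mz;  // eval check of t
c2mz:  // uncommon: t is untagged, so spill the live arguments and enter it
    I64[Sp - 40] = PicBaseReg + block_c2my_info;  // return addr for eval
    I64[Sp - 32] = R2;  // spill: h
    I64[Sp - 24] = R3;  // spill: k
    P64[Sp - 16] = R4;  // spill: x
    I64[Sp - 8] = R5;  // spill: s
    Sp = Sp - 40;
    call (I64[R1])(R1) returns to c2my, args: 8, res: 8, upd: 8;  // eval
c2my:  // return point of the eval: reload, pop the frame, rejoin
    R2 = I64[Sp + 8];  // reload: h
    R3 = I64[Sp + 16];  // reload: k
    R4 = P64[Sp + 24];  // reload: x
    R5 = I64[Sp + 32];  // reload: s
    Sp = Sp + 40;  // assumption: frame is popped so Sp matches the fast path
    goto c2fast;
c2fast:  // hypothetical label: arguments are in R2..R5 on both paths
    switch [0 .. 4] (R1 & 7 - 1) {
        case 0 : goto c2wK;
        case 1 : goto c2wL;
        case 2 : goto c2wM;
        case 3 : goto c2wN;
        case 4 : goto c2wO;
    }
```

With this shape, a tagged scrutinee costs one test and one jump on the common path, with no stack traffic.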

If it matters, the data type is spine-strict, so GHC should have enough information to know that in the common case (e.g. self-recursive calls) the argument is already evaluated (although there might be an indirection in some cases).

Trac metadata

Trac field	Value
Version	7.9
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC	simonmar
Operating system
Architecture

Edited Feb 14, 2022 by Andreas Klebinger
