unnecessary indirect jump when returning a case scrutinee
I happened to be looking at the Cmm for this code (ghc 7.8.3, -O2)
f :: Int -> Int
f x = if x < 0 then x else x+1
and I noticed something a bit funny about it:
c12e:
if ((Sp + -8) < SpLim) goto c12z; else goto c12A;
c12z:
R2 = R2;
R1 = Test.f_closure;
call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
c12A:
I64[Sp - 8] = c12b;
R1 = R2;
Sp = Sp - 8;
if (R1 & 7 != 0) goto c12b; else goto c12c;
c12c:
call (I64[R1])(R1) returns to c12b, args: 8, res: 8, upd: 8;
c12b:
Hp = Hp + 16;
if (Hp > HpLim) goto c12y; else goto c12x;
c12y:
HpAlloc = 16;
R1 = R1;
call stg_gc_unpt_r1(R1) returns to c12b, args: 8, res: 8, upd: 8;
c12x:
_s11Q::I64 = I64[R1 + 7];
if (%MO_S_Lt_W64(_s11Q::I64, 0)) goto c12u; else goto c12v;
c12u:
Hp = Hp - 16;
R1 = R1 & (-8); /* <--- */
Sp = Sp + 8;
call (I64[R1])(R1) args: 8, res: 0, upd: 8; /* <--- */
c12v:
I64[Hp - 8] = GHC.Types.I#_con_info;
I64[Hp] = _s11Q::I64 + 1;
R1 = Hp - 7;
Sp = Sp + 8;
call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
On the two marked lines, we untag R1 (which is x
) and enter it. However, we know at this point that x
is already in WHNF so we could simply return it by replacing the two lines with call (P64[Sp])(R1)
, if I'm not mistaken. That will save a load and an indirect jump (which we actually know is to I#_con_info
, which would just retag R1 and return to the address on the stack anyways).
I think the same optimization should be available any time we do an algebraic case
and in a branch simply return the scrutinee.
I looked at what it would take to fix this. It looks almost easy: if we add a new LambdaFormInfo
constructor LFUnknownCon
meaning that we know the identifier is bound to a saturated application of an unknown constructor, then we could set the cg_lf
of the case binder variable of an algebraic case statement to LFUnknownCon
, and return ReturnIt
for LFUnknownCon
variables in getCallMethod
. I think that would do it. Does that sound right? Is there a better way?
(In my original example we actually know the constructor has to be I#
. But if the case was on a type with more than one constructor we wouldn't know statically which one we got, just that it has to be one of them.)
Trac metadata
Trac field | Value |
---|---|
Version | 7.8.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler (CodeGen) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | simonmar |
Operating system | |
Architecture |