Add dead store elimination
|Reported by:||tibbe||Owned by:|
|Type of failure:||Runtime performance bug||Test Case:|
|Related Tickets:||Differential Rev(s):|
We could use some dead store elimination in the code generator. Here's some Cmm that has redundant stores to the same locations:
    // thawArray#:
    I64[Hp - 168] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
    I64[Hp - 160] = 16;
    I64[Hp - 152] = 17;
    _c2nT::I64 = Hp - 168;
    call MO_Memcpy(_c2nT::I64 + 24, _s2cx::P64 + 24, 128, 8);

    // writeArray#:
    P64[(_c2nT::I64 + 24) + (_s2cy::I64 << 3)] = _s2cE::P64;
    I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
    I8[(_c2nT::I64 + 24) + ((I64[_c2nT::I64 + 8] << 3) + (_s2cy::I64 >> 7))] = 1 :: W8;

    // unsafeFreeze#
    I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_FROZEN0_info@GOTPCREL];
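There are three stores to the same location: the first is written as I64[Hp - 168] and the other two as I64[_c2nT::I64], where _c2nT::I64 = Hp - 168. Only the last of the three (the FROZEN0 info pointer) needs to survive.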
(There's also one much less obvious double store to another location, which will probably be much harder to address: the store to P64[(_c2nT::I64 + 24) + (_s2cy::I64 << 3)] overwrites a word previously written by the MO_Memcpy. Getting to that one will be hard as the memcpy callish MachOp only gets expanded in the backend.)
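To make the request concrete, here is a minimal sketch of dead store elimination over a single basic block, written in Python against a toy model rather than GHC's actual Cmm data types. Everything in it is hypothetical: the `Store` and `Call` classes, the `eliminate_dead_stores` function, and the string address keys (which assume addresses have already been normalised, so that `_c2nT::I64` and `Hp - 168` share a key, and that distinct keys never overlap in memory). A call is treated as a barrier that may read any location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """Toy store: write `value` to the location named by `addr`."""
    addr: str                       # normalised address key (assumption)
    value: str                      # stored expression, kept only for display
    loads: frozenset = frozenset()  # address keys read while computing `value`


@dataclass(frozen=True)
class Call:
    """Toy opaque call: assumed to read and write arbitrary memory."""
    name: str


def eliminate_dead_stores(block):
    """Drop stores that are overwritten before anything can read them.

    Scans the block backwards, tracking the set of locations that are
    certain to be overwritten later with no intervening read.
    """
    overwritten = set()
    kept = []
    for instr in reversed(block):
        if isinstance(instr, Store):
            if instr.addr in overwritten:
                continue  # dead: a later store clobbers it unread
            overwritten.add(instr.addr)
            # Loads happen before the write, so they make locations live again.
            overwritten -= instr.loads
        else:
            # Conservative: an opaque call may read every location.
            overwritten.clear()
        kept.append(instr)
    kept.reverse()
    return kept


# Hand-normalised model of the Cmm above (_c2nT = Hp - 168).
block = [
    Store("Hp-168", "stg_MUT_ARR_PTRS_DIRTY_info", frozenset({"got:DIRTY"})),
    Store("Hp-160", "16"),
    Store("Hp-152", "17"),
    Call("MO_Memcpy"),
    Store("elem[y]", "_s2cE"),
    Store("Hp-168", "stg_MUT_ARR_PTRS_DIRTY_info", frozenset({"got:DIRTY"})),
    Store("card[y]", "1", frozenset({"Hp-160"})),
    Store("Hp-168", "stg_MUT_ARR_PTRS_FROZEN0_info", frozenset({"got:FROZEN0"})),
]

optimised = eliminate_dead_stores(block)
```

Under these assumptions the sketch removes only the writeArray# header store. The thawArray# header store stays, because the opaque call clears the tracked set; removing it as well would need the pass to know that the memcpy does not read I64[Hp - 168].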