avoid redundant stores to the stack when examining already-tagged data
GHC compiles a function that performs case analysis on a value of an ADT like
```haskell
bool :: a -> a -> Bool -> a
bool f t b = case b of
  False -> f
  True  -> t
```
to Cmm of the form
```
{offset
  cwV:
      if ((Sp + -24) < SpLim) goto cwW; else goto cwX;
  cwW:
      R4 = R4;
      R3 = R3;
      R2 = R2;
      R1 = Bool.bool_closure;
      call (stg_gc_fun)(R4, R3, R2, R1) args: 8, res: 0, upd: 8;
  cwX:
      I64[Sp - 24] = cwL;    -- (*)
      R1 = R4;
      P64[Sp - 16] = R2;     -- (†1)
      P64[Sp - 8] = R3;      -- (†2)
      Sp = Sp - 24;          -- (‡)
      if (R1 & 7 != 0) goto cwL; else goto cwM;
  cwM:
      call (I64[R1])(R1) returns to cwL, args: 8, res: 8, upd: 8;
  cwL:
      if (R1 & 7 >= 2) goto cwT; else goto cwU;
  cwT:
      R1 = P64[Sp + 16];
      Sp = Sp + 24;
      call stg_ap_0_fast(R1) args: 8, res: 0, upd: 8;
  cwU:
      R1 = P64[Sp + 8];
      Sp = Sp + 24;
      call stg_ap_0_fast(R1) args: 8, res: 0, upd: 8;
}
```
Statement (*) stores a return address for the evaluation of `b` to return to, and statements (†1), (†2) save local variables that are live in the case alternatives, since they cannot be held in registers across the evaluation of `b`. But in the event that `b` is already evaluated and represented by a tagged pointer, all these stores are unnecessary: the return address written by (*) is simply dead, and the values saved by (†1), (†2) are still available in whatever locations they were copied to the stack from.
In many cases the data we examine is mostly tagged, and while the active part of the stack is likely to be in L1 cache, the cost of these stores and reads is probably still positive (barring secondary effects from changes to pipelining, branch prediction, and so on).
In this case we could certainly move the return-address store (*) into block `cwM`, and possibly move the local-variable stores (†1), (†2) into `cwM` as well, though it's then not clear to me how to recover the values in the alternatives (does Cmm have something like phi nodes?). I don't propose moving statement (‡), since arithmetic on registers is essentially free anyway.
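For concreteness, here is a hand-written sketch (not compiler output) of what `cwX` and `cwM` could look like with just the return-address store (*) sunk into `cwM`; since the store would now happen after the Sp adjustment (‡), its offset changes from `Sp - 24` to `Sp`:

```
  cwX:
      R1 = R4;
      P64[Sp - 16] = R2;     -- (†1) still stored unconditionally in this sketch
      P64[Sp - 8] = R3;      -- (†2)
      Sp = Sp - 24;          -- (‡)
      if (R1 & 7 != 0) goto cwL; else goto cwM;
  cwM:
      I64[Sp] = cwL;         -- (*) moved: only written when b must be evaluated
      call (I64[R1])(R1) returns to cwL, args: 8, res: 8, upd: 8;
```

On the tagged path the return-address slot is simply never written; the frame layout and the reads at `Sp + 8` and `Sp + 16` in the alternatives are unchanged.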
I tried implementing the part of this pertaining to the return address (*) and ran into two complications.
- For some reason, when I moved the return-address store (*) into the "data is not tagged" branch in the Stg->Cmm translation, this also resulted in both the local-variable stores (†1), (†2) and the update to Sp (‡) being sunk into both branches of the "is the data tagged" conditional at some point in the Cmm optimization pipeline (roughly the shape shown in the first sketch after this list). This was useless, since they couldn't be pushed further past the branch on the returned tag value, so the result was enlarged code size that outweighed the savings of avoiding a single store. I didn't investigate exactly why this sinking depended on the location of the store (*), but it should be fixable.
- There may be heap checks in the alternatives. In that case, the code generator currently cleverly reuses the stack frame and info table set up for the evaluation of `b` in the heap-failure branches (roughly as in the second sketch after this list). If we move some of the stores (*), (†1), (†2) into the evaluation branch `cwM`, then we either have to duplicate them in the heap-failure branches, set up a new stack frame and info table, or do some other clever thing. Or, in the worst case, only do this optimization when performing the heap check before the case (which may then become slightly more attractive).
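To illustrate the first complication, this is roughly the shape I believe the sinking produced (a hand-written sketch, not a dump I've re-checked; the label `cwN` is invented here):

```
  cwX:
      R1 = R4;
      if (R1 & 7 != 0) goto cwN; else goto cwM;
  cwM:
      I64[Sp - 24] = cwL;    -- (*) as placed by the patch
      P64[Sp - 16] = R2;     -- (†1) sunk into this branch
      P64[Sp - 8] = R3;      -- (†2) sunk into this branch
      Sp = Sp - 24;          -- (‡) sunk into this branch
      call (I64[R1])(R1) returns to cwL, args: 8, res: 8, upd: 8;
  cwN:                       -- invented label for the tagged path
      P64[Sp - 16] = R2;     -- (†1) duplicated here as well
      P64[Sp - 8] = R3;      -- (†2) duplicated
      Sp = Sp - 24;          -- (‡) duplicated
      goto cwL;
```

The stores still have to happen on both paths because `cwL` reads the saved values from the stack, so the duplication buys nothing.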
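And to illustrate the second complication: if an alternative allocated (the `bool` example doesn't, so this is hypothetical), its heap-check failure branch would currently reuse the frame laid down in `cwX`, along the lines of the following sketch (labels invented; the exact GC entry point may differ):

```
  cwT:                       -- hypothetical: an alternative that allocates 16 bytes
      Hp = Hp + 16;
      if (Hp > HpLim) goto cwHF; else goto cwOK;
  cwHF:
      HpAlloc = 16;
      -- Relies on the return address (*) and the saves (†1), (†2) already being
      -- on the stack, so after GC we return through cwL's info table and re-run
      -- the alternative.  If those stores only happen on the "not tagged" path,
      -- this frame is incomplete when we arrive here via the tagged fast path.
      call stg_gc_unpt_r1(R1) returns to cwL, args: 8, res: 8, upd: 8;
```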
I'm attaching the current version of my patch mainly for my own future reference; it seems to produce correct, but larger and marginally slower code, I believe for the reasons described above.
Trac metadata
| Trac field | Value |
| --- | --- |
| Version | 7.11 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (CodeGen) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | simonmar |
| Operating system | |
| Architecture | |