Make the Register Allocator Loop-Aware
Currently the graph coloring register allocator has 2 spill cost functions: one that implements Chaitin’s spill cost function and one that just spills the longest live range. We currently use the latter. Neither of them have any knowledge of program structure’ the register allocator basically assumes all code runs an equal number of times and therefore it doesn’t matter where you insert spills. By making it aware of loops and biasing it to avoid spilling in them when possible loops can run significantly faster.
Currently if we have virtual registers born before a loop that are used in a large number of instructions before and after the loop but not in it we’ll avoid spilling them because it looks like we’ll have to reload them right away. In reality we won’t use them until the end of the loop which might be an order of magnitude of clock cycles later. So we’ll spill something else inside the loop which means every time the loop runs we run spill code.
Obviously we don’t actually write loops in Haskell but we definitely still have them underneath our abstractions. Since we don’t write them explicitly we have to find them in our output (which is what the bulk of LoopAnnotations.hs does). A loop is just a place where control flow doubles back on itself, so we walk through the jumps (we already know where these are because the output has been divided into basic blocks) and try to find places where that happens.
The goal is to have more intelligently placed spills and reloads instead of simply minimizing the number of them.