Changes between Version 7 and Version 8 of Commentary/Compiler/Backends/NCG/RegisterAllocator


Ignore:
Timestamp:
Sep 18, 2007 1:25:11 PM (7 years ago)
Author:
guest
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Compiler/Backends/NCG/RegisterAllocator

    v7 v8  
    1010   The linear allocator is turned on by default. This is what you get when you compile with {{{-fasm}}}. The linear allocator does a single pass through the code, allocating registers on a first-come-first-served basis. It is quick, and does a reasonable job for code with little register pressure.  
    1111 
    12 This algorithm has no look-ahead. If say, a particular hreg will be clobbered by a function call, it does not know to avoid allocating to it in the code before the call, and subsequently inserts more spill/reload instructions than strictly needed. 
     12  This algorithm has no look-ahead. If say, a particular hreg will be clobbered by a function call, it does not know to avoid allocating to it in the code before the call, and subsequently inserts more spill/reload instructions than strictly needed. 
    1313 
    1414 * '''Graph coloring''' (enabled with {{{-fregs-graph}}})[[BR]] 
    1515   The graph coloring algorithm operates on the code for a whole function at a time. From each function it extracts a register conflict graph which has a node for every vreg and an edge between two vregs if they are in use at the same time and thus cannot share the same hreg. The algorithm tries to assign hregs (represented as colors) to the nodes so that no two adjacent nodes share the same color, if it can't then it inserts spill code, rebuilds the graph and tries again.  
    1616 
    17 Graph coloring tends to do better than the linear allocator because the conflict graph helps it avoid the look-ahead problem. The coloring allocator also tries harder to allocate the source and destination of reg-to-reg move instructions to the same hreg. This is done by coalescing (merging) move-related nodes. If this succeeds then the moves can be erased. 
     17  Graph coloring tends to do better than the linear allocator because the conflict graph helps it avoid the look-ahead problem. The coloring allocator also tries harder to allocate the source and destination of reg-to-reg move instructions to the same hreg. This is done by coalescing (merging) move-related nodes. If this succeeds then the moves can be erased. 
    1818 
    1919 * '''Graph coloring with iterative coalescing''' (enabled with {{{-fregs-iterative}}})[[BR]] 
     
    4343== Hacking/Debugging == 
    4444 
    45 '''{{{-fasm-lint}}}'''[[BR]] 
    46 Breaking the allocator can result in compiled programs crashing randomly (if you're lucky) or producing the wrong output. 
     45 * '''Turn on {{{-fasm-lint}}}'''[[BR]] 
     46   Breaking the allocator can result in compiled programs crashing randomly (if you're lucky) or producing the wrong output. Make sure to always turn on {{{-fasm-lint}}}. Doing this makes the allocator call {{{GraphOps.validateGraph}}} after every spill/color stage. {{{validateGraph}}} checks that all the edges point to valid nodes, that no conflicting nodes have the same color, and if the graph is supposed to be colored then all nodes are really colored. 
    4747 
    48 When working on the allocator, make sure to always turn on {{{-fasm-lint}}}. Doing this makes the allocator call {{{GraphOps.validateGraph}}} after every spill/color stage. {{{validateGraph}}} checks that all the edges point to valid nodes, that no conflicting nodes have the same color, and if the graph is supposed to be colored then all nodes are really colored. 
     48 * '''Some useful dump flags''' 
    4949 
    50 The main dump flags are 
     50  {{{-ddump-asm-regalloc-stages}}}[[BR]] 
     51  Shows the code and conflict graph after ever spill/color stage. Also shows spill costs, and what registers were coalesced. 
    5152 
    52  {{{-ddump-asm-regalloc-stages}}}[[BR]] 
    53  Shows the code and conflict graph after ever spill/color stage. Also shows spill costs, and what registers were coalesced. 
     53  {{{-ddump-asm-stats}}}[[BR]] 
     54  Gives statistics about how many spills/reloads/reg-reg-moves are in the output program. 
    5455 
    55  {{{-ddump-asm-stats}}}[[BR]] 
    56  Gives statistics about how many spills/reloads/reg-reg-moves are in the output program. 
     56  {{{-ddump-asm}}}[[BR]] 
     57  Gives the final output code.  
    5758 
    58  {{{-ddump-asm}}}[[BR]] 
    59  Gives the final output code.  
     59  {{{-ddump-to-file}}}[[BR]] 
     60  Diverts dump output to files. This can be used to get dumps from each module in a nofib benchmark. 
     61  
     62  {{{ 
     63  cd nofib/real/anna 
     64  make EXTRA_HC_OPTS="-O2 -fregs-iterative -ddump-to-file -ddump-asm-regalloc-stages" 
     65  }}} 
    6066 
    61  {{{-ddump-to-file}}}[[BR]] 
    62  Diverts dump outputs to files. This can be used to get dumps from each module in a nofib benchmark. 
    63  
    64  Compile eg 
    65   
    66  cd nofib/real/anna 
    67  
    68  make EXTRA_HC_OPTS="-O2 -fregs-iterative -ddump-to-file -ddump-asm-regalloc-stages" 
     67 * '''Register pressure in Haskell code'''[[BR]] 
     68   Most GHC compiled code has very little register pressure and only a few spill/reload instructions need to be inserted - many modules need none at all. This is a mixed blessing - on one hand the conflict graphs are small so we don't have too many performance problems related to how the graph is represented, on the other hand it can be hard to find code to test against. 
    6969 
    7070