Changes between Version 9 and Version 10 of Commentary/Compiler/Backends/NCG/RegisterAllocator


Ignore:
Timestamp:
Sep 18, 2007 2:58:46 PM (8 years ago)
Author:
guest
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Compiler/Backends/NCG/RegisterAllocator

    v9 v10  
    6565  }}}
    6666
    67  * '''Register pressure in Haskell code'''[[BR]]
    68    Most GHC compiled code has very little register pressure and only a few spill/reload instructions need to be inserted, many modules need none at all. This is a mixed blessing - on one hand the conflict graphs are small so we don't have too many performance problems related to how the graph is represented, on the other hand it can be hard to find code to test against.
     67 * Graphviz
    6968
    70   Only a few nofib benchmarks create spills with {{{-O2}}}, two are {{{spectral/hartel/genfft}}} and {{{spectral/sorting}}}.
     69 * checkSpills
    7170
    72   Register pressure increases significantly when profiling is turned on.
    73  
     71== Register pressure in Haskell code ==
    7472
     73Present GHC compiled code places very little pressure on the register set. Even on x86 with only 3 allocable registers, most modules do not need spill/reloads. This is a mixed blessing - on one hand the conflict graphs are small so we can avoid performance problems related to how the graph is represented, on the other hand it can be hard to find code to test against. Register pressure is expected to increase as the Stg->Cmm transform improves.
     74
     75In the meantime, here are some good sources for test code:
     76
     77 * '''Nofib'''[[BR]]
     78   Only a few nofib benchmarks create spills with {{{-O2}}}, two are {{{spectral/hartel/genfft}}} and {{{spectral/sorting}}}.
     79
     80 * '''Turn on profiling.'''[[BR]]
     81   Register pressure increases significantly when the module is compiled with profiling. [attachment:checkSpills.report] gives tuples of {{{(spills, reloads, reg-reg-moves)}}} present in output code generated by the three algorithms when compiled with {{{-O2 -prof}}}. Left to right are the stats for the linear, graph coloring and iterative coalescing algorithms. Note that most modules compile with no spill/reloads inserted, but a few (notably {{{real/compress2/Encode}}}) need several hundred.
     82
     83  I've found it useful to maintain three darcs repos when working on the allocator. {{{ghc-HEAD-work}}} compiled with {{{-Onot}}} for fast compilation during hacking, {{{ghc-HEAD-prof}}} for testing with profiling turned on, and {{{ghc-HEAD-validate}}} for running the validate script. Patches are created in {{{work}}}, pushed into {{{prof}}} where {{{checkSpills}}} is used to compile the nofib benchmarks with the most register pressure. Once we're happy that the performance is ok, the patch is then pushed into {{{validate}}} for validation before pushing to the main repo on {{{darcs.haskell.org}}}
     84
     85 * '''SHA from darcs'''[[BR]]
     86   The {{{SHA1.lhs}}} module from the darcs source, compiled with {{{-O2}}} creates the most register pressure out of any Haskell code that I'm aware of. When compiling SHA1, GHC inlines several worker functions and the native code block that computes the hash ends up being around 1700 instructions long. vregs that live in the middle of the block have in the order of 30 conflict neighbors. (evidently, the conflict graph is too large for most of the graphviz layout algorithms to cope with)
     87
     88  For these reasons, {{{SHA1.lhs}}} can be treated as a good worst-case input to the allocator. In fact, the current linear allocator cannot compile it with {{{-O2 -prof}}} on x86 as it runs out of stack slots, which are allocated from a static pool. Make sure to test any changes to the allocator against this module.
    7589
    7690== Possible Improvements ==
    7791
    78  * work lists
     92These are some ideas for improving the current allocator, most usful first.
    7993
    80  * spill code
     94 * Work lists for iterative coalescing.
    8195
    82  * spill candidates
     96 * Improve spill code cleaner.
    8397
    84  * aliasing register sets
     98 * Revisit choosing of spill candidates.
     99
     100 * Revisit trivColorable / aliasing of register sets.
    85101
    86102