Changes between Version 23 and Version 24 of Commentary/Compiler/CodeGen


Ignore:
Timestamp:
Feb 1, 2014 3:51:46 PM (15 months ago)
Author:
jstolarek
Comment:

A whole new description of code generator (as implemented by GHC 7.8)

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Compiler/CodeGen

    v23 v24  
     1= Code Generator = 
    12 
     3This page describes code generator ("codegen") in GHC. It is meant to reflect current state of the implementation. If you notice any inacurracies please update the page (if you know how) or complain on ghc-devs. 
    24 
    3 = GHC Commentary: The Code Generator = 
     5== A brief history of code generator == 
    46 
    5 The material below describes the old code generator, which was retired in 2012.  New stuff is here: 
    6   * [wiki:Commentary/Compiler/CPS Michael Adams CPS conversion] 
    7   * [wiki:Commentary/Compiler/NewCodeGen New code generator] 
     7You might ocasionally hear about "old" and "new" code generator. GHC 7.6 and earlier used the old code generator. New code generator was being developed since 2007 and it was [changeset:832077ca5393d298324cb6b0a2cb501e27209768/ghc enabled by default on 31 August 2012] after the release of GHC 7.6.1. The first stable GHC to use the new code generator is 7.8.1 released in early 2014. The commentary on the old code generator can be found [wiki:Commentary/Compiler/OldCodeGen here]. Notes from the development process of the new code generator are located in a couple of pages on the wiki - go to [wiki:TitleIndex Index] and look for pages starting with "!NewCodeGen". 
    88 
    9 The code generator lives in [[GhcFile(compiler/codeGen)]] 
     9There are some plans for the future development of code generator. One plan is to expand the capability of the pipeline so that it does native code generation too so that existing backends can be discarded - see [wiki:Commentary/Compiler/IntegratedCodeGen IntegratedCodeGen] for discussion of the design. It is hard to say if this will ever happen as currently there is no work being done on that subject and in the meanwhile there was an alternative proposal to [wiki:Commentary/Compiler/Backends/LLVM/ReplacingNCG replace native code generator with LLVM]. 
    1010 
    11 The stuff below is probably out of date, although there were some similarities between the old and new code-generator. 
     11== Overview == 
    1212 
     13The goal of the code generator is to convert program from [wiki:Commentary/Compiler/GeneratedCode STG] representation to [wiki:Commentary/Compiler/CmmType Cmm] representation. STG is a functional language with explicit stack. Cmm is a low-level imperative language - something between C and assembly - that is suitable for machine code generation. Note that terminology might be a bit confusing here: the term "code generator" refers to STG->Cmm pass but it may also be used to refer to the Cmm->assembly pass. Referring to the latter as the "backend" eliminates the confusion. 
    1314 
    14 == Storage manager representations == 
     15The top-most entry point to the codegen is located in [[GhcFile(compiler/main/HscMain.hs)]] in the `tryNewCodegen` function. Code generation now has three stages: 
     16  1. Convert STG to Cmm with implicit stack, and native Cmm calls. This whole stage lives in [[GhcFile(compiler/codeGen)]] directory with the entry point being `codeGen` function in [[GhcFile(compiler/codeGen/StgCmm.hs)]] module. 
     17  2. Optimise the Cmm, and CPS-convert it to have an explicit stack, and no native calls. This lives in [[GhcFile(compiler/cmm)]] directory with the `cmmPipeline` function from [[GhcFile(compiler/cmm/CmmPipeline.hs)]] module being the entry point. This is described below in full detail. 
     18  3. Feed the CPS-converted Cmm to the existing, unmodified native code generators. This is done by `codeOutput` function ([[GhcFile(compiler/main/CodeOutput.lhs)]] called from `hscGenHardCode` after returning from `tryNewCodegen`. 
    1519 
    16 See [wiki:Commentary/Rts/Storage The Storage Manager] for the [wiki:Commentary/Rts/Storage/Stack Layout of the stack]. 
     20== Structure of Cmm pipeline == 
    1721 
    18 The code generator needs to know the layout of heap objects, because it generates code that accesses and constructs those heap objects.  The runtime also needs to know about the layout of heap objects, because it contains the garbage collector.  How can we share the definition of storage layout such that the code generator and the runtime both have access to it, and so that we don't have to keep two independent definitions in sync? 
     22The first two steps are described in more detail here: 
    1923 
    20 Currently we solve the problem this way: 
     24 * '''Code generator''' converts STG to `CmmGraph`.  Implemented in `StgCmm*` modules (in directory `codeGen`).  
     25   * `Cmm.CmmGraph` is pretty much a Hoopl graph of `CmmNode.CmmNode` nodes. Control transfer instructions are always the last node of a basic block. 
     26   * Parameter passing is made explicit; the calling convention depends on the target architecture.  The key function is `CmmCallConv.assignArgumentsPos`.  
     27     * Parameters are passed in virtual registers R1, R2 etc. [These map 1-1 to real registers.]  
     28     * Overflow parameters are passed on the stack using explicit memory stores, to locations described abstractly using the [wiki:Commentary/Compiler/StackAreas ''Stack Area'' abstraction.].    
     29     * Making the calling convention explicit includes an explicit store instruction of the return address, which is stored explicitly on the stack in the same way as overflow parameters. This is done (obscurely) in `MkGraph.mkCall`. 
    2130 
    22  * C types representing heap objects are defined in the C header files, see for example [[GhcFile(includes/rts/storage/Closures.h)]]. 
     31 * '''Simple control flow optimisation''', implemented in `CmmContFlowOpt`.  It's called both at the beginning and end of the pipeline. 
     32   * Branch chain elimination. 
     33   * Remove unreachable blocks. 
     34   * Block concatenation.  branch to K; and this is the only use of K.   
    2335 
    24  * A C program, [[GhcFile(includes/mkDerivedConstants.c)]],  `#includes` the runtime headers. 
    25    This program is built and run when you type `make` or `make boot` in `includes/`.  It is 
    26    run twice: once to generate `includes/DerivedConstants.h`, and again to generate  
    27    `includes/GHCConstants.h`. 
     36 * '''More control flow optimisations'''. 
     37   * Common Block Elimination (like CSE). This essentially implements the Adams optimisation, we believe. 
     38   * Consider (sometime): block duplication.  branch to K; and K is a short block.  Branch chain elimination is just a special case of this. 
    2839 
    29  * The file `DerivedConstants.h` contains lots of `#defines` like this: 
    30 {{{ 
    31 #define OFFSET_StgTSO_why_blocked 18 
    32 }}} 
    33    which says that the offset to the why_blocked field of an `StgTSO` is 18 bytes.  This file 
    34    is `#included` into [[GhcFile(includes/Cmm.h)]], so these offests are available to the 
    35    [wiki:Commentary/Rts/Cmm hand-written .cmm files]. 
     40 * '''Proc-point analysis''' and '''transformation''', implemented in `CmmProcPoint`. The transformation part adds a function prologue to the front of each proc-point, following a standard entry convention. 
     41    * The analysis produces a set of `BlockId` that should become proc-points 
     42    * The transformation inserts a function prologue at the start of each proc-point, and a function epilogue just before each branch to a proc-point. 
    3643 
    37  * The file `GHCConstants.h` contains similar definitions: 
    38 {{{ 
    39 oFFSET_StgTSO_why_blocked = 18::Int 
    40 }}} 
    41   This time the definitions are in Haskell syntax, and this file is `#included` directly into 
    42   [[GhcFile(compiler/main/Constants.lhs)]].  This is the way that these offsets are made 
    43   available to GHC's code generator. 
     44 * '''Remove dead assignments and stores''', implemented in `CmmLive`, removes assignments to dead variables and things like ``a = a`` or ``I32[Hp] = I32[Hp]``. The latter may more appropriately be done in a general optimization pass, as it doesn't take advantage of liveness information. 
    4445 
    45 == Generated Cmm Naming Convention == 
     46 * '''Figure out the stack layout''', implemented in `CmmStackLayout`. 
     47   * Each variable 'x', and each proc-point label 'K', has an associated ''Area'', written SS(x) and SS(k) resp, that names a contiguous portion of the stack frame.   
     48   * The stack layout pass produces a mapping of: ''(`Area` -> `StackOffset`)''. For more detail, see [wiki:Commentary/Compiler/StackAreas#Layingoutthestack the description of stack layout.] 
     49   * A `StackOffset` is the byte offset of a stack slot from the old end (high address) of the frame.  It doesn't vary as the physical stack pointer moves. 
    4650 
    47 See [[GhcFile(compiler/cmm/CLabel.hs)]] 
     51 * '''Manifest the stack pointer''', implemented in `CmmStackLayout`.  Once the stack layout mapping has been determined, a second pass walks over the graph, making the stack pointer, `Sp` explicit. Before this pass, there is no `Sp` at all.  After this, `Sp` is completely manifest. 
     52   * replacing references to `Areas` with offsets from `Sp`. 
     53   * adding adjustments to `Sp`. 
     54  
     55 * '''Split into multiple !CmmProcs''', implemented in `CmmProcPointZ`.  At this point we build an info-table for each of the !CmmProcs, including SRTs.  Done on the basis of the live local variables (by now mapped to stack slots) and live CAF statics. 
     56   * `LastCall` and `LastReturn` nodes are replaced by `Jump`s. 
    4857 
    49 Labels generated by the code generator are of the form {{{<name>_<type>}}} 
    50 where {{{<name>}}} is {{{<Module>_<name>}}} for external names and {{{<unique>}}} for 
    51 internal names. {{{<type>}}} is one of the following: 
     58 * '''Build info tables''', implemented in `CmmBuildInfoTables`..   
     59   * Find each safe `MidForeignCall` node, "lowers" it into the suspend/call/resume sequence (see `Note [Foreign calls]` in `CmmNode.hs`.), and build an info table for them. 
     60   * Convert the `CmmInfo` for each `CmmProc` into a `[CmmStatic]`, using the live variable information computed just before "Figure out stack layout".   
    5261 
    53   info::                   Info table 
    54   srt::                    Static reference table 
    55   srtd::                   Static reference table descriptor 
    56   entry::                  Entry code (function, closure) 
    57   slow::                   Slow entry code (if any) 
    58   ret::                    Direct return address     
    59   vtbl::                   Vector table 
    60   ''n''_alt::              Case alternative (tag ''n'') 
    61   dflt::                   Default case alternative 
    62   btm::                    Large bitmap vector 
    63   closure::                Static closure 
    64   con_entry::              Dynamic Constructor entry code 
    65   con_info::               Dynamic Constructor info table 
    66   static_entry::           Static Constructor entry code 
    67   static_info::            Static Constructor info table 
    68   sel_info::               Selector info table 
    69   sel_entry::              Selector entry code 
    70   cc::                     Cost centre 
    71   ccs::                    Cost centre stack 
     62== Dumping Cmm == 
    7263 
    73 Many of these distinctions are only for documentation reasons.  For 
    74 example, _ret is only distinguished from _entry to make it easy to 
    75 tell whether a code fragment is a return point or a closure/function 
    76 entry. 
     64== Hoopl == 
    7765 
    78 == Modules == 
    79 === {{{CodeGen}}} === 
    80 Top level, only exports {{{codeGen}}}. 
    81  
    82 Called from {{{HscMain}}} for each module that needs to be converted from Stg to Cmm. 
    83  
    84 For each such module {{{codeGen}}} does three things: 
    85  * {{{cgTopBinding}}} for the {{{StgBinding}}} 
    86  * {{{cgTyCon}}} for the {{{TyCon}}} (These are constructors not constructor calls). 
    87  * {{{mkModuleInit}}} for the module 
    88  
    89 {{{mkModuleInit}}} generates several boilerplate initialization functions 
    90 that: 
    91  * regiser the module, 
    92  * creates an Hpc table, 
    93  * setup its profiling info ({{{InitConstCentres}}}, code coverage info {{{initHpc}}}), and 
    94  * calls the initialization functions of the modules it imports. 
    95  
    96 If neither SCC profiling or HPC are used, 
    97 then the initialization code short circuits to return. 
    98  
    99 If the module has already been initialized, 
    100 the initialization function just returns. 
    101  
    102 The {{{Ghc.TopHandler}}} and {{{Ghc.Prim}}} modules get special treatment. 
    103  
    104 {{{cgTopBinding}}} is a small wrapper around {{{cgTopRhs}}} 
    105 which in turn disptaches to: 
    106  * {{{cgTopRhsCons}}} for {{{StgRhsCons}}} 
    107    (these are bindings of constructor applications not constructors themselves) and 
    108  * {{{cgTopRhsClosure}}} for {{{StgRhsClosure}}}. 
    109  
    110 {{{cgTopRhsCons}}} and {{{cgTopRhsClosure}}} are located in {{{CgCon}}} and {{{CgClosure}}} 
    111 which are the primary modules called by {{{CodeGen}}}. 
    112  
    113 === {{{CgCon}}} === 
    114 TODO 
    115  
    116 === {{{CgClosure}}} === 
    117 TODO 
    118  
    119 === {{{CgMonad}}} === 
    120 The monad that most of codeGen operates inside 
    121  * Reader 
    122  * State 
    123  * (could be Writer?) 
    124  * fork 
    125  * flatten 
    126  
    127 === {{{CgExpr}}} === 
    128 Called by {{{CgClosure}}} and {{{CgCon}}}. 
    129  
    130 Since everything in STG is an expression, almost everything branches off from here. 
    131  
    132 This module exports only one function {{{cgExpr}}}, 
    133 which for the most part just dispatches 
    134 to other functions to handle each specific constructor in {{{StgExpr}}}. 
    135  
    136 Here are the core functions that each constructor is disptached to 
    137 (though some may have little helper functions called in addition to the core function): 
    138  {{{StgApp}}}:: Calls to {{{cgTailCall}}} in {{{CgTailCall}}} 
    139  {{{StgConApp}}}:: Calls to {{{cgReturnDataCon}}} in {{{CgCon}}} 
    140  {{{StgLit}}}:: 
    141    Calls to {{{cgLit}}} in {{{CgUtil}}} 
    142     and {{{performPrimReturn}}} in {{{CgTailCall}}} 
    143  {{{StgOpApp}}}:: 
    144    Is a bit more complicated see below. 
    145  {{{StgCase}}}:: Calls to {{{cgCase}}} in {{{CgCase}}} 
    146  {{{StgLet}}}:: Calls to {{{cgRhs}}} in {{{CgExpr}}} 
    147  {{{StgLetNoEscape}}}:: 
    148    Calls to {{{cgLetNoEscapeBindings}}} in {{{CgExpr}}}, but with a little bit of wrapping 
    149    by {{{nukeDeadBindings}}} and {{{saveVolatileVarsAndRegs}}}. 
    150  {{{StgSCC}}}:: Calls to  {{{emitSetCCC}}} in {{{CgProf}}} 
    151  {{{StgTick}}}:: Calls to {{{cgTickBox}}} in {{{CgHpc}}} 
    152  {{{StgLam}}}:: 
    153    Does not have a case because it is only for {{{CoreToStg}}}'s work. 
    154  
    155 Some of these cases call to functions defined in {{{cgExpr}}}. 
    156 This is because they need a little bit of wrapping and processing 
    157 before calling out to their main worker function. 
    158  
    159  {{{cgRhs}}}:: 
    160  * For {{{StgRhsCon}}} calls out to {{{buildDynCon}}} in {{{CgCon}}}. 
    161  * For {{{StgRhsClosure}}} calls out to {{{mkRhsClosure}}}. 
    162    In turn, {{{mkRhsClosure}}} calls out to {{{cgStdRhsClosure}}} for selectors and thunks, 
    163    and calls out to {{{cgRhsClosure}}} in the default case. 
    164    Both these are defined in {{{CgClosure}}}. 
    165  
    166  {{{cgLetNoEscapeBindings}}}:: 
    167  * Wraps a call to {{{cgLetNoEscapeRhs}}} with {{{addBindsC}}} 
    168    depending on whether it is called on a recursive or a non-recursive binding. 
    169    In turn {{{cgLetNoEscapeRhs}}} wraps {{{cgLetNoEscapeClosure}}} 
    170    defined in {{{CgLetNoEscapeClosure}}}. 
    171  
    172 {{{StgOpApp}}} has a number of sub-cases. 
    173  * {{{StgFCallOp}}} 
    174  * {{{StgPrimOp}}} of a !TagToEnumOp 
    175  * {{{StgPrimOp}}} that is primOpOutOfLine 
    176  * {{{StgPrimOp}}} that returns Void 
    177  * {{{StgPrimOp}}} that returns a single primitive 
    178  * {{{StgPrimOp}}} that returns an unboxed tuple 
    179  * {{{StgPrimOp}}} that returns an enumeration type 
    180  
    181 (It appears that non-foreign-call, inline [wiki:Commentary/PrimOps PrimOps] are not allowed to return complex data types (e.g. a |Maybe|), but this fact needs to be verified.) 
    182  
    183 Each of these cases centers around one of these three core calls: 
    184  * {{{emitForeignCall}}} in {{{CgForeignCall}}} 
    185  * {{{tailCallPrimOp}}} in {{{CgTailCall}}} 
    186  * {{{cgPrimOp}}} in {{{CgPrimOp}}} 
    187  
    188 There is also a little bit of argument and return marshelling with the following functions 
    189  Argument marshelling:: 
    190    {{{shimForeignCallArg}}}, {{{getArgAmods}}} 
    191  Return marshelling:: 
    192    {{{dataReturnConvPrim}}}, {{{primRepToCgRep}}}, {{{newUnboxedTupleRegs}}} 
    193  Performing the return:: 
    194    {{{emitReturnInstr}}}, {{{performReturn}}}, 
    195    {{{returnUnboxedTuple}}}, {{{ccallReturnUnboxedTuple}}} 
    196  
    197 In summary the modules that get called in order to handle a specific expression case are: 
    198 ==== Also called for top level bindings by {{{CodeGen}}} ==== 
    199  {{{CgCon}}}:: for {{{StgConApp}}} and the {{{StgRhsCon}}} part of {{{StgLet}}} 
    200  {{{CgClosure}}}:: for the {{{StgRhsClosure}}} part of {{{StgLet}}} 
    201  
    202 ==== Core code generation ==== 
    203  {{{CgTailCall}}}:: for {{{StgApp}}}, {{{StgLit}}}, and {{{StgOpApp}}} 
    204  {{{CgPrimOp}}}:: for {{{StgOpApp}}} 
    205  {{{CgLetNoEscapeClosure}}}:: for {{{StgLetNoEscape}}} 
    206  {{{CgCase}}}:: for {{{StgCase}}} 
    207  
    208 ==== Profiling and Code coverage related ==== 
    209  {{{CgProf}}}:: for {{{StgSCC}}} 
    210  {{{CgHpc}}}:: for {{{StgTick}}} 
    211  
    212 ==== Utility modules that happen to have the functions for code generation ==== 
    213  {{{CgForeignCall}}}:: for {{{StgOpApp}}} 
    214  {{{CgUtil}}}:: for {{{cgLit}}} 
    215  
    216 Note that the first two are 
    217 the same modules that are called for top level bindings by {{{CodeGen}}}, 
    218 and the last two are really utility modules, 
    219 but they happen to have the functions 
    220 needed for those code generation cases. 
    221  
    222 === Memory and Register Management === 
    223  {{{CgBindery}}}:: 
    224    Module for {{{CgBindings}}} which maps variable names 
    225    to all the volitile or stable locations where they are stored 
    226    (e.g. register, stack slot, computed from other expressions, etc.) 
    227    Provides the {{{addBindC}}}, {{{modifyBindC}}} and {{{getCgIdInfo}}} functions 
    228    for adding, modifying and looking up bindings. 
    229  
    230  {{{CgStackery}}}:: 
    231    Mostly utility functions for allocating and freeing stack slots. 
    232    But also has things on setting up update frames. 
    233  
    234  {{{CgHeapery}}}:: 
    235    Functions for allocating objects that appear on the heap such as closures and constructors. 
    236    Also includes code for stack and heap checks and {{{emitSetDynHdr}}}. 
    237  
    238 === Function Calls and Parameter Passing === 
    239 (Note: these will largely go away once CPS conversion is fully implemented.) 
    240  
    241  {{{CgPrimOp}}}, {{{CgTailCall}}}, {{{CgForeignCall}}}:: 
    242    Handle different types of calls. 
    243  {{{CgCallConv}}}:: 
    244    Use by the others in this category to determine liveness and 
    245    to select in what registers and stack locations arguments and return 
    246    values get stored. 
    247  
    248 === Misc utilities === 
    249  {{{Bitmap}}}:: 
    250    Utility functions for making bitmaps (e.g. {{{mkBitmap}}} with type {{{[Bool] -> Bitmap}}}) 
    251  {{{ClosureInfo}}}:: 
    252    Stores info about closures and bindings. 
    253    Includes information about memory layout, how to call a binding ({{{LambdaFormInfo}}}) 
    254    and information used to build the info table ({{{ClosureInfo}}}). 
    255  {{{SMRep}}}:: 
    256    Storage manager representation of closures. 
    257    Part of !ClosureInfo but kept separate to "keep nhc happy." 
    258  {{{CgUtils}}}:: TODO 
    259  {{{CgInfoTbls}}}:: TODO 
    260  
    261 === Special runtime support === 
    262  {{{CgTicky}}}:: Ticky-ticky profiling 
    263  {{{CgProf}}}:: Cost-centre profiling 
    264  {{{CgHpc}}}:: Support for the Haskell Program Coverage (hpc) toolkit, inside GHC. 
    265  {{{CgParallel}}}:: 
    266    Code generation for !GranSim (GRAN) and parallel (PAR). 
    267    All the functions are dead stubs except {{{granYield}}} and {{{granFetchAndReschedule}}}. 
     66write about Hoopl (link paper, mention which modules are implemented with Hoopl)