Changes between Version 24 and Version 25 of Commentary/Compiler/CodeGen


Ignore:
Timestamp:
Feb 1, 2014 5:07:44 PM (15 months ago)
Author:
jstolarek
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Compiler/CodeGen

    v24 v25  
    1111== Overview == 
    1212 
    13 The goal of the code generator is to convert program from [wiki:Commentary/Compiler/GeneratedCode STG] representation to [wiki:Commentary/Compiler/CmmType Cmm] representation. STG is a functional language with explicit stack. Cmm is a low-level imperative language - something between C and assembly - that is suitable for machine code generation. Note that terminology might be a bit confusing here: the term "code generator" refers to STG->Cmm pass but it may also be used to refer to the Cmm->assembly pass. Referring to the latter as the "backend" eliminates the confusion. 
     13The goal of the code generator is to convert program from [wiki:Commentary/Compiler/GeneratedCode STG] representation to [wiki:Commentary/Compiler/CmmType Cmm] representation. STG is a functional language with explicit stack. Cmm is a low-level imperative language - something between C and assembly - that is suitable for machine code generation. Note that terminology might be a bit confusing here: the term "code generator" can refer both to STG->Cmm pass and the whole STG->Cmm->assembly pass. The Cmm->assembly conversion is performed by one the backends, eg. NCG (Native Code Generator or LLVM. 
    1414 
    15 The top-most entry point to the codegen is located in [[GhcFile(compiler/main/HscMain.hs)]] in the `tryNewCodegen` function. Code generation now has three stages: 
     15The top-most entry point to the codegen is located in [[GhcFile(compiler/main/HscMain.hs)]] in the `tryNewCodegen` function. Code generation is done in two stages: 
    1616  1. Convert STG to Cmm with implicit stack, and native Cmm calls. This whole stage lives in [[GhcFile(compiler/codeGen)]] directory with the entry point being `codeGen` function in [[GhcFile(compiler/codeGen/StgCmm.hs)]] module. 
    17   2. Optimise the Cmm, and CPS-convert it to have an explicit stack, and no native calls. This lives in [[GhcFile(compiler/cmm)]] directory with the `cmmPipeline` function from [[GhcFile(compiler/cmm/CmmPipeline.hs)]] module being the entry point. This is described below in full detail. 
    18   3. Feed the CPS-converted Cmm to the existing, unmodified native code generators. This is done by `codeOutput` function ([[GhcFile(compiler/main/CodeOutput.lhs)]] called from `hscGenHardCode` after returning from `tryNewCodegen`. 
     17  2. Optimise the Cmm, and CPS-convert it to have an explicit stack, and no native calls. This lives in [[GhcFile(compiler/cmm)]] directory with the `cmmPipeline` function from [[GhcFile(compiler/cmm/CmmPipeline.hs)]] module being the entry point. 
    1918 
    20 == Structure of Cmm pipeline == 
     19The CPS-converted Cmm is fed to one of the backends. This is done by `codeOutput` function ([[GhcFile(compiler/main/CodeOutput.lhs)]] called from `hscGenHardCode` after returning from `tryNewCodegen`. 
    2120 
    22 The first two steps are described in more detail here: 
     21== First stage: STG to Cmm conversion ==  
    2322 
    2423 * '''Code generator''' converts STG to `CmmGraph`.  Implemented in `StgCmm*` modules (in directory `codeGen`).  
     
    2726     * Parameters are passed in virtual registers R1, R2 etc. [These map 1-1 to real registers.]  
    2827     * Overflow parameters are passed on the stack using explicit memory stores, to locations described abstractly using the [wiki:Commentary/Compiler/StackAreas ''Stack Area'' abstraction.].    
    29      * Making the calling convention explicit includes an explicit store instruction of the return address, which is stored explicitly on the stack in the same way as overflow parameters. This is done (obscurely) in `MkGraph.mkCall`. 
     28     * Making the calling convention explicit includes an explicit store instruction of the return address, which is stored explicitly on the stack in the same way as overflow parameters. This is done (obscurely) in `StgCmmMonad.mkCall`. 
    3029 
    31  * '''Simple control flow optimisation''', implemented in `CmmContFlowOpt`.  It's called both at the beginning and end of the pipeline. 
    32    * Branch chain elimination. 
    33    * Remove unreachable blocks. 
    34    * Block concatenation.  branch to K; and this is the only use of K.   
     30== Second stage: the Cmm pipeline == 
     31 
     32The core of the Cmm pipeline is implemented by the `cpsTop` function in [[GhcFile(compiler/cmm/CmmPipeline.hs)]] module. The pipeline consists of following passes: 
     33 
     34 * '''Simple control flow optimisation''', implemented in `CmmContFlowOpt`, simplifies the control flow graph by: 
     35   * Eliminating blocks that have only one predecessor by concatenating them with that predecessor 
     36   * Shortcuting targets of branches and calls (see Note [What is shortcutting]) 
     37   Note that if a block becomes unreachable because of shortcutting it is eliminated from the graph. However, '''it is theoretically possible that this pass will produce unreachable blocks'''. The reason is the label renaming pass performed after block concatenation has been completed. 
     38  This pass might be optionally called for the second time at the end of the pipeline. 
    3539 
    3640 * '''More control flow optimisations'''.