Changes between Version 29 and Version 30 of Commentary/Compiler/CodeGen

Feb 3, 2014 1:05:57 PM (4 years ago)



  • Commentary/Compiler/CodeGen

    v29 v30  
    4242 * '''Common Block Elimination''', implemented in `CmmCommonBlockElim`, eliminates blocks that are identical (except for the label on their first node). Since this pass traverses blocks in depth-first order any unreachable blocks introduced by Control Flow Optimisations are eliminated. '''This pass is optional.'''
    44  * '''Determine proc-points''', implemented in `CmmProcPoint`. Proc-point analysis is done only when proc-point splitting is turned on. This is important for the LLVM backend. Proc-point are blocks in the graph that can be turned into entry points of procedures and proc-point splitting splits one function into many smaller ones. Initially we assume that entry point to a graph as well as continuation of every call is a proc-point. Then we look for blocks reachable from multiple proc-points and turn such blocks into a proc-point.
     44 * '''Determine proc-points''', implemented in `CmmProcPoint`. The idea behind the "proc-point splitting" is that we first determine proc-points, ie. blocks in the graph that can be turned into entry points of procedures, and then split a larger function into many smaller ones, each having a proc-point as its entry point. This is required for the LLVM backend. The proc-point splitting itself is done later in the pipeline, but here we only determine the set of proc-points. We first call `callProcPoints`, which assumes that entry point to a Cmm graph and every continuation of a call is a procpoint. If we are aplitting proc-points we update the list of proc-points by calling `minimalProcPointSet`, which adds all blocks reachable from more than one block in the graph. The set of proc-points is required by the stack layout pass.
    46  * '''Figure out the stack layout''', implemented in `CmmStackLayout`.
    47    * Each variable 'x', and each proc-point label 'K', has an associated ''Area'', written SS(x) and SS(k) resp, that names a contiguous portion of the stack frame. 
    48    * The stack layout pass produces a mapping of: ''(`Area` -> `StackOffset`)''. For more detail, see [wiki:Commentary/Compiler/StackAreas#Layingoutthestack the description of stack layout.]
    49    * A `StackOffset` is the byte offset of a stack slot from the old end (high address) of the frame.  It doesn't vary as the physical stack pointer moves.
    51  * '''Manifest the stack pointer''', implemented in `CmmStackLayout`.  Once the stack layout mapping has been determined, a second pass walks over the graph, making the stack pointer, `Sp` explicit. Before this pass, there is no `Sp` at all.  After this, `Sp` is completely manifest.
    52    * replacing references to `Areas` with offsets from `Sp`.
    53    * adding adjustments to `Sp`.
     46 * '''Figure out the stack layout''', implemented in `CmmStackLayout`. The job of this pass is to:
     47   * replace references to abstract stack Areas with fixed offsets from Sp.
     48   * replace the !CmmHighStackMark constant used in the stack check with
     49     the maximum stack usage of the proc.
     50   * save any variables that are live across a call, and reload them as
     51   necessary.
     52 '''Invariant violation''': It may happen that stack layout will invalidate the computed set of proc-points by removing a block that is a proc-point. This means that at this point in the pipeline we have insonsistent data and subsequent steps must be prepared for it.
    5554 * '''Sinking assignments''', implemented in `CmmSink`, performs these optimizations:
    5857   * inlines assignments to registers that are mentioned only once
    5958   * discards dead assignments
    6159 '''This pass is optional.''' It currently does not eliminate dead code in loops (#8327) and has some other minor deficiencies (eg. #8336).
    6361  * '''CAF analysis''', implemented in `CmmBuildInfoTables`. Computed CAF information is returned from `cmmPipeline` and used to create Static Reference Tables (SRT). See [wiki:Commentary/Rts/Storage/GC/CAFs here] for some more detail on CAFs and SRTs. This pass is implemented using Hoopl (see below).
    65 Here the pipeline splits into two alternative flows depending on whether we are splitting proc-points or not. Branches are essentially identical, except that one begins with splitting proc-points, while the other begins with attaching info tables.
     63  * '''Proc-point analysis and splitting''' (only when splitting proc-points), implemented by `procPointAnalysis` in `CmmProcPoint`, takes a list of proc-points and for each block and determines from which proc-point the block is reachable. This is implemented using Hoopl.
     64  Then the call to `splitAtProcPoints` splits the Cmm graph into multiple Cmm graphs (each represents a single function) and build info tables to each of them.
     65  When doing this we must be prepared for the fact that a proc-point does not actually exist in the graph since it was removed by stack layout pass (see #8205).
    67 '''TODO: Update descriptions below.'''
     67  * '''Attach continuations' info tables''' (only when NOT splitting proc-points), implemented by `attachContInfoTables` in `CmmProcPoint` attaches info tables for the continuations of calls in the graph. ''[PLEASE WRITE MORE IF YOU KNOW WHY THIS IS !DONE]''
    69  * '''Split into multiple !CmmProcs''', implemented in `CmmProcPointZ`.  At this point we build an info-table for each of the !CmmProcs, including SRTs.  Done on the basis of the live local variables (by now mapped to stack slots) and live CAF statics.
    70    * `LastCall` and `LastReturn` nodes are replaced by `Jump`s.
     69  * '''Update info tables to include stack liveness''', implemented by `setInfoTableStackMap` in `CmmLayoutStack`. Populates info tables of each Cmm function with stack usage information. Uses stack maps created by the stack layout pass.
    72  * '''Build info tables''', implemented in `CmmBuildInfoTables`.. 
    73    * Find each safe `MidForeignCall` node, "lowers" it into the suspend/call/resume sequence (see `Note [Foreign calls]` in `CmmNode.hs`.), and build an info table for them.
    74    * Convert the `CmmInfo` for each `CmmProc` into a `[CmmStatic]`, using the live variable information computed just before "Figure out stack layout". 
     71  * '''Control Flow Optimisations''', same as the beginning of the pipeline, but this pass runs only with `-O1` and `-O2`. Since this pass might produce unreachable blocks it is followed by a call to `removeUnreachableBlocksProc` (also in `CmmContFlowOpt.hs`)
    76 == Dumping Cmm ==
     73== Dumping and debugging Cmm ==
    78 == Hoopl ==
     75You can dump the generated Cmm code using `-ddump-cmm` flag. This is helpful for debugging Cmm problems. Cmm dump is divided into several sections:
    80 write about Hoopl (link paper, mention which modules are implemented with Hoopl)
     78==================== Cmm produced by new codegen ====================
     81==================== Post control-flow optimisations ====================
     84==================== Post common block elimination ====================
     87==================== Layout Stack ====================
     90==================== Sink assignments ====================
     93==================== CAFEnv ====================
     96==================== after setInfoTableStackMap ====================
     99==================== Post control-flow optimisations ====================
     102==================== Post CPS Cmm ====================
     105==================== Output Cmm ====================
     109"Cmm produced by new codegen" is emited in `HscMain` module after converting STG to Cmm. This Cmm has not been processed in any way by the Cmm pipeline. If you see that something is incorrect in that dump it means that the problem is located in the STG->Cmm pass. The last section, "Output Cmm", is also dumped in `HscMain` but this is done after the Cmm has been processed by the whole Cmm pipeline. All other sections are dumped by the !CmmPipeline. You can dump only selected passes with more specific flags. For example, if you know (or suspect) that the sinking pass is performing some incorrect transformations you can make the dump shorter by adding `-ddump-cmm-sp -ddump-cmm-sink` flags. This will produce only the "Layout Stack" dump (just before sinking pass) and "Sink assignments" dump (just after the sinking pass) allowing you to focus on the changes introduced by the sinking pass.