Changes between Version 17 and Version 18 of Commentary/Compiler/IntegratedCodeGen


Ignore:
Timestamp:
Jun 9, 2008 10:42:13 AM (7 years ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Compiler/IntegratedCodeGen

    v17 v18  
    6565== Proposed compilation pipeline ==
    6666
    67  0. Convert from {{{STG}}} to an control flow graph {{{CmmGraph}}} ([[GhcFile(compiler/cmm/ZipCfg.hs)]], [[GhcFile(compiler/cmm/ZipCfgCmmRep.hs)]]).  This step is Simon PJ's "new code generator" from September 2007.  This conversion may introduce new variables, stack slots, and compile-time constants.
     67 0. Convert from {{{STG}}} to an control flow graph {{{CmmGraph}}}:
     68{{{
     69STG -> CmmGraph Cmm.Middle Cmm.Last
     70}}}
     71 0. Instruction selection:
     72{{{
     73CmmGraph Cmm.Middle Cmm.Last -> CmmGraph I386.Middle I386.Last
     74}}}
     75 0. Optimise:
     76{{{
     77CmmGraph I386.Middle I386.Last -> CmmGraph I386.Middle I386.Last
     78}}}
     79 0. Proc-point analysis, and transformation
     80{{{
     81analyse :: CmmGraph I386.Middle I386.Last -> [BlockId]
     82transform :: [BlockId] -> CmmGraph I386.Middle I386.Last -> CmmGraph I386.Middle I386.Last
     83}}}
     84
     85
     86=== Convert from STG to control flow graph ===
     87
     88Convert from {{{STG}}} to an control flow graph {{{CmmGraph}}} ([[GhcFile(compiler/cmm/ZipCfg.hs)]], [[GhcFile(compiler/cmm/ZipCfgCmmRep.hs)]]).  This step is Simon PJ's "new code generator" from September 2007.  This conversion may introduce new variables, stack slots, and compile-time constants.
    6889   * Implements calling conventions for call, jump, and return instructions: all parameter passing is turned into data-movement instructions (register-to-register move, load, or store), and stack-pointer adjustments are inserted. After this point, calls, returns, and jumps are just control-transfer instructions -- the parameter passing has been compiled away. 
    6990   * How do we refer to locations on the stack when we haven't laid it out yet? The compiler names a stack slot using the idea of a "late compile-time constant," which is just a symbolic constant that will be replaced with an actual stack offset when the stack layout is chosen.One departure from the old code generator is that '''we do not build a {{{Cmm}}} abstract-syntax tree;''' instead we go straight to a control-flow graph.
    70  In practice, we first generate an "abstract control flow graph", `CmmAGraph`, which makes the business of generating fresh `BlockId`s more convenient, and convert that to a `CmmGraph`.  The former is convenient for ''construction'' but cannot be analysed; the latter is concrete, and can be analyzed, transformed, and optimized. 
    71  0. Instruction selection: each {{{Cmm}}} {{{Middle}}} and {{{Last}}} node in the control-flow graph is replaced with a new graph in which the nodes are machine instructions.  The {{{MachineMiddle}}} type represents computational machine instructions; the {{{MachineLast}}} type represents control-transfer instructions.  The choice of representation is up to the author of the back end, but for continuity with the existing native code generators, we expect to begin by using algebraic data types inspired by the existing definitions in [[GhcFile(compiler/nativeGen/MachInstrs.hs)]].
    72       * An '''extremely important distinction''' from the existing code is that we plan to eliminate {{{#ifdef}}} and instead provide multiple datatypes, e.g., in {{{I386Instrs.hs}}}, {{{PpcInstrs.hs}}}, {{{SparcInstrs.hs}}}, and so on.
    73     We expect to '''abstract away from the details of these representations''' by borrowing some abstractions from [http://www.eecs.harvard.edu/hube/software/nci/overview.html Machine SUIF].  In the longer term we would like to support RTL-based representations such as are used in gcc, vpo and Quick C--.  [[BR]][[BR]] We expect a set of types and an instruction selector for ''each'' back end, so for example, we might have a transformation that maps {{{LGraph Cmm.Middle Cmm.Last}}} (with variables, stack slots, and compile-time constants) {{{-> LGraph I368Instrs.Middle I368Instrs.Last}}} (with variables, stack slots, and compile-time constants)
    74  0. Optimizer: {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants) {{{-> LGraph Instrs}}} (with variables, stack slots, and compile-time constants)
    75      * Obvious code improvements include lazy code motion and dead-code elimination as well as various optimizations based on forward substitution (copy propagation, constant propagation, peephole optimization).  We intend to follow the style of Machine SUIF and '''make code improvements machine-independent''', using machine-dependent functions to capture the semantics of instructions.
    76      * The difficulty of implementing a good peephole optimizer varies greatly with the representation of instructions.  We propose to postpone serious work on peephole optimization until we have a back end capable of representing machine instructions as RTLs, which makes peephole optimization trivial.
    77  0. Proc-point analysis: {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants) -> {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants)
     91
     92In practice, we first generate an "abstract control flow graph", `CmmAGraph`, which makes the business of generating fresh `BlockId`s more convenient, and convert that to a `CmmGraph`.  The former is convenient for ''construction'' but cannot be analysed; the latter is concrete, and can be analyzed, transformed, and optimized. 
     93
     94=== Establish the Machine Invariant ===
     95
     96Instruction selection: each {{{Cmm}}} {{{Middle}}} and {{{Last}}} node in the control-flow graph is replaced with a new graph in which the nodes are machine instructions.
     97{{{
     98CmmGraph Cmm.Middle Cmm.Last -> CmmGraph I386.Middle I386.Last
     99}}}
     100The {{{I386.Middle}}} type represents computational machine instructions; the {{{I376.Last}}} type represents control-transfer instructions.  The choice of representation is up to the author of the back end, but for continuity with the existing native code generators, we expect to begin by using algebraic data types inspired by the existing definitions in [[GhcFile(compiler/nativeGen/MachInstrs.hs)]].
     101
     102An '''extremely important distinction''' from the existing code is that we plan to eliminate {{{#ifdef}}} and instead provide multiple datatypes, e.g., in {{{I386.hs}}}, {{{PpcInstrs.hs}}}, {{{Sparc.hs}}}, and so on. 
     103
     104Similarly, we expect a an instruction selector for ''each'' back end, so for example, we might have a transformation that maps {{{LGraph Cmm.Middle Cmm.Last}}} (with variables, stack slots, and compile-time constants) {{{-> LGraph I86.Middle I386.Last}}} (with variables, stack slots, and compile-time constants).
     105
     106We expect to '''abstract away from the details of these representations''' by borrowing some abstractions from [http://www.eecs.harvard.edu/hube/software/nci/overview.html Machine SUIF].  In the longer term we would like to support RTL-based representations such as are used in gcc, vpo and Quick C--.  What this means is that `I386.Middle` (etc) is an abstract type, an instance of type class that supports the functions that the rest of the pipeline needs. For example:
     107{{{
     108class Instr i where
     109  defs :: i -> [LocalReg]
     110  uses :: i -> [LocalReg]
     111  ..etc..
     112}}}
     113This allows us to  '''make code improvements machine-independent''', by using machine-dependent functions to capture the semantics of instructions.
     114
     115=== Optimisation ===
     116
     117Optimise the code.  {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants) {{{-> LGraph Instrs}}} (with variables, stack slots, and compile-time constants), such as
     118
     119   * Branch chain elimination.
     120   * Remove unreachable blocks (dead code).
     121   * Constant propagation.
     122   * Copy propagation.
     123   * Lazy code motion (hoisting, sinking, partial redundancy elimination).
     124   * Block concatenation.  branch to K; and this is the only use of K. 
     125   * Common Block Elimination (like CSE). This essentially implements the Adams optimisation, we believe.
     126   * Consider (sometime): block duplication.  branch to K; and K is a short block.  Branch chain elimination is just a special case of this.
     127   * Peephole optimisation.  The difficulty of implementing a good peephole optimizer varies greatly with the representation of instructions.  We propose to postpone serious work on peephole optimization until we have a back end capable of representing machine instructions as RTLs, which makes peephole optimization trivial.
     128
     129=== Proc-point analysis ===
     130
     131Proc-point analysis: {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants) -> {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants)
    78132  * Proc points are found, and the appropriate control-transfer instructions are inserted.
    79133  * Why so early? Depending on the back end (think of C as the worst case), the proc-point analysis might have to satisfy some horrible calling convention. We want to make these requirements explicit before we get to the register allocator.  We also want to '''exploit the register allocator''' to make the best possible decisions about ''which live variables (if any) should be in registers at a proc point''.
     134
     135=== not yet done ===
    80136 0. Register allocation: {{{LGraph Instrs}}} (with variables, stack slots, and compile-time constants) {{{-> LGraph Instrs}}} (with stack slots, and compile-time constants)
    81137  * Replace variable references with machine register and stack slots.