Changes between Version 10 and Version 11 of Commentary/Compiler/NewCodeGen


Timestamp:
Dec 11, 2007 12:10:34 PM (6 years ago)
Author:
simonpj
Comment:

--

  • Commentary/Compiler/NewCodeGen

    v10 v11  
    6262 * '''`CmmExpr`''' contains the data types for Cmm expressions, registers, and the like.  It does not depend on the dataflow framework at all. 
    6363 
    64   
     64-------------------------------------- 
    6565== The pipeline == 
    6666 
     
    101101 * '''Split into multiple !CmmProcs'''. 
    102102 
     103----------------------------------- 
    103104== Runtime system == 
    104105 
     
    112113   * Use parameters 
    113114   * In a few cases, use native calls (notably eval) 
     115 
     116-------------------------------------- 
     117== The `Rep` swamp == 
     118 
     119I have completed a major representation change to the various `Rep` types, affecting both the old and new code generators.  It is pervasive: it touches a lot of files, and in the native code generator very many lines have changed.  The new situation is much cleaner. 
     120 
     121Here are the highlights of the new design. 
     122 
     123=== `CmmType` === 
     124 
     125There is a new type `CmmType`, defined in module `CmmExpr`, which is just what it sounds like: it's the type of a `CmmExpr` or a `CmmReg`.   
     126   * A `CmmType` is ''abstract'': its representation is private to `CmmExpr`.  That makes it easy to change representation. 
     127   * A `CmmType` is actually just a pair of a `Width` and a category (`CmmCat`). 
     128   * The `Width` type is exported and widely used in pattern-matching, but it does what it says on the tin: width only.   
     129   * In contrast, the `CmmCat` type is entirely private to `CmmExpr`.  It is just an enumeration that allows us to distinguish: floats, gc pointers, and other.  
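
The shape described in the bullets above can be sketched as follows. This is a minimal illustration, not the actual contents of `CmmExpr`: the constructor names (`GcPtrCat`, `BitsCat`, `FloatCat`, `W8` and so on) and the helper functions are assumptions made for the example. In a real module the export list would omit the `CmmType` constructor and the whole `CmmCat` type, which is what makes `CmmType` abstract.

```haskell
-- Hypothetical sketch; only the names CmmType, Width and CmmCat come from the text.

-- Width only: how many bits a value occupies.
data Width = W8 | W16 | W32 | W64
  deriving (Eq, Ord, Show)

-- Category: distinguishes gc pointers, floats, and everything else.
-- In the design described above this type is not exported.
data CmmCat = GcPtrCat | BitsCat | FloatCat
  deriving (Eq, Show)

-- A CmmType is a pair of a category and a width.
-- Exporting the type without its constructor keeps it abstract.
data CmmType = CmmType CmmCat Width
  deriving (Eq, Show)

-- Smart constructors (assumed names): the only way clients build a CmmType.
cmmBits, cmmFloat, gcWord :: Width -> CmmType
cmmBits  = CmmType BitsCat
cmmFloat = CmmType FloatCat
gcWord   = CmmType GcPtrCat

-- Observers (assumed names): clients inspect a CmmType only through these.
typeWidth :: CmmType -> Width
typeWidth (CmmType _ w) = w

isFloatType, isGcPtrType :: CmmType -> Bool
isFloatType (CmmType FloatCat _) = True
isFloatType _                    = False
isGcPtrType (CmmType GcPtrCat _) = True
isGcPtrType _                    = False
```

Because clients can pattern-match on `Width` but never on `CmmCat`, the category representation can change without touching code outside the defining module.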
     130 
     131Other important points are these: 
     132 
     133 * Each `LocalReg` has a `CmmType` attached; this replaces the previous unsavoury combination of `MachRep` and `CmmKind`.  Indeed, both of the latter are gone entirely. 
     134 
     135 * Notice that a `CmmType` accurately knows about gc-pointer-hood. Ultimately we will abandon static-reference-table generation in STG syntax, and instead generate SRTs from the Cmm code.  We'll need to update the RTS `.cmm` files to declare pointer-hood. 
     136 
     137 * The type `TyCon.PrimRep` remains; it enumerates the representations that a Haskell value can take.  Differences from `CmmType`: 
     138   * `PrimRep` contains `VoidRep`, but `CmmType` has no zero-width form. 
     139   * `CmmType` includes sub-word width values (e.g. 8-bit) which `PrimRep` does not. 
     140   The function `primRepCmmType` converts a non-void `PrimRep` to a `CmmType`. 
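
To make the `PrimRep` to `CmmType` relationship concrete, here is a small sketch. The `PrimRep` constructors other than `VoidRep`, the 64-bit word width, and the `Maybe` result are assumptions made for this example; the text only says that `primRepCmmType` is defined for non-void representations, and the real function may signal the void case differently.

```haskell
-- Hypothetical sketch; self-contained stand-ins for the types discussed above.
data Width   = W8 | W16 | W32 | W64          deriving (Eq, Show)
data CmmCat  = GcPtrCat | BitsCat | FloatCat deriving (Eq, Show)
data CmmType = CmmType CmmCat Width          deriving (Eq, Show)

-- Representations a Haskell value can take (constructor list is illustrative).
data PrimRep
  = VoidRep | PtrRep | IntRep | WordRep | AddrRep | FloatRep | DoubleRep
  deriving (Eq, Show)

-- Assumption: a 64-bit target.
wordWidth :: Width
wordWidth = W64

-- VoidRep has no CmmType because CmmType has no zero-width form.
-- Returning Nothing for it is a choice made for this sketch.
primRepCmmType :: PrimRep -> Maybe CmmType
primRepCmmType VoidRep   = Nothing
primRepCmmType PtrRep    = Just (CmmType GcPtrCat wordWidth)
primRepCmmType IntRep    = Just (CmmType BitsCat  wordWidth)
primRepCmmType WordRep   = Just (CmmType BitsCat  wordWidth)
primRepCmmType AddrRep   = Just (CmmType BitsCat  wordWidth) -- not a gc pointer
primRepCmmType FloatRep  = Just (CmmType FloatCat W32)
primRepCmmType DoubleRep = Just (CmmType FloatCat W64)
```

Note that no `PrimRep` in this sketch maps to a `W8` or `W16` type, which mirrors the point that sub-word widths exist only on the `CmmType` side.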
     141 
     142=== `MachOp` === 
     143 
     144The `MachOp` type enumerates (in machine-independent form) the available machine instructions.  The principle it embodies is that ''everything except the width is encoded in the opcode''.  In particular, we have 
     145 * `MO_S_Lt`, `MO_U_Lt`, and `MO_F_Lt` for comparison (signed, unsigned, and float). 
     146 * `MO_SS_Conv`, `MO_SF_Conv` etc, for conversion (`SS` is signed-to-signed, `SF` is signed-to-float, etc). 
     147These constructors all take `Width` arguments. 
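
The "everything except the width" principle can be illustrated with a cut-down `MachOp` and a toy evaluator. Only the constructor names come from the text; the argument shapes, the `Width` constructors, and the functions `toSigned`, `toUnsigned` and `evalLt` are assumptions for this sketch. The same 8-bit pattern compares differently depending on whether the opcode is signed or unsigned.

```haskell
-- Hypothetical sketch; not the real definition in CmmExpr.
data Width = W8 | W16 | W32 | W64
  deriving (Eq, Ord, Show)

data MachOp
  = MO_S_Lt Width            -- signed less-than
  | MO_U_Lt Width            -- unsigned less-than
  | MO_F_Lt Width            -- floating-point less-than
  | MO_SS_Conv Width Width   -- signed-to-signed conversion (from, to)
  | MO_SF_Conv Width Width   -- signed-to-float conversion (from, to)
  deriving (Eq, Show)

widthInBits :: Width -> Int
widthInBits W8  = 8
widthInBits W16 = 16
widthInBits W32 = 32
widthInBits W64 = 64

-- Interpret a raw bit pattern as an unsigned value of the given width.
toUnsigned :: Width -> Integer -> Integer
toUnsigned w x = x `mod` (2 ^ widthInBits w)

-- Interpret a raw bit pattern as a two's-complement value of the given width.
toSigned :: Width -> Integer -> Integer
toSigned w x
  | u >= half = u - 2 * half
  | otherwise = u
  where
    half = 2 ^ (widthInBits w - 1)
    u    = toUnsigned w x

-- Evaluate the integer comparisons on raw bit patterns.
-- Other opcodes are out of scope for this sketch and yield Nothing.
evalLt :: MachOp -> Integer -> Integer -> Maybe Bool
evalLt (MO_S_Lt w) a b = Just (toSigned w a   < toSigned w b)
evalLt (MO_U_Lt w) a b = Just (toUnsigned w a < toUnsigned w b)
evalLt _           _ _ = Nothing
```

For example, `evalLt (MO_S_Lt W8) 0xFF 0x01` treats `0xFF` as -1, while `evalLt (MO_U_Lt W8) 0xFF 0x01` treats it as 255, so signedness lives in the opcode and only the width is a parameter.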
     148 
     149The `MachOp` data type is defined in `CmmExpr`, not in a separate `MachOp` module. 
     150 
     151=== Foreign calls and hints === 
     152 
     153In the new Cmm representation (`ZipCfgCmmRep`), but not the old one, arguments and results to all calls, including foreign ones, are ordinary `CmmExpr` or `CmmReg` respectively.  The extra information we need for foreign calls (is this signed?  is this an address?) is kept in the calling convention.  Specifically: 
     154 
     155 * `MidUnsafeCall` calls a `MidCallTarget` 
     156 * `MidCallTarget` is either a `CallishMachOp` or a `ForeignTarget` 
     157 * In the latter case we supply a `CmmExpr` (the function to call) and a `ForeignConvention` 
     158 * A `ForeignConvention` contains the C calling convention (stdcall, ccall etc), and a list of `ForeignHint`s for arguments and another for results. (We might want to rename this type.) 
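
The relationship between these types can be sketched as below. The type names follow the bullets above; everything else is an assumption for illustration: the `PrimTarget` constructor name, the `ForeignHint` and `CCallConv` constructors, the example `CallishMachOp` constructors, the one-constructor stand-in for `CmmExpr`, and the helper `argHints`.

```haskell
-- Hypothetical sketch of the call-target types; not the real ZipCfgCmmRep code.

-- Stand-in for the real expression type: just a label to call.
newtype CmmExpr = CmmLabel String
  deriving (Eq, Show)

-- Machine-level operations that are called rather than inlined (illustrative).
data CallishMachOp = MO_F64_Sin | MO_F64_Cos
  deriving (Eq, Show)

data CCallConv = CCallConv | StdCallConv
  deriving (Eq, Show)

-- Per-argument or per-result hint: is this an address? is this signed?
data ForeignHint = NoHint | AddrHint | SignedHint
  deriving (Eq, Show)

-- Calling convention plus hints for arguments and hints for results.
data ForeignConvention
  = ForeignConvention CCallConv [ForeignHint] [ForeignHint]
  deriving (Eq, Show)

-- A call target is either a foreign function or a callish machine op.
data MidCallTarget
  = ForeignTarget CmmExpr ForeignConvention
  | PrimTarget CallishMachOp
  deriving (Eq, Show)

-- The hints live in the convention, not alongside each argument expression.
argHints :: MidCallTarget -> [ForeignHint]
argHints (ForeignTarget _ (ForeignConvention _ args _)) = args
argHints (PrimTarget _)                                 = []

-- Example: a memcpy-like call with two addresses and an unhinted size argument.
memcpyTarget :: MidCallTarget
memcpyTarget =
  ForeignTarget (CmmLabel "memcpy")
                (ForeignConvention CCallConv [AddrHint, AddrHint, NoHint] [AddrHint])
```

With this layout the argument list of a call is a plain list of expressions, and code that does not care about hints never has to unpack (argument, hint) pairs.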
     159 
     160This simple change was horribly pervasive.  The old Cmm rep (and Michael Adams's stuff) still has arguments and results being (argument,hint) pairs, as before. 
     161 
     162=== Native code generation and the `Size` type === 
     163 
     164The native code generator has an instruction data type for each architecture.  Many of the instructions in these data types used to have a `MachRep` argument, but now have a `Size` argument instead.  In fact, so far as the native code generators are concerned, these `Size` types (which can be machine-specific) are simply a plug-in replacement for `MachRep`, with one big difference: '''`Size` is completely local to the native code generator''' and hence can be changed at will without affecting the rest of the compiler. 
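
As a rough picture of what "local to the native code generator" means, the sketch below defines a `Size` type and a tiny instruction type for one imagined architecture. None of these constructors or functions are taken from the text: `II8`, `FF32`, `MOV`, `intSize`, `floatSize` and the rest are hypothetical names chosen for the example.

```haskell
-- Hypothetical sketch of a per-architecture Size type; not real NCG code.

-- Shared with the rest of the compiler.
data Width = W8 | W16 | W32 | W64
  deriving (Eq, Show)

-- Private to one native code generator: encodes both integer/float-ness
-- and width, and may differ from one architecture to another.
data Size = II8 | II16 | II32 | II64 | FF32 | FF64
  deriving (Eq, Show)

-- Conversions at the boundary between Cmm and the native code generator.
intSize :: Width -> Size
intSize W8  = II8
intSize W16 = II16
intSize W32 = II32
intSize W64 = II64

-- Only 32-bit and 64-bit floats exist on this imagined target.
floatSize :: Width -> Maybe Size
floatSize W32 = Just FF32
floatSize W64 = Just FF64
floatSize _   = Nothing

data Operand = OpReg Int | OpImm Integer
  deriving (Eq, Show)

-- Instructions carry a Size where they previously carried a MachRep.
data Instr
  = MOV Size Operand Operand
  | ADD Size Operand Operand
  deriving (Eq, Show)
```

Since only `intSize` and `floatSize` mention both `Width` and `Size`, changing `Size` for one architecture would affect that code generator alone.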
     165 
     166`Size` is badly named, but I inherited the name from the previous code. 
     167 
     168I rather think that many instructions should have a `Width` parameter, not a `Size` parameter.  But I didn't feel confident enough to change this.  Generally speaking the NCG is a huge swamp and needs refactoring.