Version 20 (modified by dias, 9 years ago) (diff)


Material about the new code generator

This page summarises work that Norman Ramsey, Simon M, Simon PJ, and John Dias are doing on re-architecting GHC's back end. You may want to see the short-term view of the pipeline.

Bug list (code-gen related bugs that we may be able to fix):


  • The Rep swamp is drained: see Commentary/Compiler/BackEndTypes
  • Code generator: first draft done.
  • Control-flow opt: simple ones done
    • Common block elmination: to do
    • Block concatenation: to do
  • Adams optimisation: currently done in compiler/cmm/CmmProcPointZ.hs, which is incomplete because it does not insert the correct CopyOut nodes. The Adams optimization should be divorced from this module and replaced with common-block elimination, to be done after the proc-point transformation. In principle this combination may be slightly less effective than the current code, since the selection of proc-point protocols is guided by Adams's criteria, but NR thinks it will be easy to get the common, important cases nailed.
  • Proc-point analysis and transformation: 'working' but incomplete and incorrect in the sense that CopyIn nodes are created without all the required dual CopyOut nodes. There is still no coherent plan for calling conventions, and the lack of such a plan prevents the completion of proc-point analysis, as in principle it should come up with a calling convention for each freely chosen proc point. In practice NR recommends the following procedure:
    • All optional proc points to be generated with no parameters (all live variables on the stack)
    • This situation to be remedied when the code generator is reorganized along the lines NR proposed in July 2007, i.e., the register allocator runs on C-- with calls (as opposed to C-- with jumps only) and therefore before proc-point analysis
  • Add spill/reload: Implemented to NR's satisfaction in compiler/cmm/CmmSpillReload.hs, with the proviso that spilling is done to abstract stack slots rather than real stack positions (see comments below on stack-slot allocation)
  • Stack slot allocation: nothing here but some broken bits and pieces. Progress in this arena is blocked by the lack of a full understanding of how to do stack-frame layout and how to deal with calling conventions. NR proposes that life would be simplified if all calls downstream from the Cmm converter were to be parameterless---the idea being to handle the calling conventions here and to put arguments and results in their conventional locations.
  • Make stack explicit: to do
  • Split into multiple CmmProcs: to do

Norman's plan

  1. New code to check invariants of output from compiler/cmm/ZipDataflow.hs
  2. Finish debugging compiler/cmm/ZipDataflow.hs.
  3. Use Simon PJ's 'common-blockifier' (which does not exist!!!) to move the Adams optimization outside compiler/cmm/CmmProcProintZ.hs
  4. ProcPointZ does not insert CopyOut nodes; this omission must be rectified and will require some general infrastructure for inserting predecessors.
  5. Simple optimizations on CopyIn and CopyOut may be required
  6. Define an interface for calling conventions and invariants for the output of frame layout [will require help from Simon M]
  7. Stack layout
  8. Glue the whole pipeline together and make sure it works.

Items 1-5 look like a few days apiece. Items 6 and 7 are more scary...

ToDo: main issues

  • SRTs simply record live global variables. So we should use the same live-variable framework as for live local variables. That means we must be able to identify which globals are SRT-able. What about compression/encoding schemes?
  • How do we write continuations in the RTS? E.g. the update-frame continuation? Michael Adams had a syntax with two sets of parameters, the the ones on the stack and the return values.
  • Review code gen for calls with lots of args. In the existing codegen we push magic continuations that say "apply the return value to N more args". Do we want to do this? ToDo: how rare is it to have too many args?
  • Figure out how PAPs work. This may interact with the GC check and stack check at the start of a function call.
  • How do stack overflow checks work? (They are inserted by the CPS conversion, and must not generate a new info table etc.)
  • Was there something about sinking spills and hoisting reloads?

ToDo: small issues

  • Shall we rename Branch to GoTo?!
  • Where is the "push new continuation" middle node?
  • Change the C-- parser (which parses RTS .cmm files) to directly construct CmmGraph.
  • (SLPJ) See let-no-escape todos in StgCmmExpr.

The new Cmm data type

There is a new Cmm data type:

  • compiler/cmm/ZipCfg.hs contains a generic zipper-based control-flow graph data type. It is generic in the sense that it's polymorphic in the type of middle nodes and last nodes of a block. (Middle nodes don't do control transfers; last nodes only do control transfers.) There are extensive notes at the start of the module.

    The key types it defines are:
    • Block identifiers: BlockId, BlockEnv, BlockSet
    • Control-flow blocks: Block
    • Control-flow graphs: Graph

  • ZipDataFlow contains a generic framework for solving dataflow problems over ZipCfg.

  • compiler/cmm/ZipCfgCmmRep.hs instantiates ZipCfg for Cmm, by defining types Middle and Last and using these to instantiate the polymorphic fields of ZipCfg. It also defines a bunch of smart constructor (mkJump, mkAssign, mkCmmIfThenElse etc) which make it easy to build CmmGraph.

  • CmmExpr contains the data types for Cmm expressions, registers, and the like. It does not depend on the dataflow framework at all.

Runtime system

  • Garbage collector entry points: see Note [Heap checks] in StgCmmHeapery.
  • PAPs

  • Update frames and exception handling. Also STM frames.
  • Primitives can be rewritten:
    • Use parameters
    • In a few cases, use native calls (notably eval)