wiki:Commentary/Compiler/CoreSynType

Version 1 (modified by simonpj, 8 years ago) (diff)

--

The Core type

The Core language is GHC's central data types. Core is a very small, explicitly-typed, variant of System. The exact variant is called System FC, and described by our paper System F with equality coercions. (Note: the move to FC was done in Autumn 2006, but earlier versions of GHC had a very similar language.)

The CoreSyn type, and the functions that operate over it, gets an entire directory compiler/coreSyn:

Here is the entire Core type compiler/coreSyn/CoreSyn.lhs:

type CoreExpr = Expr Var

data Expr b	-- "b" for the type of binders, 
  = Var	  Id
  | Lit   Literal
  | App   (Expr b) (Arg b)
  | Lam   b (Expr b)
  | Let   (Bind b) (Expr b)
  | Case  (Expr b) b Type [Alt b]
  | Cast  (Expr b) Coercion
  | Note  Note (Expr b)
  | Type  Type

type Arg b = Expr b
type Alt b = (AltCon, [b], Expr b)

data AltCon = DataAlt DataCon | LitAlt  Literal | DEFAULT

data Bind b = NonRec b (Expr b) | Rec [(b, (Expr b))]

That's it. All of Haskell gets compiled through this tiny core.

Expr is parameterised over the type of its binders, b. This facility is used only rarely, and always temporarily; for example, the let-floater SetLevels pass attaches a binding level to every binder. By far the most important type is CoreExpr, which is Expr with Var binders.

Here are some notes about the individual constructors of Expr.

  • Lam is used for both term and type abstraction (small and big lambdas).
  • Type appears only in type-argument positions (e.g. App (Var f) (Type ty)). To emphasise this, the type synonym Arg is used as documentation when we expect that a Type constructor may show up.
  • Let handles both recursive and non-recursive let-bindings; see the the two constructors for Bind.
  • Cast is used for an FC cast expression. Corecion is a synonym for Type.
  • Note is used for profiling and debugging information.

Case expressions

Case expressions are the most complicated bit of Core.

  • The case expression can scrutinise
    • a data type (the alternatives are DataAlts), or
    • a primitive literal type (the alternatives are LitAlts), or
    • a value of any type at all (if there is one DEFAULT alternative).

  • If there is a DEFAULT alternative, it must appear first.
  • The remaining non-DEFAULT alternatives must appear in order of

This makes finding the relevant constructor easy, and makes comparison easier too.

  • The list of alternatives is always exhaustive, meaning that it covers all cases that can occur. An "exhausive" case does not necessarily mention all constructors:
    data Foo = Red | Green | Blue
    
    ...case x of 
    	Red   -> True
    	other -> f (case x of 
    			Green -> ...
    			Blue  -> ... )
    
    The inner case does not need a Red alternative, because x can't be Red at that program point.