wiki:Commentary/Pipeline

Version 5 (modified by guest, 8 years ago)

The compilation pipeline

When GHC compiles a module, it calls other programs, and generates a series of intermediate files. Here's a summary of the process.

We start with Foo.hs or Foo.lhs; the "l" specifies that literate style is being used.

  • Run the unlit pre-processor, unlit, to remove the literate markup, generating ???. The unlit processor is a C program kept in utils/unlit.
  • Run CPP (if -cpp is specified), generating Foo.cpp (from Foo.hs) or Foo.lpp (from Foo.lhs).
  • Run the compiler itself. This does not start a separate process; it's just a call to a Haskell function. This step always generates an interface file Foo.hi, and depending on what flags you give, it also generates a compiled file:
    • Assembly code: flag -S, file Foo.s
    • C code: flag -fvia-C, file Foo.hc
  • Run the C compiler or assembler, as appropriate, generating Foo.o
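
The steps above can be sketched as a small Haskell function that lists, for a given source file, which tools run and which files they produce. This is only an illustration of the description on this page, not GHC's driver code: the names Flags, Step and pipeline are hypothetical, and the output of unlit is left empty because the text above does not name it.

```haskell
import System.FilePath (replaceExtension, takeExtension)

-- Hypothetical flags for this sketch; not GHC's real flag representation.
data Flags = Flags
  { useCpp :: Bool  -- run the C pre-processor?
  , viaC   :: Bool  -- generate C (Foo.hc) rather than assembly (Foo.s)?
  }

-- One step of the pipeline: the tool that runs and the files it produces.
data Step = Step
  { tool     :: String
  , produces :: [FilePath]
  } deriving (Eq, Show)

-- List the steps for a source file, following the description above.
pipeline :: Flags -> FilePath -> [Step]
pipeline flags src =
     [ Step "unlit" [] | literate ]  -- output file name not stated in the text
  ++ [ Step "cpp" [ext (if literate then "lpp" else "cpp")] | useCpp flags ]
  ++ [ Step "compiler" [ext "hi", ext (if viaC flags then "hc" else "s")]
     , Step (if viaC flags then "C compiler" else "assembler") [ext "o"]
     ]
  where
    literate = takeExtension src == ".lhs"
    ext      = replaceExtension src
```

For example, pipeline (Flags True True) "Foo.lhs" lists four steps: unlit, cpp (Foo.lpp), the compiler (Foo.hi and Foo.hc), and the C compiler (Foo.o).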

Interface files

Interface files support separate compilation: the information gained by compiling M.hs is recorded in its interface file M.hi, where it can be used when compiling modules that import M. Morally speaking, the interface file M.hi is part of the object file M.o; it's like a super symbol-table for M.o.

Interface files are kept in a binary, GHC-specific format. The format of these files changes with each GHC release, but not with patch-level releases. You can see what's in an interface file (often very useful) thus:

  ghc --show-iface M.hi

Here are some of the things stored in an interface file M.hi:

  • A list of what M exports.
  • The types of exported functions, definitions of exported types, and so on.
  • Version information, used to drive the smart recompilation checker.
  • The strictness, arity, and unfolding of exported functions. This is crucial for cross-module optimisation; but it is only included when you compile with -O.
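
To make the list above concrete, here is a rough model of an interface file as a Haskell record. It is a simplified sketch, not GHC's actual representation: the types OptInfo, Decl and Iface, the functions mkIface and stale, and the single Int version are all assumptions made for illustration. The point it shows is that optimisation information travels with each exported declaration, and is only present when compiling with -O.

```haskell
-- Hypothetical per-function optimisation information.
data OptInfo = OptInfo
  { arity      :: Int
  , strictness :: String        -- strictness signature, kept opaque here
  , unfolding  :: Maybe String  -- right-hand side available for inlining, if any
  } deriving (Eq, Show)

-- Hypothetical exported declaration: a name, its type, and optional OptInfo.
data Decl = Decl
  { declName :: String
  , declType :: String
  , optInfo  :: Maybe OptInfo   -- only kept when compiling with -O
  } deriving (Eq, Show)

-- Hypothetical, much simplified view of M.hi.
data Iface = Iface
  { exports      :: [String]
  , decls        :: [Decl]
  , ifaceVersion :: Int         -- stands in for the version information
  } deriving (Eq, Show)

-- Build the interface; without -O the optimisation info is dropped.
mkIface :: Bool -> Int -> [Decl] -> Iface
mkIface optimise ver ds = Iface
  { exports      = map declName ds
  , decls        = if optimise then ds else map stripOpt ds
  , ifaceVersion = ver
  }
  where
    stripOpt d = d { optInfo = Nothing }

-- An importer compiled against version 'seen' is stale if M.hi has moved on.
stale :: Int -> Iface -> Bool
stale seen iface = seen /= ifaceVersion iface
```

The stale check is the crudest possible form of recompilation checking: an importing module remembers the version it was compiled against and is recompiled only when that version differs.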

HC files

GHC uses gcc as a code generator, in a very stylised way:

  • Generate Foo.hc
  • Compile it with gcc, using register declarations to nail a bunch of things into registers (e.g. the allocation pointer)
  • Post-process the generated assembler code with the Evil Mangler