wiki:Commentary/Pipeline

Version 29 (modified by dterei, 3 years ago) (diff)

--

Video: Compilation Pipeline and interface files (17'30")

Overview

GHC is structured into two parts:

  • The ghc package (in subdirectory compiler), which implements almost all GHC's functionality. It is an ordinary Haskell library, and can be imported into a Haskell program by saying import GHC.
  • The ghc binary (in subdirectory ghc) which implements imports the ghc package, and implements the I/O for the ghci interactive loop.

Here's an overview of the module structure of the top levels of GHC library. (Note: more precisly, this is the plan. Currently the module Make below is glommed into the giant module GHC.)

          |---------------------------------|
          |              GHC                |
          | The root module for the GHC API |
          | Very little code;               |
          | just simple wrappers            |
          |---------------------------------|
                     /                \
                    /                  \
                   /                    \
 |------------------------|    |------------------------|
 |        GhcMake         |    |    InteractiveEval     |
 | Implements --make      |    | Stuff to support the   |
 | Deals with compiling   |    | GHCi interactive envt  |
 |    multiple modules    |    |                        |
 |------------------------|    |------------------------|
           |                                |
           |                                |
           |      --------------------      |
- - - - - -| - - -|     GhcMonad     |- - - | - - - - - - - -
           |      --------------------      |
           |                                |
           |                                |
 |-------------------------|                |
 |   DriverPipeline        |                |
 | Deals with compiling    |                |
 |  *a single module*      |                |
 | through all its stages  |                |
 | (cpp, unlit, compile,   |                |
 |  assemble, link etc)    |                |
 |-------------------------|                |
              \                             |
               \                            |
                \                           |  
         |----------------------------------------------|
         |                    HscMain                   |
         | Compiling a single module (or expression or  |
         | stmt) to bytecode, or to a M.hc or M.s file  |
         |----------------------------------------------|
              |      |       |         |       |
            Parse Rename Typecheck Optimise CodeGen

The compilation pipeline

When GHC compiles a module, it calls other programs, and generates a series of intermediate files. Here's a summary of the process. (source reference: compiler/main/DriverPipeline.hs)

We start with Foo.hs or Foo.lhs, the "l" specifing whether literate style is being used.

  • Run the unlit pre-processor, unlit, to remove the literate markup, generating Foo.lpp. The unlit processor is a C program kept in utils/unlit.
  • Run the C preprocessor, cpp, (if -cpp is specified), generating Foo.hspp.
  • Run the compiler itself. This does not start a separate process; it's just a call to a Haskell function. This step always generates an 'interface file' Foo.hi, and depending on what flags you give, it also generates a compiled file. As GHC supports three backend code generators currently (a native code generator, a c code generator and an llvm code generator) the possible range of outputs depends on the backend used. All three support assembly output:
    • Assembly code: flag -S, file Foo.s (supported by all three)
    • C code: flags -C, file Foo.hc (only supported by C backend)
  • In the -fvia-C case:
    • Run the C compiler on Foo.hc, to generate Foo.raw_s.
    • Run the Evil Mangler, generating Foo.s
  • If -split-objs is in force, run the splitter on Foo.s. This splits Foo.s into lots of small files. The idea is that the static linker will thereby avoid linking dead code.
  • Run the assembler on Foo.s, or if -split-objs is in force, on each individual assembly file.

The compiler pipeline

The compiler itself, independent of the external tools, is also structured as a pipeline. For details (and a diagram), see Commentary/Compiler/HscMain