Version 21 (modified by lelf, 6 years ago)

# The GHC Commentary - Coding Style Guidelines for the compiler

This is a rough description of some of the coding practices and style that we use for Haskell code inside the compiler. (For run-time system code see the Coding Style Guidelines for RTS C code.)

The general rule is to stick to the same coding style as is already used in the file you're editing. If you must make stylistic changes, commit them separately from functional changes, so that someone looking back through the change logs can easily distinguish them.

## Warnings

We are aiming to make the GHC code warning-free, for all warnings turned on by

-Wall -fno-warn-name-shadowing


The build automatically sets these flags for the stage 2 compiler.

The validate script, which is used to test the build before committing, additionally sets the -Werror flag, so that the code must be warning-free to pass validation. The -Werror flag is not set during normal builds, so warnings will be printed but won't halt the build.

Currently we are some way from our goal, so many modules have a

{-# OPTIONS -w #-}


pragma; you are encouraged to remove this pragma and fix any warnings when working on a module.
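
As a minimal sketch of what a warning-clean module looks like under these flags, the hypothetical module below sets them in a file-level pragma rather than relying on the build system. The names `lengthTwice` and `go` are invented for illustration; only the two flags come from the text above.

```haskell
{-# OPTIONS_GHC -Wall -fno-warn-name-shadowing #-}
module Main (main) where

-- Hypothetical helper, not taken from GHC. The argument of the local
-- function reuses the name xs; this shadowing is what
-- -fno-warn-name-shadowing lets through without a warning.
lengthTwice :: [a] -> Int
lengthTwice xs = go xs
  where
    go xs = 2 * length xs

-- -Wall asks for a type signature on every top-level binding,
-- main included.
main :: IO ()
main = print (lengthTwice "abc")
```

Dropping `-fno-warn-name-shadowing` from the pragma should make the inner `xs` produce a shadowing warning, which is the kind of noise the flag is there to suppress.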

## To literate or not to literate?

In GHC we use a mixture of literate (.lhs) and non-literate (.hs) source. I (Simon M.) prefer to use non-literate style, because I think the \begin{code}..\end{code} markers clutter up the source too much, and I like to use Haddock-style comments (we haven't tried processing the whole of GHC with Haddock yet, though).

## To CPP or not to CPP?

We pass all the compiler sources through CPP. The -cpp flag is always added by the build system. The following CPP symbols are used throughout the compiler:

DEBUG
Used to enable extra checks and debugging output in the compiler. The ASSERT macro (see HsVersions.h) provides assertions which disappear when DEBUG is not defined.

HsVersions.h provides a macro debugIsOn which is defined to be True when DEBUG is defined and False otherwise. The ideal way to provide debugging output is to use a Haskell expression "if debugIsOn then ... else ..." to arrange that the compiler will be silent when DEBUG is off (unless of course something goes wrong or the verbosity level is nonzero). The advantage of this scheme is that all code is typechecked on every compilation, no matter what the setting of DEBUG. When option -O is used, GHC will easily sweep away the unreachable code.

As a last resort, debugging code can be placed inside #ifdef DEBUG, but since this strategy guarantees that only a fraction of the code is seen by the compiler on any one compilation, it is to be avoided when possible.

Regarding performance, a good rule of thumb is that DEBUG shouldn't add more than about 10-20% to the compilation time. This is the case at the moment. If it gets too expensive, we won't use it. For more expensive runtime checks, consider adding a flag - see for example -dcore-lint.
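
The preference for `if debugIsOn` over `#ifdef DEBUG` can be sketched as follows. This is an illustration under assumptions, not GHC code: `debugIsOn` is modelled here as a plain Bool driven by the DEBUG symbol (the real definition in HsVersions.h may differ), and `isSorted` and `sortedInsert` are hypothetical names.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Stand-in for the debugIsOn macro from HsVersions.h. Assumption: a plain
-- Bool driven by the DEBUG symbol; the real definition may differ.
debugIsOn :: Bool
#ifdef DEBUG
debugIsOn = True
#else
debugIsOn = False
#endif

-- Hypothetical invariant check, not a GHC function.
isSorted :: [Int] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- Hypothetical operation that expects a sorted list. The check is ordinary
-- Haskell, so it is typechecked whether or not DEBUG is defined.
sortedInsert :: Int -> [Int] -> [Int]
sortedInsert x xs =
  if debugIsOn && not (isSorted xs)
    then error "sortedInsert: input list is not sorted"
    else before ++ x : after
  where
    (before, after) = span (< x) xs

main :: IO ()
main = print (sortedInsert 3 [1, 2, 4, 5])
```

The only `#ifdef` is the one that defines the Bool. Everything that uses it is seen by the typechecker on every build, and with DEBUG off the condition is a constant False that the optimiser can discard.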

GHCI
Enables GHCi support, including the byte code generator and interactive user interface. This isn't the default, because the compiler needs to be bootstrapped with itself in order for GHCi to work properly. The reason is that the byte-code compiler and linker are quite closely tied to the runtime system, so it is essential that GHCi is linked with the most up-to-date RTS. Another reason is that the representation of certain datatypes must be consistent between GHCi and its libraries, and if these were inconsistent then disaster could follow.

Platform tests
There are three platforms of interest to GHC:

• The Build platform: This is the platform on which we are building GHC.
• The Host platform: This is the platform on which we are going to run this GHC binary, and associated tools.
• The Target platform: This is the platform for which this GHC binary will generate code.

At the moment, there is very limited support for having different values for build, host, and target. In particular:

• The build platform is currently always the same as the host platform. The build process needs to use some of the tools in the source tree, for example ghc-pkg and hsc2hs.
• If the target platform differs from the host platform, then this is generally for the purpose of building .hc files from Haskell source for porting GHC to the target platform. Full cross-compilation isn't supported (yet).

In the compiler's source code, you may make use of the following CPP symbols:

xxx_TARGET_ARCH
xxx_TARGET_VENDOR
xxx_TARGET_OS
xxx_HOST_ARCH
xxx_HOST_VENDOR
xxx_HOST_OS


where xxx is the appropriate value, e.g. i386_TARGET_ARCH.
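
A platform test on one of these symbols might look like the sketch below. It assumes the symbols are supplied by the build (or by the compiler when CPP is enabled) and tests them with `defined(...)` so that it does not depend on their value. `hostArchName` is a hypothetical helper, and which branch is taken depends on the machine doing the compiling.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Hypothetical helper, not part of GHC. Each branch tests one xxx_HOST_ARCH
-- symbol; the fallback covers every architecture not listed here, and also
-- the case where none of the symbols is defined.
hostArchName :: String
#if defined(i386_HOST_ARCH)
hostArchName = "i386"
#elif defined(x86_64_HOST_ARCH)
hostArchName = "x86_64"
#else
hostArchName = "other"
#endif

main :: IO ()
main = putStrLn ("host architecture: " ++ hostArchName)
```

Keeping the conditional around a single small definition, rather than scattering `#if` through larger functions, limits how much code is hidden from the compiler on any one platform.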

## Compiler versions and language extensions

GHC must be compilable by every major version of GHC from 6.2 onwards, and itself. It isn't necessary for it to be compilable by every intermediate development version (that includes last week's darcs sources).

To maintain compatibility, use HsVersions.h (see below) where possible, and try to avoid using #ifdef in the source itself.

Also, it is necessary to avoid certain language extensions. In particular, the ScopedTypeVariables extension must not be used.

## The source file

We now describe a typical source file, annotating stylistic choices as we go.

### The OPTIONS pragma

An {-# OPTIONS_GHC ... #-} pragma is optional, but if present it should go right at the top of the file. Things you might want to put in OPTIONS include:

• #include options to bring into scope prototypes for FFI declarations
• -fvia-C if you know that this module won't compile with the native code generator. (Deprecated: everything should compile with the NCG nowadays, but that wasn't always the case.)

Don't bother putting -cpp or -fglasgow-exts in the OPTIONS pragma; these are already added to the command line by the build system.

### Exports

module Foo (
    T(..),
    foo,         -- :: T -> T
  ) where


We usually (99% of the time) include an export list. The only exceptions are perhaps where the export list would list absolutely everything in the module, and even then sometimes we do it anyway.

It's helpful to give type signatures inside comments in the export list, but hard to keep them consistent, so we don't always do that.

### HsVersions.h

HsVersions.h is a CPP header file containing a number of macros that help smooth out the differences between compiler versions. It defines, for example, macros for library module names which have moved between versions. Take a look at compiler/HsVersions.h.

#include "HsVersions.h"


### Imports

List imports in the following order:

• Local to this subsystem (or directory) first
• Compiler imports, generally ordered from specific to generic (i.e. modules from utils/ and basicTypes/ usually come last)
• Library imports
• Standard Haskell 98 imports last

-- friends

-- GHC
import CoreSyn
import Id
import BasicTypes

-- libraries
import Data.IORef

-- std
import Data.List
import Data.Maybe


Import library modules from the core packages only (core packages are listed in libraries/core-packages). Use #defines in HsVersions.h when the module names differ between versions of GHC. For code inside #ifdef GHCI, don't worry about GHC versioning issues, because this code is only ever compiled by this very version of GHC.

Do not use explicit import lists, except to resolve name clashes. There are several reasons for this:

• They slow down development: almost every change is accompanied by an import list change.
• They cause spurious conflicts between developers.
• They lead to useless warnings about unused imports, and time wasted trying to keep the import declarations "minimal".
• GHC's warnings are useful for detecting unnecessary imports: see -fwarn-unused-imports.
• TAGS is a good way to find out where an identifier is defined (use make tags in ghc/compiler, and hit M-. in emacs).

If the module can be compiled multiple ways (e.g. GHCI vs. non-GHCI), make sure the imports are properly #ifdef'ed too, so as to avoid spurious unused import warnings.
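
A minimal sketch of a guarded import, assuming a module whose GHCI variant needs `Data.IORef` while the plain variant does not. `describeBuild` is a hypothetical function; only the GHCI symbol comes from the text above, and it is undefined unless the build passes it in.

```haskell
{-# LANGUAGE CPP #-}
{-# OPTIONS_GHC -Wall #-}
module Main (main) where

#ifdef GHCI
-- Only the GHCI variant uses IORef, so the import sits under the same
-- guard as the code that needs it.
import Data.IORef
#endif

-- Hypothetical function with one definition per build variant.
describeBuild :: IO String
#ifdef GHCI
describeBuild = do
  ref <- newIORef "with GHCi support"
  readIORef ref
#else
describeBuild = return "without GHCi support"
#endif

main :: IO ()
main = describeBuild >>= putStrLn
```

If the import were left unguarded, the non-GHCI build would report `Data.IORef` as unused under `-Wall`, and validation with `-Werror` would then fail on it.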

### General Style

It's much better to write code that is transparent than to write code that is short.

Often it's better to write out the code longhand than to reuse a generic abstraction (not always, of course). Sometimes it's better to duplicate some similar code than to try to construct an elaborate generalisation with only two instances. Remember: other people have to be able to quickly understand what you've done. Overuse of abstractions just serves to obscure the really tricky stuff, of which there's no shortage in GHC.
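
A small illustration of the trade-off, using invented names that do not come from GHC: both functions below pair each list element with its successor. The first is shorter but leans on the Applicative instance for functions; the second says what it does.

```haskell
module Main (main) where

-- Short but opaque: relies on the Applicative instance for functions,
-- where (f <*> g) x = f x (g x).
adjacentPairsTerse :: [a] -> [(a, a)]
adjacentPairsTerse = zip <*> drop 1

-- Longhand and transparent: pair each element with its successor.
adjacentPairs :: [a] -> [(a, a)]
adjacentPairs xs = zip xs (drop 1 xs)

main :: IO ()
main = do
  print (adjacentPairs [1, 2, 3 :: Int])
  print (adjacentPairsTerse [1, 2, 3 :: Int])
```

The two definitions compute the same result, so the only difference is how long a reader needs to see that.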