wiki:Commentary/Compiler/CaseStudy/Bool

Version 1 (modified by jstolarek, 15 months ago) (diff)

--

Current Bool implementation

This page gives a hopefully comprehensive view of how Bool type is wired-in into the compiler. For easier location of functions within the source code I list the line numbers in which they appear. This may however change very quickly. If you find that is the case please update this wiki page. All paths to are given relative to $(TOP)/compiler where $(TOP) is the root of GHC sources.

Constants for Bool type and data constructors

All data constructors, type constructors and so on have their unique identifier which is needed during the compilation process. For the wired-in types these unique values are defined in the prelude/PrelNames.lhs. In case of Bool the relevant definitions look like this:

boolTyConKey, falseDataConKey, trueDataConKey :: Unique
boolTyConKey    = mkPreludeTyConUnique    4 -- line 1256
falseDataConKey = mkPreludeDataConUnique  4 -- line 1445
trueDataConKey  = mkPreludeDataConUnique 15 -- line 1451

A side note on generating Unique values

The mkPreludeTyConUnique and mkPreludeDataConUnique take care of generating a unique Unique value. They are defined in basicTypes/Unique.lhs:

data Unique = MkUnique FastInt

mkPreludeTyConUnique :: Int -> Unique
mkPreludeTyConUnique i = mkUnique '3' (3*i)

mkPreludeDataConUnique :: Int -> Unique
mkPreludeDataConUnique i = mkUnique '6' (2*i)

You will find definition of mkUnique :: Char -> Int -> Unique at line 135 in basicTypes/Unique.lhs.

Defining wired-in information about Bool

All the wired-in information that compiler needs to know about Bool is defined in prelude/TysWiredIn.lhs. This file exports following functions related to Bool:

  boolTy, boolTyCon, boolTyCon_RDR, boolTyConName,
  trueDataCon,  trueDataConId,  true_RDR,
  falseDataCon, falseDataConId, false_RDR,

They define Names, RdrNames, Type, TyCon, DataCons and Ids for Bool type and its two data constructors True and False.

Defining Names of type and data constructors

Having defined unique constants we can finally define all needed information about type and data constructors. These definitions might be tricky because they are mutually recursive.

Definitions of type and data constructor Name look like this (lines 185-188):

boolTyConName, falseDataConName, trueDataConName :: Name
boolTyConName     = mkWiredInTyConName   UserSyntax gHC_TYPES (fsLit "Bool")  boolTyConKey    boolTyCon
falseDataConName  = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "False") falseDataConKey falseDataCon
trueDataConName   = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "True")  trueDataConKey  trueDataCon

boolTyConKey, falseDataConKey and trueDataConKey are Unique values defined earlier. boolTyCon, falseDataCon and trueDataCon are yet undefined. Type of syntax is defined in basicTypes/Names.lhs, line 129:

data BuiltInSyntax = BuiltInSyntax | UserSyntax

BuiltInSyntax is used for things like (:), [] and tuples. All other things are UserSyntax. gHC_TYPES is a module GHC.Types to which these type and data constructors get assigned. It is defined in prelude/PrelNames.lhs:

gHC_TYPES = mkPrimModule (fsLit "GHC.Types") -- line 359

mkPrimModule :: FastString -> Module               -- line 435
mkPrimModule m = mkModule primPackageId (mkModuleNameFS m)

FastString is a string type based on ByteStrings and the fsLit function converts a standard Haskell Strings to FastString. See utils/FastString.lhs for more details.

A side note on creating wired-in Names

Name is a data type used across the compiler to give a unique name to something and identify where that thing originated from (see NameType for more details):

data Name = Name {
                n_sort :: NameSort,     -- What sort of name it is
                n_occ  :: !OccName,     -- Its occurrence name
                n_uniq :: FastInt,      
                n_loc  :: !SrcSpan      -- Definition site
            }
    deriving Typeable

data NameSort
  = External Module
  | WiredIn Module TyThing BuiltInSyntax
  | Internal
  | System

The mkWiredInTyConName and mkWiredInDataConName are functions that create Names for wired in types and data constructors. They are defined in prelude/TysWiredIn.lhs, lines 163-173:

mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name
mkWiredInTyConName built_in modu fs unique tycon
  = mkWiredInName modu (mkTcOccFS fs) unique
      (ATyCon tycon)  -- Relevant TyCon
      built_in

mkWiredInDataConName :: BuiltInSyntax -> Module -> FastString -> Unique -> DataCon -> Name
mkWiredInDataConName built_in modu fs unique datacon
  = mkWiredInName modu (mkDataOccFS fs) unique
      (ADataCon datacon)  -- Relevant DataCon
      built_in

The mkWiredInName is defined in basicTypes/Names.lhs (lines 279-283), and it just assigns values to fields of Name:

mkWiredInName :: Module -> OccName -> Unique -> TyThing -> BuiltInSyntax -> Name
mkWiredInName mod occ uniq thing built_in
  = Name { n_uniq = getKeyFastInt uniq,
           n_sort = WiredIn mod thing built_in,
           n_occ = occ, n_loc = wiredInSrcSpan}

RdrNames for Bool

Having defined Names for Bool, the RdrNames can be defined (prelude/TysWiredIn.lhs, lines 221-225):

boolTyCon_RDR, false_RDR, true_RDR :: RdrName
boolTyCon_RDR   = nameRdrName boolTyConName
false_RDR = nameRdrName falseDataConName
true_RDR  = nameRdrName trueDataConName

nameRdrName is defined in basicTypes.lhs (line 203) and it simply wraps the Name into one of RdrName's value constructors:

nameRdrName :: Name -> RdrName
nameRdrName name = Exact name

Type and Data constructors for Bool

Having defined the Names we can define type and data constructors for Bool. Lines 578--588 contain these definitions:

boolTy :: Type
boolTy = mkTyConTy boolTyCon

boolTyCon :: TyCon
boolTyCon = pcTyCon True NonRecursive boolTyConName
                    (Just (CType Nothing (fsLit "HsBool")))
                    [] [falseDataCon, trueDataCon]

falseDataCon, trueDataCon :: DataCon
falseDataCon = pcDataCon falseDataConName [] [] boolTyCon
trueDataCon  = pcDataCon trueDataConName  [] [] boolTyCon

Note that boolTyCon is on the list of wired in type constructors created by wiredInTyCons :: [TyCon] (line 138).

A side note on functions generating type and data constructors

types/TypeRep.lhs, lines 281-282:

mkTyConTy :: TyCon -> Type
mkTyConTy tycon = TyConApp tycon []

prelude/TysWiredIn.lhs, 247-257:

pcTyCon :: Bool -> RecFlag -> Name -> Maybe CType -> [TyVar] -> [DataCon] -> TyCon
pcTyCon is_enum is_rec name cType tyvars cons
  = tycon
  where
    tycon = mkAlgTyCon name
    (mkArrowKinds (map tyVarKind tyvars) liftedTypeKind)
                tyvars
                cType
                []    -- No stupid theta
    (DataTyCon cons is_enum)
    NoParentTyCon
                is_rec
    False   -- Not in GADT syntax

prelude/TysWiredIn.lhs, 261-297:

pcDataCon :: Name -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataCon = pcDataConWithFixity False

pcDataConWithFixity :: Bool -> Name -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataConWithFixity infx n = pcDataConWithFixity' infx n (incrUnique (nameUnique n))

pcDataConWithFixity' :: Bool -> Name -> Unique -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataConWithFixity' declared_infix dc_name wrk_key tyvars arg_tys tycon
  = data_con
  where
    data_con = mkDataCon dc_name declared_infix
                (map (const HsNoBang) arg_tys)
                []  -- No labelled fields
                tyvars
    []  -- No existential type variables
    []  -- No equality spec
    []  -- No theta
    arg_tys (mkTyConApp tycon (mkTyVarTys tyvars))
    tycon
    []  -- No stupid theta
                (mkDataConWorkId wrk_name data_con)
    NoDataConRep  -- Wired-in types are too simple to need wrappers

    modu     = ASSERT( isExternalName dc_name )
         nameModule dc_name
    wrk_occ  = mkDataConWorkerOcc (nameOccName dc_name)
    wrk_name = mkWiredInName modu wrk_occ wrk_key
           (AnId (dataConWorkId data_con)) UserSyntax

Generating Id for True and False data constructors

Finally, lines 590-592 contain definitions of Id for True and False data constructors:

falseDataConId, trueDataConId :: Id
falseDataConId = dataConWorkId falseDataCon
trueDataConId  = dataConWorkId trueDataCon

falseDataConId and trueDataConId just extract Id from previously defined data constructors. These definitions are from basicTypes/DataCon.lhs:

data DataCon -- line 253
  = MkData {
  ...
  dcWorkId :: Id -- line 360
  ...
 }

dataConWorkId :: DataCon -> Id -- line 736
dataConWorkId dc = dcWorkId dc

Final remarks

Remember that all the non-primitive wired-in things are also defined in GHC's libraries. Bool is defined in ghc-prim library, GHC.Types module: data {-# CTYPE "HsBool" #-} Bool = False | True See Wired-in and known-key things for more details