Changes between Version 1 and Version 2 of Commentary/Compiler/CaseStudy/Bool


Timestamp:
Jan 23, 2013 4:52:43 PM (3 years ago)
Author:
jstolarek

= Current Bool implementation =

This page aims to give a comprehensive view of how the `Bool` type is wired into the compiler. To make functions easier to locate in the source code, I list the line numbers at which they appear. These may change quickly, however; if you find they are out of date, please update this wiki page. All paths are given relative to `$(TOP)/compiler`, where `$(TOP)` is the root of the GHC source tree.

== Constants for Bool type and data constructors ==

Every data constructor, type constructor and similar entity has a unique identifier that is needed during compilation. For wired-in types these unique values are defined in `prelude/PrelNames.lhs`. For `Bool` the relevant definitions look like this:

{{{
boolTyConKey, falseDataConKey, trueDataConKey :: Unique
boolTyConKey    = mkPreludeTyConUnique    4 -- line 1256
falseDataConKey = mkPreludeDataConUnique  4 -- line 1445
trueDataConKey  = mkPreludeDataConUnique 15 -- line 1451
}}}

=== A side note on generating Unique values ===

The functions `mkPreludeTyConUnique` and `mkPreludeDataConUnique` take care of generating a distinct `Unique` value. They are defined in `basicTypes/Unique.lhs`:

{{{
data Unique = MkUnique FastInt

mkPreludeTyConUnique :: Int -> Unique
mkPreludeTyConUnique i = mkUnique '3' (3*i)

mkPreludeDataConUnique :: Int -> Unique
mkPreludeDataConUnique i = mkUnique '6' (2*i)
}}}

You will find the definition of `mkUnique :: Char -> Int -> Unique` at line 135 of `basicTypes/Unique.lhs`.
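
To see why `boolTyConKey` and `falseDataConKey` do not clash even though both are built from the number 4, it may help to sketch how `mkUnique` can combine its two arguments. The following is a simplified model rather than GHC's code: it assumes that the tag character occupies the top bits and the number the low 24 bits of a plain `Int` (GHC stores a `FastInt`), and `unpkUnique` is a simplified decoder written for this sketch.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Simplified stand-in for GHC's Unique: a plain Int instead of a FastInt.
newtype Unique = MkUnique Int
  deriving (Eq, Show)

-- Assumed layout: tag character in the top bits, number in the low 24 bits.
mkUnique :: Char -> Int -> Unique
mkUnique tag n = MkUnique ((ord tag `shiftL` 24) .|. (n .&. 0xFFFFFF))

-- Split a Unique back into its tag character and its number.
unpkUnique :: Unique -> (Char, Int)
unpkUnique (MkUnique u) = (chr (u `shiftR` 24), u .&. 0xFFFFFF)

mkPreludeTyConUnique, mkPreludeDataConUnique :: Int -> Unique
mkPreludeTyConUnique   i = mkUnique '3' (3 * i)
mkPreludeDataConUnique i = mkUnique '6' (2 * i)

boolTyConKey, falseDataConKey, trueDataConKey :: Unique
boolTyConKey    = mkPreludeTyConUnique    4
falseDataConKey = mkPreludeDataConUnique  4
trueDataConKey  = mkPreludeDataConUnique 15
```

In this model `unpkUnique boolTyConKey` is `('3', 12)` and `unpkUnique falseDataConKey` is `('6', 8)`: the two keys differ in their tag even though both were built from 4.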

== Defining wired-in information about Bool ==

All the wired-in information that the compiler needs to know about `Bool` is defined in `prelude/TysWiredIn.lhs`. This file exports the following `Bool`-related definitions:

{{{
  boolTy, boolTyCon, boolTyCon_RDR, boolTyConName,
  trueDataCon,  trueDataConId,  true_RDR,
  falseDataCon, falseDataConId, false_RDR,
}}}

They define the `Name`s, `RdrName`s, `Type`, `TyCon`, `DataCon`s and `Id`s for the `Bool` type and its two data constructors, `True` and `False`.

== Defining Names of type and data constructors ==

Having defined the unique constants, we can now define all the needed information about the type and data constructors. These definitions can be tricky to follow because they are mutually recursive.

The `Name`s of the type and data constructors are defined as follows (lines 185-188):

{{{
boolTyConName, falseDataConName, trueDataConName :: Name
boolTyConName     = mkWiredInTyConName   UserSyntax gHC_TYPES (fsLit "Bool")  boolTyConKey    boolTyCon
falseDataConName  = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "False") falseDataConKey falseDataCon
trueDataConName   = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "True")  trueDataConKey  trueDataCon
}}}

`boolTyConKey`, `falseDataConKey` and `trueDataConKey` are the `Unique` values defined earlier. `boolTyCon`, `falseDataCon` and `trueDataCon` have not been defined yet at this point. The type of syntax is defined in `basicTypes/Name.lhs`, line 129:

{{{
data BuiltInSyntax = BuiltInSyntax | UserSyntax
}}}

`BuiltInSyntax` is used for things like `(:)`, `[]` and tuples. Everything else is `UserSyntax`. `gHC_TYPES` is the module `GHC.Types`, to which these type and data constructors belong. It is defined in `prelude/PrelNames.lhs`:

{{{
gHC_TYPES = mkPrimModule (fsLit "GHC.Types") -- line 359

mkPrimModule :: FastString -> Module               -- line 435
mkPrimModule m = mkModule primPackageId (mkModuleNameFS m)
}}}

`FastString` is a string type based on `ByteString`, and the `fsLit` function converts a standard Haskell `String` to a `FastString`. See `utils/FastString.lhs` for more details.

=== A side note on creating wired-in Names ===

`Name` is a data type used throughout the compiler to give a unique name to something and to identify where it originated (see [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/NameType NameType] for more details):

{{{
data Name = Name {
                n_sort :: NameSort,     -- What sort of name it is
                n_occ  :: !OccName,     -- Its occurrence name
                n_uniq :: FastInt,
                n_loc  :: !SrcSpan      -- Definition site
            }
    deriving Typeable

data NameSort
  = External Module
  | WiredIn Module TyThing BuiltInSyntax
  | Internal
  | System
}}}

`mkWiredInTyConName` and `mkWiredInDataConName` are functions that create `Name`s for wired-in type and data constructors. They are defined in `prelude/TysWiredIn.lhs`, lines 163-173:

{{{
mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name
mkWiredInTyConName built_in modu fs unique tycon
  = mkWiredInName modu (mkTcOccFS fs) unique
      (ATyCon tycon)  -- Relevant TyCon
      built_in

mkWiredInDataConName :: BuiltInSyntax -> Module -> FastString -> Unique -> DataCon -> Name
mkWiredInDataConName built_in modu fs unique datacon
  = mkWiredInName modu (mkDataOccFS fs) unique
      (ADataCon datacon)  -- Relevant DataCon
      built_in
}}}

`mkWiredInName` is defined in `basicTypes/Name.lhs` (lines 279-283); it just assigns values to the fields of `Name`:

{{{
mkWiredInName :: Module -> OccName -> Unique -> TyThing -> BuiltInSyntax -> Name
mkWiredInName mod occ uniq thing built_in
  = Name { n_uniq = getKeyFastInt uniq,
           n_sort = WiredIn mod thing built_in,
           n_occ = occ, n_loc = wiredInSrcSpan}
}}}
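
Because the `WiredIn` sort stores the `TyThing` itself, a wired-in `Name` leads directly to the entity it names, with no environment lookup. The sketch below illustrates this with toy types: module names, occurrence names and uniques are plain `String`s and `Int`s, `TyThing` only carries a label, and `wiredInThing` is a hypothetical helper written for this example.

```haskell
-- Toy stand-ins: module names, occurrence names and uniques are simplified,
-- and TyThing only records a label instead of a real TyCon/DataCon/Id.
data BuiltInSyntax = BuiltInSyntax | UserSyntax
  deriving (Eq, Show)

data TyThing = ATyCon String | ADataCon String | AnId String
  deriving (Eq, Show)

data NameSort
  = External String
  | WiredIn String TyThing BuiltInSyntax
  | Internal
  | System

data Name = Name
  { n_sort :: NameSort
  , n_occ  :: String
  , n_uniq :: Int
  }

mkWiredInName :: String -> String -> Int -> TyThing -> BuiltInSyntax -> Name
mkWiredInName modu occ uniq thing built_in
  = Name { n_uniq = uniq
         , n_sort = WiredIn modu thing built_in
         , n_occ  = occ }

-- Recover the entity stored in a wired-in Name; other sorts carry none.
wiredInThing :: Name -> Maybe TyThing
wiredInThing name = case n_sort name of
  WiredIn _ thing _ -> Just thing
  _                 -> Nothing

-- Example values; the unique numbers are arbitrary.
trueDataConName, localName :: Name
trueDataConName = mkWiredInName "GHC.Types" "True" 30 (ADataCon "True") UserSyntax
localName       = Name { n_sort = Internal, n_occ = "x", n_uniq = 1000 }
```

Here `wiredInThing trueDataConName` yields `Just (ADataCon "True")`, while `wiredInThing localName` yields `Nothing`.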

== !RdrNames for Bool ==

Having defined the `Name`s for `Bool`, we can define the [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/RdrNameType RdrName]s (`prelude/TysWiredIn.lhs`, lines 221-225):

{{{
boolTyCon_RDR, false_RDR, true_RDR :: RdrName
boolTyCon_RDR   = nameRdrName boolTyConName
false_RDR       = nameRdrName falseDataConName
true_RDR        = nameRdrName trueDataConName
}}}

`nameRdrName` is defined in `basicTypes/RdrName.lhs` (line 203) and simply wraps the `Name` in one of `RdrName`'s value constructors:

{{{
nameRdrName :: Name -> RdrName
nameRdrName name = Exact name
}}}
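
The point of wrapping a `Name` in `Exact` is that the resulting `RdrName` already identifies one specific entity, so it does not depend on what an occurrence name happens to mean in the current scope. The sketch below is a toy model of that idea: `OccName` and module names are plain `String`s, `Name` keeps only two fields, and `resolve` and `scope` are hypothetical names invented for the example, not part of GHC's renamer.

```haskell
-- Toy stand-ins: GHC's OccName, ModuleName, Module and Name are richer types.
type OccName = String

data Name = Name { nameOcc :: OccName, nameUniq :: Int }
  deriving (Eq, Show)

data RdrName
  = Unqual OccName         -- unqualified, as written in source: True
  | Qual String OccName    -- qualified by a module name: M.True
  | Orig String OccName    -- an original name: defining module plus occurrence
  | Exact Name             -- already tied to one specific Name
  deriving (Eq, Show)

nameRdrName :: Name -> RdrName
nameRdrName name = Exact name

-- Hypothetical resolver: an Exact name needs no lookup, an unqualified
-- one is looked up by its occurrence name. Other forms are not modelled.
resolve :: [(OccName, Name)] -> RdrName -> Maybe Name
resolve _   (Exact name) = Just name
resolve env (Unqual occ) = lookup occ env
resolve _   _            = Nothing

wiredInTrue, userTrue :: Name
wiredInTrue = Name "True" 30     -- stands for the wired-in constructor
userTrue    = Name "True" 1000   -- some other entity that is also called True

-- A scope in which the occurrence name True refers to the user's entity.
scope :: [(OccName, Name)]
scope = [("True", userTrue)]
```

In this model `resolve scope (Unqual "True")` finds `userTrue`, whereas `resolve scope (nameRdrName wiredInTrue)` still returns `wiredInTrue`.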

== Type and Data constructors for Bool ==

Having defined the `Name`s, we can define the type and data constructors for `Bool`. Lines 578-588 of `prelude/TysWiredIn.lhs` contain these definitions:

{{{
boolTy :: Type
boolTy = mkTyConTy boolTyCon

boolTyCon :: TyCon
boolTyCon = pcTyCon True NonRecursive boolTyConName
                    (Just (CType Nothing (fsLit "HsBool")))
                    [] [falseDataCon, trueDataCon]

falseDataCon, trueDataCon :: DataCon
falseDataCon = pcDataCon falseDataConName [] [] boolTyCon
trueDataCon  = pcDataCon trueDataConName  [] [] boolTyCon
}}}

Note that `boolTyCon` is in the list of wired-in type constructors, `wiredInTyCons :: [TyCon]` (line 138).
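
The `Name`s refer to the constructors and the constructors refer back to their `Name`s, which works because Haskell evaluates these top-level definitions lazily. The following sketch reproduces that cycle with toy types that keep only the fields needed to show it; the record fields and the helpers `constructorNames` and `parentName` are invented for the example and are not GHC's definitions.

```haskell
-- Toy stand-ins for GHC's types; only the fields needed to show the cycle.
data Name = Name { nameString :: String, nameThing :: TyThing }

data TyThing = ATyCon TyCon | ADataCon DataCon

data TyCon = TyCon { tyConName :: Name, tyConDataCons :: [DataCon] }

data DataCon = DataCon { dataConName :: Name, dataConTyCon :: TyCon }

-- Each Name mentions the constructor it names ...
boolTyConName, falseDataConName, trueDataConName :: Name
boolTyConName    = Name "Bool"  (ATyCon boolTyCon)
falseDataConName = Name "False" (ADataCon falseDataCon)
trueDataConName  = Name "True"  (ADataCon trueDataCon)

-- ... and each constructor mentions its Name.
boolTyCon :: TyCon
boolTyCon = TyCon boolTyConName [falseDataCon, trueDataCon]

falseDataCon, trueDataCon :: DataCon
falseDataCon = DataCon falseDataConName boolTyCon
trueDataCon  = DataCon trueDataConName  boolTyCon

-- Walk the cycle: type constructor -> data constructors -> their names.
constructorNames :: TyCon -> [String]
constructorNames = map (nameString . dataConName) . tyConDataCons

-- And back: data constructor -> its type constructor -> that constructor's name.
parentName :: DataCon -> String
parentName = nameString . tyConName . dataConTyCon
```

With these definitions `constructorNames boolTyCon` is `["False","True"]` and `parentName trueDataCon` is `"Bool"`; no field is forced before the whole cycle has been set up.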

=== A side note on functions generating type and data constructors ===

`types/TypeRep.lhs`, lines 281-282:

{{{
mkTyConTy :: TyCon -> Type
mkTyConTy tycon = TyConApp tycon []
}}}

`prelude/TysWiredIn.lhs`, lines 247-257:

{{{
pcTyCon :: Bool -> RecFlag -> Name -> Maybe CType -> [TyVar] -> [DataCon] -> TyCon
pcTyCon is_enum is_rec name cType tyvars cons
  = tycon
  where
    tycon = mkAlgTyCon name
                (mkArrowKinds (map tyVarKind tyvars) liftedTypeKind)
                tyvars
                cType
                []              -- No stupid theta
                (DataTyCon cons is_enum)
                NoParentTyCon
                is_rec
                False           -- Not in GADT syntax
}}}

`prelude/TysWiredIn.lhs`, lines 261-297:

{{{
pcDataCon :: Name -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataCon = pcDataConWithFixity False

pcDataConWithFixity :: Bool -> Name -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataConWithFixity infx n = pcDataConWithFixity' infx n (incrUnique (nameUnique n))

pcDataConWithFixity' :: Bool -> Name -> Unique -> [TyVar] -> [Type] -> TyCon -> DataCon
pcDataConWithFixity' declared_infix dc_name wrk_key tyvars arg_tys tycon
  = data_con
  where
    data_con = mkDataCon dc_name declared_infix
                (map (const HsNoBang) arg_tys)
                []      -- No labelled fields
                tyvars
                []      -- No existential type variables
                []      -- No equality spec
                []      -- No theta
                arg_tys (mkTyConApp tycon (mkTyVarTys tyvars))
                tycon
                []      -- No stupid theta
                (mkDataConWorkId wrk_name data_con)
                NoDataConRep    -- Wired-in types are too simple to need wrappers

    modu     = ASSERT( isExternalName dc_name )
               nameModule dc_name
    wrk_occ  = mkDataConWorkerOcc (nameOccName dc_name)
    wrk_name = mkWiredInName modu wrk_occ wrk_key
                             (AnId (dataConWorkId data_con)) UserSyntax
}}}
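
One detail worth noting in `pcDataConWithFixity` is `incrUnique (nameUnique n)`: the worker of a data constructor receives the `Unique` that immediately follows the constructor's own. This fits with `mkPreludeDataConUnique` multiplying its argument by 2, which leaves one free slot after every data constructor key. The sketch below models this with plain `Int`s; it ignores the tag character, assumes that `incrUnique` simply adds one, and the helpers `dataConKeys`, `allKeys` and `distinct` are hypothetical names used only here.

```haskell
import Data.List (nub)

-- Simplified stand-in for GHC's Unique; the tag character is left out.
newtype Unique = MkUnique Int
  deriving (Eq, Show)

mkPreludeDataConUnique :: Int -> Unique
mkPreludeDataConUnique i = MkUnique (2 * i)

-- Assumption: the next unique is obtained by adding one.
incrUnique :: Unique -> Unique
incrUnique (MkUnique i) = MkUnique (i + 1)

-- Keys used by the data constructor with prelude index i:
-- one for the constructor itself, the following one for its worker.
dataConKeys :: Int -> (Unique, Unique)
dataConKeys i = (key, incrUnique key)
  where
    key = mkPreludeDataConUnique i

-- All keys claimed by the prelude indices 0 .. n-1.
allKeys :: Int -> [Unique]
allKeys n = concat [ [k, w] | i <- [0 .. n - 1], let (k, w) = dataConKeys i ]

-- True when no key occurs twice.
distinct :: [Unique] -> Bool
distinct ks = length (nub ks) == length ks
```

In this model `False` (index 4) takes keys 8 and 9 and `True` (index 15) takes 30 and 31, and `distinct (allKeys 20)` holds, so constructor and worker keys never collide.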

== Generating Id for True and False data constructors ==

Finally, lines 590-592 contain the definitions of the `Id`s for the `True` and `False` data constructors:

{{{
falseDataConId, trueDataConId :: Id
falseDataConId = dataConWorkId falseDataCon
trueDataConId  = dataConWorkId trueDataCon
}}}

`falseDataConId` and `trueDataConId` just extract the `Id` from the previously defined data constructors. The relevant definitions are in `basicTypes/DataCon.lhs`:

{{{
data DataCon -- line 253
  = MkData {
      ...
      dcWorkId :: Id -- line 360
      ...
    }

dataConWorkId :: DataCon -> Id -- line 736
dataConWorkId dc = dcWorkId dc
}}}

== Final remarks ==

Remember that all the non-primitive wired-in things are also defined in GHC's libraries. `Bool` is defined in the `GHC.Types` module of the ghc-prim library: {{{data {-# CTYPE "HsBool" #-} Bool = False | True}}}. See [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/WiredIn Wired-in and known-key things] for more details.