Changes between Version 1 and Version 2 of Commentary/Compiler/CaseStudy/Bool


Timestamp:
Jan 23, 2013 4:52:43 PM (15 months ago)
Author:
jstolarek

  • Commentary/Compiler/CaseStudy/Bool

    v1 v2  
    1 = Current Bool implementation = 
    2  
    3 This page gives a hopefully comprehensive view of how the `Bool` type is wired into the compiler. To make functions easier to locate in the source code, I list the line numbers at which they appear. These may change very quickly; if you find that they are out of date, please update this wiki page. All paths are given relative to `$(TOP)/compiler`, where `$(TOP)` is the root of the GHC sources. 
    4  
    5 == Constants for Bool type and data constructors == 
    6  
    7 Every data constructor, type constructor and so on has a unique identifier that is needed during compilation. For the wired-in types these unique values are defined in `prelude/PrelNames.lhs`. In the case of `Bool` the relevant definitions look like this: 
    8  
    9 {{{ 
    10 boolTyConKey, falseDataConKey, trueDataConKey :: Unique 
    11 boolTyConKey    = mkPreludeTyConUnique    4 -- line 1256 
    12 falseDataConKey = mkPreludeDataConUnique  4 -- line 1445 
    13 trueDataConKey  = mkPreludeDataConUnique 15 -- line 1451 
    14 }}} 
    15  
    16 === A side note on generating Unique values === 
    17  
    18 The `mkPreludeTyConUnique` and `mkPreludeDataConUnique` functions turn a small integer into a distinct `Unique` value. They are defined in `basicTypes/Unique.lhs`: 
    19  
    20 {{{ 
    21 data Unique = MkUnique FastInt 
    22  
    23 mkPreludeTyConUnique :: Int -> Unique 
    24 mkPreludeTyConUnique i = mkUnique '3' (3*i) 
    25  
    26 mkPreludeDataConUnique :: Int -> Unique 
    27 mkPreludeDataConUnique i = mkUnique '6' (2*i) 
    28 }}} 
    29  
    30 You will find the definition of `mkUnique :: Char -> Int -> Unique` at line 135 in `basicTypes/Unique.lhs`. 
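
The following is a minimal sketch of how such keys could be packed, not the code from `basicTypes/Unique.lhs`. It assumes a plain `Int` in place of `FastInt` and assumes that the tag character sits above a 24-bit number; neither detail is shown on this page. It also illustrates why data constructor keys are spaced by 2: `pcDataConWithFixity` (quoted later on this page) takes the worker key as `incrUnique` of the data constructor key, so the odd slot has to stay free.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- Simplified stand-in for GHC's Unique (plain Int instead of FastInt).
newtype Unique = MkUnique Int
  deriving (Eq, Show)

-- Assumed layout: tag character above bit 24, number in the low 24 bits.
mkUnique :: Char -> Int -> Unique
mkUnique tag n = MkUnique ((ord tag `shiftL` 24) .|. (n .&. 0xFFFFFF))

-- Split a Unique back into its tag character and number.
unpkUnique :: Unique -> (Char, Int)
unpkUnique (MkUnique u) = (chr (u `shiftR` 24), u .&. 0xFFFFFF)

-- The next Unique in the same tag class.
incrUnique :: Unique -> Unique
incrUnique (MkUnique u) = MkUnique (u + 1)

mkPreludeTyConUnique, mkPreludeDataConUnique :: Int -> Unique
mkPreludeTyConUnique   i = mkUnique '3' (3 * i)
mkPreludeDataConUnique i = mkUnique '6' (2 * i)

boolTyConKey, falseDataConKey, trueDataConKey :: Unique
boolTyConKey    = mkPreludeTyConUnique    4
falseDataConKey = mkPreludeDataConUnique  4
trueDataConKey  = mkPreludeDataConUnique 15
```

Under these assumptions `unpkUnique boolTyConKey` is `('3', 12)`, `unpkUnique falseDataConKey` is `('6', 8)`, and the worker key `incrUnique falseDataConKey` unpacks to `('6', 9)`, which no other data constructor key can occupy.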
    31  
    32 == Defining wired-in information about Bool == 
    33  
    34 All the wired-in information that the compiler needs to know about `Bool` is defined in `prelude/TysWiredIn.lhs`. This file exports the following definitions related to `Bool`: 
    35  
    36 {{{ 
    37   boolTy, boolTyCon, boolTyCon_RDR, boolTyConName, 
    38   trueDataCon,  trueDataConId,  true_RDR, 
    39   falseDataCon, falseDataConId, false_RDR, 
    40 }}} 
    41  
    42 They define the `Name`s, `RdrName`s, `Type`, `TyCon`, `DataCon`s and `Id`s for the `Bool` type and its two data constructors, `True` and `False`. 
    43  
    44 == Defining Names of type and data constructors == 
    45  
    46 Having defined the unique constants, we can define all the needed information about the type and data constructors. These definitions can be tricky to follow because they are mutually recursive. 
    47  
    48 The definitions of the type and data constructor `Name`s look like this (lines 185-188): 
    49  
    50 {{{ 
    51 boolTyConName, falseDataConName, trueDataConName :: Name 
    52 boolTyConName     = mkWiredInTyConName   UserSyntax gHC_TYPES (fsLit "Bool")  boolTyConKey    boolTyCon 
    53 falseDataConName  = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "False") falseDataConKey falseDataCon 
    54 trueDataConName   = mkWiredInDataConName UserSyntax gHC_TYPES (fsLit "True")  trueDataConKey  trueDataCon 
    55 }}} 
    56  
    57 `boolTyConKey`, `falseDataConKey` and `trueDataConKey` are the `Unique` values defined earlier. `boolTyCon`, `falseDataCon` and `trueDataCon` are not yet defined. The type describing the kind of syntax is defined in `basicTypes/Name.lhs`, line 129: 
    58  
    59 {{{ 
    60 data BuiltInSyntax = BuiltInSyntax | UserSyntax 
    61 }}} 
    62  
    63 `BuiltInSyntax` is used for things like `(:)`, `[]` and tuples. Everything else is `UserSyntax`. `gHC_TYPES` is the module `GHC.Types`, to which these type and data constructors are assigned. It is defined in `prelude/PrelNames.lhs`: 
    64  
    65 {{{ 
    66 gHC_TYPES = mkPrimModule (fsLit "GHC.Types") -- line 359 
    67  
    68 mkPrimModule :: FastString -> Module               -- line 435 
    69 mkPrimModule m = mkModule primPackageId (mkModuleNameFS m) 
    70 }}} 
    71  
    72 `FastString` is a string type based on `ByteString`, and the `fsLit` function converts a standard Haskell `String` to a `FastString`. See `utils/FastString.lhs` for more details. 
    73  
    74 === A side note on creating wired-in Names === 
    75  
    76 `Name` is a data type used across the compiler to give a unique name to something and identify where that thing originated from (see [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/NameType NameType] for more details): 
    77  
    78 {{{ 
    79 data Name = Name { 
    80                 n_sort :: NameSort,     -- What sort of name it is 
    81                 n_occ  :: !OccName,     -- Its occurrence name 
    82                 n_uniq :: FastInt,       
    83                 n_loc  :: !SrcSpan      -- Definition site 
    84             } 
    85     deriving Typeable 
    86  
    87 data NameSort 
    88   = External Module 
    89   | WiredIn Module TyThing BuiltInSyntax 
    90   | Internal 
    91   | System 
    92 }}} 
    93  
    94 `mkWiredInTyConName` and `mkWiredInDataConName` create `Name`s for wired-in type constructors and data constructors. They are defined in `prelude/TysWiredIn.lhs`, lines 163-173: 
    95  
    96 {{{ 
    97 mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name 
    98 mkWiredInTyConName built_in modu fs unique tycon 
    99   = mkWiredInName modu (mkTcOccFS fs) unique 
    100       (ATyCon tycon)  -- Relevant TyCon 
    101       built_in 
    102  
    103 mkWiredInDataConName :: BuiltInSyntax -> Module -> FastString -> Unique -> DataCon -> Name 
    104 mkWiredInDataConName built_in modu fs unique datacon 
    105   = mkWiredInName modu (mkDataOccFS fs) unique 
    106       (ADataCon datacon)  -- Relevant DataCon 
    107       built_in 
    108 }}} 
    109  
    110 `mkWiredInName` is defined in `basicTypes/Name.lhs` (lines 279-283), and it just assigns values to the fields of `Name`: 
    111  
    112 {{{ 
    113 mkWiredInName :: Module -> OccName -> Unique -> TyThing -> BuiltInSyntax -> Name 
    114 mkWiredInName mod occ uniq thing built_in 
    115   = Name { n_uniq = getKeyFastInt uniq, 
    116            n_sort = WiredIn mod thing built_in, 
    117            n_occ = occ, n_loc = wiredInSrcSpan} 
    118 }}} 
    119  
    120 == !RdrNames for Bool == 
    121  
    122 Having defined `Name`s for `Bool`, the [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/RdrNameType RdrName]s can be defined (`prelude/TysWiredIn.lhs`, lines 221-225): 
    123  
    124 {{{ 
    125 boolTyCon_RDR, false_RDR, true_RDR :: RdrName 
    126 boolTyCon_RDR   = nameRdrName boolTyConName 
    127 false_RDR = nameRdrName falseDataConName 
    128 true_RDR  = nameRdrName trueDataConName 
    129 }}} 
    130  
    131 `nameRdrName` is defined in `basicTypes/RdrName.lhs` (line 203) and it simply wraps the `Name` in one of `RdrName`'s value constructors: 
    132  
    133 {{{ 
    134 nameRdrName :: Name -> RdrName 
    135 nameRdrName name = Exact name 
    136 }}} 
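
To illustrate what wrapping a `Name` in `Exact` buys, here is a toy model. The types are simplified stand-ins (the real `RdrName` has more constructors and the real `Name` has more fields), and `resolve` is a hypothetical lookup function written for this sketch, not a function from GHC.

```haskell
-- Toy stand-ins for the compiler's types.
newtype OccName = OccName String
  deriving (Eq, Show)

data Name = Name { nameOcc :: OccName, nameKey :: Int }
  deriving (Eq, Show)

data RdrName
  = Unqual OccName   -- a name as the user wrote it, still to be resolved
  | Exact Name       -- a name that is already fully resolved
  deriving (Eq, Show)

nameRdrName :: Name -> RdrName
nameRdrName name = Exact name

-- Hypothetical resolver: Exact names bypass the environment entirely,
-- unqualified names must be found in it.
resolve :: [(OccName, Name)] -> RdrName -> Maybe Name
resolve _   (Exact n)    = Just n
resolve env (Unqual occ) = lookup occ env

-- The key 12 is an arbitrary illustrative number.
boolTyConName :: Name
boolTyConName = Name (OccName "Bool") 12

boolTyCon_RDR :: RdrName
boolTyCon_RDR = nameRdrName boolTyConName
```

In this model `resolve [] boolTyCon_RDR` yields `Just boolTyConName` even with an empty environment, whereas `resolve [] (Unqual (OccName "Bool"))` yields `Nothing`.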
    137  
    138 == Type and Data constructors for Bool == 
    139  
    140 Having defined the `Name`s, we can define the type and data constructors for `Bool`. Lines 578-588 contain these definitions: 
    141  
    142 {{{ 
    143 boolTy :: Type 
    144 boolTy = mkTyConTy boolTyCon 
    145  
    146 boolTyCon :: TyCon 
    147 boolTyCon = pcTyCon True NonRecursive boolTyConName 
    148                     (Just (CType Nothing (fsLit "HsBool"))) 
    149                     [] [falseDataCon, trueDataCon] 
    150  
    151 falseDataCon, trueDataCon :: DataCon 
    152 falseDataCon = pcDataCon falseDataConName [] [] boolTyCon 
    153 trueDataCon  = pcDataCon trueDataConName  [] [] boolTyCon 
    154 }}} 
    155  
    156 Note that `boolTyCon` is in the list of wired-in type constructors, `wiredInTyCons :: [TyCon]` (line 138). 
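
The mutual recursion between the `Name`s, the `TyCon` and the `DataCon`s works because of laziness: each top-level value refers to the others without forcing them. The sketch below reproduces only the shape of that cycle with toy record types; the field names and the helper `dataConNamesOf` are made up for illustration and carry far less information than the real types.

```haskell
-- Toy types that refer to each other the way the real ones do.
data Name    = Name    { nameOcc :: String, nameThing :: TyThing }
data TyThing = ATyCon TyCon | ADataCon DataCon
data TyCon   = TyCon   { tyConName :: Name, tyConDataCons :: [DataCon] }
data DataCon = DataCon { dataConName :: Name, dataConTyCon :: TyCon }

-- Names mention the constructors ...
boolTyConName, falseDataConName, trueDataConName :: Name
boolTyConName    = Name "Bool"  (ATyCon boolTyCon)
falseDataConName = Name "False" (ADataCon falseDataCon)
trueDataConName  = Name "True"  (ADataCon trueDataCon)

-- ... and the constructors mention the names and each other.
boolTyCon :: TyCon
boolTyCon = TyCon boolTyConName [falseDataCon, trueDataCon]

falseDataCon, trueDataCon :: DataCon
falseDataCon = DataCon falseDataConName boolTyCon
trueDataCon  = DataCon trueDataConName  boolTyCon

-- Hypothetical helper that walks the cycle: from any of the three names,
-- reach the type constructor and list its data constructors.
dataConNamesOf :: Name -> [String]
dataConNamesOf n = case nameThing n of
  ATyCon tc   -> map (nameOcc . dataConName) (tyConDataCons tc)
  ADataCon dc -> dataConNamesOf (tyConName (dataConTyCon dc))
```

Both `dataConNamesOf boolTyConName` and `dataConNamesOf trueDataConName` evaluate to `["False","True"]`. In this toy model, putting a strictness annotation on `nameThing` would make the definitions loop; note that in the `Name` record quoted above `n_sort` is likewise the field without a bang.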
    157  
    158 === A side note on functions generating type and data constructors === 
    159  
    160 `types/TypeRep.lhs`, lines 281-282: 
    161  
    162 {{{ 
    163 mkTyConTy :: TyCon -> Type 
    164 mkTyConTy tycon = TyConApp tycon [] 
    165 }}} 
    166  
    167 `prelude/TysWiredIn.lhs`, lines 247-257: 
    168  
    169 {{{ 
    170 pcTyCon :: Bool -> RecFlag -> Name -> Maybe CType -> [TyVar] -> [DataCon] -> TyCon 
    171 pcTyCon is_enum is_rec name cType tyvars cons 
    172   = tycon 
    173   where 
    174     tycon = mkAlgTyCon name 
    175                 (mkArrowKinds (map tyVarKind tyvars) liftedTypeKind) 
    176                 tyvars 
    177                 cType 
    178                 []    -- No stupid theta 
    179                 (DataTyCon cons is_enum) 
    180                 NoParentTyCon 
    181                 is_rec 
    182                 False   -- Not in GADT syntax 
    183 }}} 
    184  
    185 `prelude/TysWiredIn.lhs`, lines 261-297: 
    186  
    187 {{{ 
    188 pcDataCon :: Name -> [TyVar] -> [Type] -> TyCon -> DataCon 
    189 pcDataCon = pcDataConWithFixity False 
    190  
    191 pcDataConWithFixity :: Bool -> Name -> [TyVar] -> [Type] -> TyCon -> DataCon 
    192 pcDataConWithFixity infx n = pcDataConWithFixity' infx n (incrUnique (nameUnique n)) 
    193  
    194 pcDataConWithFixity' :: Bool -> Name -> Unique -> [TyVar] -> [Type] -> TyCon -> DataCon 
    195 pcDataConWithFixity' declared_infix dc_name wrk_key tyvars arg_tys tycon 
    196   = data_con 
    197   where 
    198     data_con = mkDataCon dc_name declared_infix 
    199                 (map (const HsNoBang) arg_tys) 
    200                 []  -- No labelled fields 
    201                 tyvars 
    202                 []  -- No existential type variables 
    203                 []  -- No equality spec 
    204                 []  -- No theta 
    205                 arg_tys (mkTyConApp tycon (mkTyVarTys tyvars)) 
    206                 tycon 
    207                 []  -- No stupid theta 
    208                 (mkDataConWorkId wrk_name data_con) 
    209                 NoDataConRep  -- Wired-in types are too simple to need wrappers 
    210  
    211     modu     = ASSERT( isExternalName dc_name ) 
    212                nameModule dc_name 
    213     wrk_occ  = mkDataConWorkerOcc (nameOccName dc_name) 
    214     wrk_name = mkWiredInName modu wrk_occ wrk_key 
    215                              (AnId (dataConWorkId data_con)) UserSyntax 
    216 }}} 
    217  
    218 == Generating Id for True and False data constructors == 
    219  
    220 Finally, lines 590-592 contain the definitions of the `Id`s for the `True` and `False` data constructors: 
    221  
    222 {{{ 
    223 falseDataConId, trueDataConId :: Id 
    224 falseDataConId = dataConWorkId falseDataCon 
    225 trueDataConId  = dataConWorkId trueDataCon 
    226 }}} 
    227  
    228 `falseDataConId` and `trueDataConId` just extract the worker `Id` from the previously defined data constructors. The relevant definitions are in `basicTypes/DataCon.lhs`: 
    229  
    230 {{{ 
    231 data DataCon -- line 253 
    232   = MkData { 
    233   ... 
    234   dcWorkId :: Id -- line 360 
    235   ... 
    236  } 
    237  
    238 dataConWorkId :: DataCon -> Id -- line 736 
    239 dataConWorkId dc = dcWorkId dc 
    240 }}} 
    241  
    242 == Final remarks == 
    243  
    244 Remember that all the non-primitive wired-in things are also defined in GHC's libraries. `Bool` is defined in the `ghc-prim` library, in the `GHC.Types` module: {{{data {-# CTYPE "HsBool" #-} Bool = False | True}}} See [http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/WiredIn Wired-in and known-key things] for more details. 
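
The library declaration lists `False` before `True`, the same order in which `falseDataCon` and `trueDataCon` are passed to `pcTyCon` above. From ordinary Haskell code that order can be observed through the standard `Enum`, `Ord` and `Bounded` instances. The small check below is only an illustration; `boolTags` is a name made up for this sketch.

```haskell
-- Pair each Bool value with its position as reported by the Enum instance.
boolTags :: [(Bool, Int)]
boolTags = [ (b, fromEnum b) | b <- [minBound .. maxBound] ]

-- False comes first in the declaration, so it is the smaller value.
falseIsSmaller :: Bool
falseIsSmaller = False < True
```

Here `boolTags` is `[(False,0),(True,1)]` and `falseIsSmaller` is `True`.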