Changes between Version 6 and Version 7 of Commentary/Libraries/Integer


Ignore:
Timestamp:
Jun 28, 2012 9:24:52 PM (22 months ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Libraries/Integer

    v6 v7  
    1515== The Integer interface  == 
    1616 
    17 All Integer implementations should export the same set of types and functions from `GHC.Integer`. These exports are used by the `base` package However, all of these types and functions must actually be defined in `GHC.Integer.Type`, so that GHC knows where to find them. 
     17All Integer implementations should export the same set of types and functions from `GHC.Integer` (within whatever `integer` package you are using). These exports are used by the `base` package However, all of these types and functions must actually be defined in `GHC.Integer.Type`, so that GHC knows where to find them. 
    1818 
    1919== How Integer is handled inside GHC  == 
    2020 
    21 Integers are represented using the `HsInteger` constructor of `HsLit` for the early phases of compilation (e.g. type checking), but for later stages, once we use the `Core` representation, they are converted to the `LitInteger` constructor of the `Literal` type by `mkIntegerExpr`. While `Integer`s aren't "machine literals" like the other `Literal` constructors, it is more convenient when writing rules to pretend that they are literals rather than having to understand their real core representation. We also carry around a `Type`, representing the `Integer` type, in the constructor, as we need access to it in a few functions (e.g. `literalType`). 
     21 * '''Front end'''.  Integers are represented using the `HsInteger` constructor of `HsLit` for the early phases of compilation (e.g. type checking) 
    2222 
    23 All of the types and functions in the `Integer` interface have built-in names, e.g. `plusIntegerName`, defined in [[GhcFile(compiler/prelude/PrelNames.lhs)]] and included in `basicKnownKeyNames`. This allows us to match on all of the functions in `builtinIntegerRules` in [[GhcFile(compiler/prelude/PrelRules.lhs)]], so we can constant-fold Integer expressions. 
     23 * '''Core'''.  In `Core` representation, an integer literal is represented by the LitInteger` constructor of the `Literal` type.  
     24{{{ 
     25data Literal = ... | LitInteger Integer Type 
     26}}} 
     27 While `Integer`s aren't "machine literals" like the other `Core` `Literal` constructors, it is more convenient when writing constant folding RULES to pretend that they are literals rather than having to understand their concrete representation. (Especially as the concrete representation varies from package to package.) We also carry around a `Type`, representing the `Integer` type, in the constructor, as we need access to it in a few functions (e.g. `literalType`). 
    2428 
    25 We keep the `LitInteger` representation as late as possible; in particular, it's important that this representation is used in unfoldings in interface files, so that constant folding can happen on expressions that get inlined. We only convert it to a proper core representation of Integer in [[GhcFile(compiler/coreSyn/CorePrep.lhs)]], which looks up the Id for `mkInteger` and uses it to build an expression like `mkInteger True [123, 456]` (where the `Bool` represents the sign, and the list of `Int`s are 31 bit chunks of the absolute value from lowest to highest). 
     29 * '''Constant folding'''.  There are many constant-folding optimisations for `Integer` expressed as built-in rules in [[GhcFile(compiler/prelude/PrelRules.lhs)]]; look at `builtinIntegerRules`.  All of the types and functions in the `Integer` interface have built-in names, e.g. `plusIntegerName`, defined in [[GhcFile(compiler/prelude/PrelNames.lhs)]] and included in `basicKnownKeyNames`. This allows us to match on all of the functions in `builtinIntegerRules` in [[GhcFile(compiler/prelude/PrelRules.lhs)]], so we can constant-fold Integer expressions. 
    2630 
    27 However, there is a special case for `Integer`s that are within the range of `Int` when the `integer-gmp` implementation is being used; in that case, we use the `S#` constructor (via `integerGmpSDataCon` in [[GhcFile(compiler/prelude/TysWiredIn.lhs)]]) to break the abstraction and directly create the datastructure. 
     31 * '''Representing integers'''.  We stick to the `LitInteger` representation (which hides the concrete representation) as late as possible in the compiler.   In particular, it's important that this representation is used in unfoldings in interface files, so that constant folding can happen on expressions that get inlined.  We finally convert `LitInteger` to a proper core representation of Integer in [[GhcFile(compiler/coreSyn/CorePrep.lhs)]], which looks up the Id for `mkInteger` and uses it to build an expression like `mkInteger True [123, 456]` (where the `Bool` represents the sign, and the list of `Int`s are 31 bit chunks of the absolute value from lowest to highest). 
    2832 
    29 Most of the functions in the Integer implementation are marked `NOINLINE`. This is because inlining them is generally not beneficial (any constant folding is already handled by the built-in rules), and in fact can be harmful: In the GMP representation, each argument can be one of two constructors (`S#` and `J#`), which leads to 2 branches. When you have a number of `Integer` arithmetic operations, you can get an exponential code explosion if they all get inlined. 
     33 However, there is a special case for `Integer`s that are within the range of `Int` when the `integer-gmp` implementation is being used; in that case, we use the `S#` constructor (via `integerGmpSDataCon` in [[GhcFile(compiler/prelude/TysWiredIn.lhs)]]) to break the abstraction and directly create the datastructure. 
     34 
     35 * '''Don't inline integer functions'''.  Most of the functions in the Integer implementation in the `integer` package are marked `NOINLINE`. For example in `integer-gmp` we have 
     36{{{ 
     37plusInteger :: Integer -> Integer -> Integer 
     38plusInteger (S# i1) (S# i2) = ... 
     39plusInteger (S# i1) (J# j1 j2) = ... 
     40...two more cases... 
     41}}} 
     42 Not only is this a big function to inline, but inlining it typically does no good because the representation of literals is abstact, so no pattern-matching cancellation happens.  And even if you have `(a+b+c)`, the conditionals mean that no cancellation happens, or you get an exponential code explosion! 
     43 
     44