wiki:StrictPragma

The Strict Language Pragma

This page explains the motivation, semantics, and implementation of a new language pragma named Strict.

The Problem

High-performance Haskell code (e.g. numeric code) can sometimes be littered with bang patterns, making it harder to read. The reason is that laziness isn't the right default in this particular code, but the programmer has no way to say that except by repeatedly adding bang patterns. This page proposes a new language pragma that allows the programmer to switch the default on a per module basis.

Semantics

Informally the -XStrict language extension switches functions, data types, and bindings to be strict by default, allowing optional laziness by adding ~ in front of a variable. This essentially reverses the present situation where laziness is default and strictness can be optionally had by adding ! in front of a variable.

  • Function definitions. When the user writes
    f x = ...
    
    we interpret it as if she had written
    f !x = ...
    
    Adding ~ in front of x gives the old lazy behavior.
  • Let/where bindings. When the user writes
    let x = ...
    let pat = ...
    
    we interpret it as if she had written
    let !x = ...
    let !pat = ...
    
    Adding ~ in front of x gives the old lazy behavior. Notice that we do not put bangs on nested patterns. For example
     let (p,q) = if flob then (undefined, undefined) else (True, False)
     in ...
    
    will behave like
     let !(p,q) = if flob then (undefined, undefined) else (True, False)
    
    which will strictly evaluate the RHS, and bind p and q to the components of the pair. But the pair itself is lazy (unless we also compile the Prelude with -XStrict; see "Modularity" below). So p and q may end up bound to undefined. See also "Recursive and polymorphic let bindings" below.
  • Case expressions. The patterns of a case expression get an implicit bang, unless disabled with ~. For example
      case x of (a,b) -> rhs
    
    is interpreted as
      case x of !(a,b) -> rhs
    
    Since the semantics of pattern matching in case expressions is strict, this usually has no effect whatsoever. But it does make a difference in the degenerate case of variables and newtypes. So
      case x of y -> rhs
    
    is lazy in Haskell, but with -XStrict is interpreted as
     case x of !y -> rhs
    
    which evalutes x. Similarly, if newtype Age = MkAge Int, then
     case x of MkAge i -> rhs
    
    is lazy in Haskell; but with -XStrict the added bang makes it strict.
  • Top level bindings are unaffected by -XStrict. For example:
     x = factorial 20
     (y,z) = if x > 10 then True else False
    
    Here x and the pattern binding (y,z) remain lazy. Reason: there is no good moment to force them, until first use.
  • Data types. When the user writes
    data T = C a
    
    we interpret it as if she had written
    data T = C !a
    
    Haskell doesn't allow for ~ patterns in data constructor definitions today. We'll add support for such patterns and have it give the old lazy behavior.
  • Newtypes. There is no effect on newtypes, which simply rename existing types. For example:
    newtype T = C a
    f (C x)  = rhs1
    g !(C x) = rhs2
    
    In ordinary Haskell , f is lazy in its argument and hence in x; and g is strict in its argument and hence also strict in x. With -XStrict, both become strict because f's argument gets an implict bang.

Modularity

The pragma only affects definitions in this module. Functions and data types imported from other modules are unaffected. For example, we won't evaluate the argument to Just before applying the constructor. Similarly we won't evaluate the first argument to Data.Map.findWithDefault before applying the function.

This is crucial to preserve correctness. Entities defined in other modules might rely on laziness for correctness (whether functional or performance).

Tuples, lists, Maybe, and all the other types from Prelude continue to have their existing, lazy, semantics.

Recursive and polymorphic let bindings

Consider a banged let-binding

  let !pat = rhs in body

Bang patterns in let bindings today (GHC 7.8.3 and earlier) behave as described in the user manual:

  • The binding cannot be recursive
  • The variables bound by the pattern always get monomorphic types
  • The complete pattern is matched before evaluation of body begins

The intent was that it is valid to desugar such a binding to

  case rhs of pat -> body

This currently applies even if the pattern is just a single variable, so that the case boils down to a seq.

Continuing with this rule would mean that -XStrict would not allow recursive or polymoprhic pattern bindings at all. So instead we propose the following revised specification for bang patterns in let bindings.

  • Static semantics. Exactly as in Haskell, unaffected by -XStrict. This is more permissive than the current rule for bang patterns in let bindings, because it supports bang-patterns for polymorphic and recursive bindings.
  • Dynamic semantics. Consider the rules in the box of Section 3.12 of the Haskell report. Replace these rules with the following ones, where v stands for a variable

    • (FORCE). Replace any binding !p = e with v = e; p = v and replace e0 with v `seq` e0, where v is fresh. This translation works fine if p is already a variable x, but can obviously be optimised by not introducing a fresh variable v.

    • (SPLIT). Replace any binding p = e, where p is not a variable, with v = e; x1 = case v of p -> x1; ...; xn = case v of p -> xn, where v is fresh and x1..xn are the bound variables of p. Again if e is a variable, you can optimised his by not introducing a fresh variable.

The result will be a (possibly) recursive set of bindings, binding only simple variables on the LHS. (One could go one step further, as in the Haskell Report and make the recursive bindings non-recursive using fix, but we do not do so in Core, and it only obfuscates matters, so we do not do so here.)

Here are some examples of how this translation works. The first expression of each sequence is Haskell source; the subsequent ones are Core.

Here is a simple non-recursive case.

let x :: Int     -- Non-recursive
    !x = factorial y
in body

 ===> (FORCE)
     let x = factorial y in x `seq` body

 ===> (inline seq)
     let x = factorial y in case x of x -> body

 ===> (inline x)
     case factorial y of x -> body

Same again, only with a pattern binding

let !(x,y) = if blob then (factorial p, factorial q) else (0,0)
in body

 ===> (FORCE)
     let v = if blob then (factorial p, factorial q) else (0,0)
         (x,y) = v
     in v `seq` body

 ===> (SPLIT)
     let v = if blob then (factorial p, factorial q) else (0,0)
         x = case v of (x,y) -> x
         y = case v of (x,y) -> y
     in v `seq` body

 ===> (inline seq, float x,y bindings inwards)
     let v = if blob then (factorial p, factorial q) else (0,0)
     in case v of v -> let x = case v of (x,y) -> x
                           y = case v of (x,y) -> y
                       in body

 ===> (fluff up v's pattern; this is a standard Core optimisation)
     let v = if blob then (factorial p, factorial q) else (0,0)
     in case v of v@(p,q) -> let x = case v of (x,y) -> x
                                 y = case v of (x,y) -> y
                             in body

 ===> (case of known constructor)
     let v = if blob then (factorial p, factorial q) else (0,0)
     in case v of v@(p,q) -> let x = p
                                 y = q
                             in body

 ===> (inline x,y)
     let v = if blob then (factorial p, factorial q) else (0,0)
     in case v of (p,q) -> body[p/x, q/y]

The final form is just what we want: a simple case expression.

Here is a recursive case

letrec xs :: [Int]  -- Recursive
       !xs = factorial y : xs
in body

 ===> (FORCE)
     letrec xs = factorial y : xs in xs `seq` body

 ===> (inline seq)
     letrec xs = factorial y : xs in case xs of xs -> body

 ===> (eliminate case of value)
     letrec xs = factorial y : xs in body

and a polymorphic one:

let f :: forall a. [a] -> [a]    -- Polymorphic
    !f = fst (reverse, True)
in body

 ===> (FORCE)
     let f = /\a. fst (reverse a, True) in f `seq` body
        -- Notice that the `seq` is added only in the translation to Core
        -- If we did it in Haskell source, thus
        --    let f = ... in f `seq` body
        -- then f's polymorphic type would get intantiated, so the Core
        -- translation woudl be
        --    let f = ... in f Any `seq` body

 ===> (inline seq, inline f)
     case (/\a. fst (reverse a, True)) of f -> body

When overloading is involved, the results might be slightly counter intuitive:

let f :: forall a. Eq a => a -> [a] -> Bool    -- Overloaded
    !f = fst (member, True)
in body

 ===> (FORCE)
     let f = /\a \(d::Eq a). fst (member, True) in f `seq` body

 ===> (inline seq, case of value)
     let f = /\a \(d::Eq a). fst (member, True) in body

Note that the bang has no effect at all in this case

Implementation

TODO Find all the places where we do special things for bang patterns and list them here.

Last modified 8 months ago Last modified on Jul 31, 2014 7:30:09 AM