# Demand analyser in GHC

This page explains basics of the so-called demand analysis in GHC, comprising strictness and absence analyses. Meanings of demand signatures are explained and examples are provided. Also, components of the compiler possibly affected by the results of the demand analysis are listed with explanations provided.

- The demand-analyser draft paper is as yet unpublished, but gives the most accurate overview of the way GHC's demand analyser works.

## Demand signatures

Let us compile the following program with `-O2 -ddump-stranal`

flags:

f c p = case p of (a, b) -> if c then (a, True) else (True, False)

The resulting demand signature for function `f`

will be the following one:

Str=DmdType <S,U><S,U(UA)>m

This should be read as "`f`

puts stricts demands on both its arguments (hence, `S`

); `f`

might use its first and second arguments. but in the second argument (which is a product), the second component is ignored". The suffix `m`

in the demand signature indicates that the function returns **CPR**, a constructed product result (for more information on CPR see the JFP paper Constructed Product Result Analysis for Haskell).

Current implementation of demand analysis in Haskell performs annotation of all binders with demands, put on them in the context of their use. For functions, it is assumed, that the result of the function is used strictly. The analysis infers strictness and usage information separately, as two components of a cartesian product domain. The same analysis also performs inference CPR and bottoming properties for functions, which can be read from the suffix of the signature. Demand signatures of inner definitions may also include *demand environments* that indicate demands, which a closure puts to its free variables, once strictly used, e.g. the signature

Str=DmdType <L,U> {skY-><S,U>}

indicates that the function has one parameter, which is used lazily (hence `<L,U>`

), however, when its result is used strictly, the free variable `skY`

in its body is also used strictly.

### Demand descriptions

Strictness demands

`B`

-- a*hyperstrict*demand. The expression`e`

puts this demand on its argument`x`

if every evaluation of`e`

is guaranteed to diverge, regardless of the value of the argument. We call this demand*hyperstrict*because it is safe to evaluate`x`

to arbitrary depth before evaluating`e`

. This demand is polymorphic with respect to function calls and can be seen as`B = C(B) = C(C(B)) = ...`

for an arbitrary depth.

`L`

-- a*lazy*demand. If an expression`e`

places demand`L`

on a variable`x`

, we can deduce nothing about how`e`

uses`x`

.`L`

is the completely uninformative demand, the top element of the lattice.

`S`

-- a*head-strict*demand. If`e`

places demand`S`

on`x`

then`e`

evaluates`x`

to at least head-normal form; that is, to the outermost constructor of`x`

. This demand is typically placed by the`seq`

function on its first argument. The demand`S(L ... L)`

places a lazy demand on all the components, and so is equivalent to`S`

; hence the identity`S = S(L ... L)`

. Another identity is for functions, which states that`S = C(L)`

. Indeed, if a function is certainly called, it is evaluated at lest up to the head normal form, i.e.,*strictly*. However, its result may be used lazily.

`S(s1 ... sn)`

-- a structured strictness demand on a product. It is at least head-strict, and perhaps more.

`C(s)`

-- a*call-demand*, when placed on a binder`x`

, indicates that the value is a function, which is always called and its result is used according to the demand`s`

.

Absence/usage demands

`A`

-- when placed on a binder`x`

it means that`x`

is definitely unused.

`U`

-- the value is used on some execution path. This demand is a top of usage domain.

`H`

-- a*head-used*demand. Indicates that a product value is used itself, however its components are certainly ignored. This demand is typically placed by the`seq`

function on its first argument. This demand is polymorphic with respect to products and functions. For a product, the head-used demand is expanded as`U(A, ..., A)`

and for functions it can be read as`C(A)`

, as the function is called (i.e., evaluated to at least a head-normal form), but its result is ignored.

`U(u1 ... un)`

-- a structured usage demand on a product. It is at least head-used, and perhaps more.

`C(u)`

-- a*call-demand*for usage information. When put on a binder`x`

, indicates that`x`

in all executions paths where`x`

is used, it is*applied*to some argument, and the result of the application is used with a demand`u`

.

Additional information (demand signature suffix)

`m`

-- the function returns a constructed product result.

`b`

-- the function definitely diverges.

`x`

-- the function catches exceptions. For instance, consider`catch undefined g`

: naturally,`catch`

is strict in its first argument and therefore one would usually think that this expression would bottom. However,`catch`

has special semantics: it catches exceptions. Consequently we give the first argument a demand of`C(L)x`

to indicate that an application of`catch`

to bottom can't be assumed to be itself bottom.

## Worker-Wrapper split

Demand analysis in GHC drives the *worker-wrapper transformation*, which exposes specialised calling conventions to the rest of the compiler. In particular, the worker-wrapper transformation implements the unboxing optimisation.

The worker-wrapper transformation splits each
function `f`

into a *wrapper*, with the
ordinary calling convention, and a *worker*, with a specialised
calling convention. The wrapper serves as an impedance-matcher to the
worker; it simply calls the worker using the specialised calling convention.
The transformation can be expressed directly in GHC's intermediate language.
Suppose that `f`

is defined thus:

f :: (Int,Int) -> Int f p = <rhs>

and that we know that `f`

is strict in its argument (the pair, that is),
and uses its components.
What worker-wrapper split shall we make? Here is one
possibility:

f :: (Int,Int) -> Int f p = case p of (a,b) -> $wf a b $wf :: Int -> Int -> Int $wf a b = let p = (a,b) in <rhs>

Now the wrapper, `f`

, can be inlined at every call site, so that
the caller evaluates `p`

, passing only the components to the worker
`$wf`

, thereby implementing the unboxing transformation.

But what if `f`

did not use `a`

, or `b`

? Then it would be silly to
pass them to the worker `$wf`

. Hence the need for absence
analysis. Suppose, then, that we know that `b`

is not needed. Then
we can transform to:

f :: (Int,Int) -> Int f p = case p of (a,b) -> $wf a $wf :: Int -> Int $wf a = let p = (a,error "abs") in <rhs>

Since `b`

is not needed, we can avoid passing it from the wrapper to
the worker; while in the worker, we can use `error "abs"`

instead of
`b`

.

In short, the worker-wrapper transformation allows the knowledge gained from strictness and absence analysis to be exposed to the rest of the compiler simply by performing a local transformation on the function definition. Then ordinary inlining and case elimination will do the rest, transformations the compiler does anyway.

## Relevant compiler parts

Multiple parts of GHC are sensitive to changes in the nature of demand signatures and results of the demand analysis, which might cause unexpected errors when hacking into demands. This list enumerates the parts of the compiler that are sensitive to demand, with brief summaries of how so.

## Instrumentation

For the Journal version of the demand analysis paper we created some instrumentation

- to measure how often a thunk is entered (to see if the update code was useful), and also
- to find out why a thunk is expected to be entered multiple times.

The code adds significant complexity to the demand analyser and the code generator, so we decided not to merge it into master (not even hidden behind flags), but should it ever have to be resurrected, it can be found in the branch `wip/T10613`

. View the full diff (at least as long as the link is valid, as it hard-codes the base commit).