Version 27 (modified by chak, 4 years ago)

# The VECTORISE pragma

The vectoriser needs to know about all types and functions whose vectorised variants are directly implemented by the DPH library (instead of generated by the vectoriser), and it needs to know what the vectorised versions are. That is the purpose of the `VECTORISE` pragma (which comes in a number of flavours).

## The basic VECTORISE pragma for values

Given a function `f`, the vectoriser generates a vectorised version `f_v`, which comprises the original, scalar version of the function and a second version lifted into array space. The lifted version operates on arrays of inputs and produces arrays of results in one parallel computation. The original function name is then rebound to the scalar version contained in `f_v`. This scalar version differs from the original in that it uses vectorised versions for any embedded parallel array computations.
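The pairing of a scalar and a lifted version can be pictured with a small toy model. The sketch below is an illustration only: plain lists stand in for parallel arrays, and the record `Vectorised` with its fields `scalarPart` and `liftedPart`, as well as the name `fOriginal`, are hypothetical names invented for this example, not part of the DPH library or of what the vectoriser emits.

```haskell
-- Toy model only: plain lists stand in for DPH's parallel arrays, and
-- 'Vectorised' is a hypothetical record, not a type from the DPH library.
data Vectorised a b = Vectorised
  { scalarPart :: a -> b      -- works on one element at a time
  , liftedPart :: [a] -> [b]  -- works on a whole array of inputs at once
  }

-- The function as the programmer wrote it.
fOriginal :: Int -> Int
fOriginal x = 2 * x + 1

-- A hand-written counterpart of the f_v that the vectoriser would generate.
-- The lifted part is phrased as whole-array operations rather than as a
-- per-element application of the scalar code.
f_v :: Vectorised Int Int
f_v = Vectorised
  { scalarPart = \x -> 2 * x + 1
  , liftedPart = \xs -> map (+ 1) (map (2 *) xs)
  }

-- The original name is rebound to the scalar part of f_v.
f :: Int -> Int
f = scalarPart f_v
```

In this model, `liftedPart f_v` applied to an array yields the same results as mapping `fOriginal` over it, which is the property the lifted version is meant to preserve.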

However, if a variable `f` is accompanied by a pragma of the form

{-# VECTORISE f = e #-}

then the vectoriser defines `f_v = e` and refrains from rebinding `f`. This implies that for `f :: t`, the type of `e` is `t` vectorised; in particular, the type of `e` uses the array closure type `(:->)` instead of the vanilla function space `(->)`. The vectoriser checks that `e` has the appropriate type.

This pragma can also be used for imported functions `f`. In this case, `f_v` and a suitable vectorisation mapping of `f` to `f_v` are exported implicitly, just like `RULES` applied to imported identifiers. By vectorising imported functions, we can vectorise functions of modules that have not been compiled with `-fvectorise`. This is crucial to using the standard `Prelude` in vectorised code.

**IMPLEMENTATION RESTRICTION:** Currently the right-hand side of the equation (i.e., `e`) may only be a simple identifier **and** it must be at the correct type instance. More precisely, the Core type of the right-hand side must be identical to the vectorised version of `t`.
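As a rough illustration of the type requirement, the following sketch pairs a scalar function of type `Int -> Bool` with a hand-written variant of type `Int :-> Bool`. The definition of `(:->)` here is a toy stand-in (lists model parallel arrays), and the names `Clo`, `isEven` and `isEven_v` are assumptions made for this example. The pragma itself is shown only in a comment, since it is meaningful only in a module compiled with the vectoriser.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Toy stand-in for the array closure type; lists model parallel arrays.
-- This is not the DPH library's actual definition of (:->).
infixr 0 :->
data a :-> b = Clo (a -> b) ([a] -> [b])

-- Scalar function with type t = Int -> Bool.
isEven :: Int -> Bool
isEven n = n `mod` 2 == 0

-- Hand-written variant at the vectorised type: (->) is replaced by (:->).
isEven_v :: Int :-> Bool
isEven_v = Clo isEven (map isEven)

-- In a module compiled with the vectoriser, the two would be tied together by
--   {-# VECTORISE isEven = isEven_v #-}
-- where the right-hand side is a plain identifier of exactly the
-- vectorised type, as the implementation restriction demands.
```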

## The NOVECTORISE pragma for values

If a variable `f` is accompanied by a pragma

{-# NOVECTORISE f #-}

then it is ignored by the vectoriser — i.e., no function `f_v` is generated and `f` is left untouched.

This pragma can only be used for bindings in the current module (exactly like an `INLINE` pragma).

**Caveat:** If `f`'s definition contains bindings that are being floated to the toplevel, those bindings will still be vectorised.

## The VECTORISE SCALAR pragma for functions

Functions that contain no array computations, especially if they are cheap (such as `(+)`), should not be vectorised, but applied by simply mapping them over an array. This could be achieved by using the `VECTORISE` pragma with an appropriate right-hand side, but that leads to repetitive code that we would rather have the compiler generate.

If a unary function `f` is accompanied by a pragma

{-# VECTORISE SCALAR f #-}

then the vectoriser generates

f_v = closure1 f (scalar_map f)

and keeps `f` unchanged.

For a binary function, it generates

f_v = closure2 f (scalar_zipWith f)

for a ternary function, it generates

f_v = closure3 f (scalar_zipWith3 f)

and so on. (The variable `f` must have a proper function type.)

This pragma can also be used on imported functions `f`, in the same manner as the plain `VECTORISE` pragma.
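The generated definitions above can be mimicked in a toy model to see how the pieces fit together. In the sketch below, lists stand in for parallel arrays and every definition is a simplified stand-in that merely borrows the names `closure1`, `closure2`, `scalar_map` and `scalar_zipWith` from the text; the real library functions have different types and constraints. The helpers `applyS` and `applyL` and the example functions `negateInt` and `plusInt` are hypothetical.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Toy model: lists stand in for parallel arrays. Every definition below is
-- a simplified stand-in that only borrows the names used in the text.
infixr 0 :->
data a :-> b = Clo (a -> b) ([a] -> [b])

-- Scalar and lifted application of a toy closure (hypothetical helpers).
applyS :: (a :-> b) -> a -> b
applyS (Clo s _) = s

applyL :: (a :-> b) -> [a] -> [b]
applyL (Clo _ l) = l

scalar_map :: (a -> b) -> [a] -> [b]
scalar_map = map

scalar_zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
scalar_zipWith = zipWith

closure1 :: (a -> b) -> ([a] -> [b]) -> (a :-> b)
closure1 = Clo

closure2 :: (a -> b -> c) -> ([a] -> [b] -> [c]) -> (a :-> b :-> c)
closure2 s l = Clo partial (map partial)
  where
    -- Closure obtained once the first argument x has been supplied; its
    -- lifted part replicates x to match the length of the second array.
    partial x = Clo (s x) (\ys -> l (replicate (length ys) x) ys)

-- What {-# VECTORISE SCALAR negateInt #-} would correspond to in this model.
negateInt :: Int -> Int
negateInt = negate

negateInt_v :: Int :-> Int
negateInt_v = closure1 negateInt (scalar_map negateInt)

-- What {-# VECTORISE SCALAR plusInt #-} would correspond to in this model.
plusInt :: Int -> Int -> Int
plusInt = (+)

plusInt_v :: Int :-> Int :-> Int
plusInt_v = closure2 plusInt (scalar_zipWith plusInt)
```

The point of the pragma is that the two `_v` definitions are pure boilerplate: they follow mechanically from the arity of the scalar function, so the compiler can produce them.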

## The basic VECTORISE pragma for type constructors

### Without right-hand side

For a type constructor `T`, the pragma

{-# VECTORISE type T #-}

indicates that the type `T` should be vectorised and that it embeds no parallel arrays. This is the same situation as when the vectoriser automatically decides to vectorise a type for which no special vectorised representation needs to be generated, because the type embeds no arrays. The purpose of this pragma is to enable the vectorisation of imported types from modules that were not compiled with vectorisation enabled.

The type constructor `T`, together with its data constructors `Cn`, may then be used in vectorised code, where `T` and the `Cn` represent themselves. An example is the treatment of `Bool`: `Bool`, together with `False` and `True`, may appear in vectorised code, and they remain unchanged by vectorisation. (There is no need for a special representation as the values cannot embed any arrays.)

The type constructor `T` must be in scope, but it may be imported. `PData` and `PRepr` instances are automatically generated by the vectoriser.

**ALTERNATIVE:** This pragma simply means to treat an imported tycon as if it were defined in this module (so that it is automatically vectorised as usual). This is what I just implemented.

**OPEN QUESTION:**

- Do we need to be able to specify that an imported type embedding arrays should be vectorised including the generation of a specialised right-hand side?

### With right-hand side

{-# VECTORISE type T = ty #-}

**TODO**

- This isn't fully implemented yet. (Implemented up to and including desugaring and being put into `ModGuts`, but not used in the vectoriser.)

## The VECTORISE SCALAR pragma for type constructors

For a type constructor `T`, the pragma

{-# VECTORISE SCALAR type T #-}

indicates that the type is scalar; i.e., it has no embedded arrays and its constructors can *only* be used in scalar code. Note that the type cannot be parameterised (as we would not be able to rule out that a type parameter is instantiated with an array type at a usage site).

Due to this pragma declaration, `T` may be used in vectorised code, where `T` represents itself. However, the representation of `T` is opaque in vectorised code. An example is the treatment of `Int`. `Int`s can be used in vectorised code and remain unchanged by vectorisation. However, the representation of `Int` by the `I#` data constructor wrapping an `Int#` is not exposed in vectorised code. Instead, computations involving the representation need to be confined to scalar code.

The type constructor `T` must be in scope, but it may be imported. The `PData` and `PRepr` instances for `T` need to be manually defined. (For types that the vectoriser automatically determines not to need a vectorised version, the `PData` and `PRepr` instances are still generated automatically.)
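To give a feel for what a manually defined array representation involves, here is a deliberately simplified sketch. The data family `ToyPData` and the class `ToyPA` are hypothetical stand-ins invented for this example; the library's actual `PData` and `PRepr` machinery has more structure and more methods. The type `Celsius` and the function `warmer` are likewise assumptions, and lists again model parallel arrays.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical, much simplified stand-in for an array-representation
-- family: it maps an element type to the representation of an array of
-- such elements. It is not the DPH library's PData.
data family ToyPData a

-- Hypothetical conversion class standing in for the library's array classes.
class ToyPA a where
  fromListToy :: [a] -> ToyPData a
  toListToy   :: ToyPData a -> [a]

-- A scalar type, as it might be marked with
--   {-# VECTORISE SCALAR type Celsius #-}
newtype Celsius = Celsius Double
  deriving (Eq, Show)

-- Hand-written array representation: a flat list of unwrapped Doubles.
newtype instance ToyPData Celsius = PCelsius [Double]

instance ToyPA Celsius where
  fromListToy cs = PCelsius [d | Celsius d <- cs]
  toListToy (PCelsius ds) = map Celsius ds

-- Scalar code: the only kind of code that may look inside the constructor.
warmer :: Celsius -> Celsius
warmer (Celsius d) = Celsius (d + 1)
```

In this model, vectorised code would pass `Celsius` values and arrays of them around without ever pattern matching on the `Celsius` constructor, leaving that to scalar functions such as `warmer`.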

NB: The crucial difference between `{-# VECTORISE type T1 #-}` and `{-# VECTORISE SCALAR type T2 #-}` is that the *representation* (i.e., the constructors) of the latter can only be used in scalar code. However, the representation of both `T1` and `T2` does not get vectorised — so, both types are suitable for code that does not get vectorised due to vectorisation avoidance.

**TODO**

- For type constructors identified with this pragma, can we generate an `instance` of the `Scalar` type class automatically (instead of relying on it being in the library)?

## Cross-module functionality

The various `VECTORISE` pragmas can be applied to imported identifiers (both variables and types). The resulting vectorisation mappings and the vectorised version of the identifier will be implicitly exported, much as is the case for `RULES` defined on imported identifiers.