Vectorisation for nested data parallelism

TODO: This material needs to be revised, and we need to come up with a plan for getting some programs to compile quickly. Also integrate the lifted case example.

We will try to implement full-blown vectorisation using an explicit closure representation on Core code after lambda lifting. The transformation performs closure conversion and vectorisation in one sweep. We represent scalar and array closures as follows:

data a :-> b = forall env. PA env =>  Clo env     (env -> a -> b) ([:env:] -> [:a:] -> [:b:])
data a :=> b = forall env. PA env => AClo [:env:] (env -> a -> b) ([:env:] -> [:a:] -> [:b:])

(#) :: (a :-> b) -> a -> b
(#) (Clo env f _) = f env

(##) :: (a :=> b) -> [:a:] -> [:b:]
(##) (AClo envs _ fs) = fs envs
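
To make the closure representation concrete, here is a minimal sketch that compiles with plain GHC. It rests on assumptions that are not part of the design above: ordinary lists stand in for the parallel arrays [:a:], the PA env constraint is dropped, and addOffset and addOffsets are hypothetical example closures.

```haskell
{-# LANGUAGE ExistentialQuantification, TypeOperators #-}

-- Assumption: plain lists stand in for parallel arrays [:a:].
type Arr a = [a]

-- Scalar closure: one environment, plus scalar and lifted code.
data a :-> b = forall env. Clo env (env -> a -> b) (Arr env -> Arr a -> Arr b)

-- Array closure: an array of environments, plus the same two pieces of code.
data a :=> b = forall env. AClo (Arr env) (env -> a -> b) (Arr env -> Arr a -> Arr b)

-- Scalar application uses the scalar code.
(#) :: (a :-> b) -> a -> b
(#) (Clo env f _) = f env

-- Lifted application uses the lifted code.
(##) :: (a :=> b) -> Arr a -> Arr b
(##) (AClo envs _ fs) = fs envs

-- Hypothetical example: a closure that adds a captured offset.
addOffset :: Int -> (Int :-> Int)
addOffset k = Clo k (+) (zipWith (+))

-- Hypothetical example: one captured offset per array element.
addOffsets :: [Int] -> (Int :=> Int)
addOffsets ks = AClo ks (+) (zipWith (+))
```

With these definitions, addOffset 10 # 5 evaluates to 15 and addOffsets [1, 2, 3] ## [10, 20, 30] evaluates to [11, 22, 33].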

It is important that both kinds of closures include scalar and lifted code, as we need to move between a :-> b and a :=> b in both directions due to the functions:

replicateP :: Int -> (a :-> b) -> (a :=> b)
replicateP n (Clo env f fs) = AClo (replicateP n env) f fs

(!:) :: (a :=> b) -> Int -> (a :-> b)
AClo envs f fs !: i = Clo (envs !: i) f fs

In other words, we move between the two types of closures simply by replicating and indexing into the environment.
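
The two conversions can be sketched as follows. Again, lists stand in for parallel arrays and the PA constraint is omitted; the names replicateClo, indexClo, applyS, applyP and scale are chosen for this sketch only, to avoid clashing with the array operations of the same name.

```haskell
{-# LANGUAGE ExistentialQuantification, TypeOperators #-}

-- Assumption: plain lists stand in for parallel arrays [:a:].
data a :-> b = forall env. Clo env    (env -> a -> b) ([env] -> [a] -> [b])
data a :=> b = forall env. AClo [env] (env -> a -> b) ([env] -> [a] -> [b])

-- Replicating a closure replicates only its environment; the code is shared.
replicateClo :: Int -> (a :-> b) -> (a :=> b)
replicateClo n (Clo env f fs) = AClo (replicate n env) f fs

-- Indexing an array closure indexes only its environments.
indexClo :: (a :=> b) -> Int -> (a :-> b)
indexClo (AClo envs f fs) i = Clo (envs !! i) f fs

-- Scalar and lifted application, as in (#) and (##).
applyS :: (a :-> b) -> a -> b
applyS (Clo env f _) = f env

applyP :: (a :=> b) -> [a] -> [b]
applyP (AClo envs _ fs) = fs envs

-- Hypothetical example closure: multiply by a captured factor.
scale :: Int -> (Int :-> Int)
scale k = Clo k (*) (zipWith (*))
```

For instance, applyP (replicateClo 3 (scale 2)) [1, 2, 3] gives [2, 4, 6], and indexing that array closure at any position yields a closure that behaves like scale 2 again.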

We do not have any explicit type transformations. These are all encoded using associated types of the parallel array type class PA:

class PA e where
  data [:e:]
  type Vect e
  (!:) :: [:e:] -> Int -> e
  -- and so on

instance PA () where
  data [:():]  = PAUnit Int
  type Vect () = ()
  PAUnit len !: i | i < len   = ()
                  | otherwise = error "..."

instance PA Int where
  data [:Int:] = PAInt Int [!Int!]
  type Vect Int = Int
  PAInt l as !: i = as `indexU` i

instance (PA a, PA b) => PA (a, b) where
  data [:(a, b):]  = PAProd [:a:] [:b:]
  type Vect (a, b) = (Vect a, Vect b)
  PAProd as bs !: i = (as !: i, bs !: i)

instance (PA a, PA b) => PA (Either a b) where
  data [:Either a b:]    = PASum [:Bool:] [:Int:] [:a:] [:b:]
  type Vect (Either a b) = Either (Vect a) (Vect b)
  PASum sels idx as bs !: i = if sels!:i then as!:(idx!:i) else bs!:(idx!:i)

instance PA a => PA [:a:] where
  data [:[:a:]:] = PAArr [:Int:] [:a:]
  type Vect [:a:] = [:[:a:]:]
  PAArr segd as !: i = sliceP from size as
    where
      segd' = scanlP (+) 0 segd
      from  = segd'!:i
      size  = segd !:i
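
The indexing scheme for nested arrays can be tried out on ordinary lists. In this sketch the type Nested, the function indexNested and the value example are hypothetical names; take and drop play the role of sliceP, and scanl that of scanlP.

```haskell
-- Flat representation of a nested array: segment lengths plus flat data.
data Nested a = Nested [Int] [a]

-- Extract the i-th subarray by slicing the flat data.
indexNested :: Nested a -> Int -> [a]
indexNested (Nested segd xs) i = take size (drop from xs)
  where
    starts = scanl (+) 0 segd   -- starting offset of every segment
    from   = starts !! i
    size   = segd !! i

-- Represents [[10, 20], [], [30, 40, 50]].
example :: Nested Int
example = Nested [2, 0, 3] [10, 20, 30, 40, 50]
```

Here the offsets are [0, 2, 2, 5], so indexNested example 2 slices three elements starting at offset 2 and returns [30, 40, 50].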

instance (PA a, PA b) => PA (a -> b) where
  data [: a -> b :] = PAFun [:Vect a :-> Vect b:]
  type Vect (a -> b) = Vect a :-> Vect b
  PAFun as !: i = as !: i

instance (PA a, PA b) => PA (a :-> b) where
  data [: a :-> b :] = PAClo (a :=> b)
  type Vect (a :-> b) = a :-> b  -- shouldn't happen, right?
  PAClo (AClo envs f fs) !: i = Clo (envs!:i) f fs
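
The first few instances can be approximated in GHC with an associated data family. This is only a sketch under several assumptions: the special [:e:] syntax is replaced by a family named PArr, the operator (!:) by a method named indexP, the unboxed array [!Int!] by a list, the stored length of PAInt is left out, and the Vect type is omitted entirely.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- PArr e plays the role of [:e:]; each instance picks its own representation.
class PA e where
  data PArr e
  indexP :: PArr e -> Int -> e

-- An array of units only needs to remember its length.
instance PA () where
  data PArr () = PAUnit Int
  indexP (PAUnit len) i
    | i >= 0 && i < len = ()
    | otherwise         = error "indexP: index out of bounds"

-- Assumption: a list stands in for the unboxed array of Ints.
instance PA Int where
  data PArr Int = PAInt [Int]
  indexP (PAInt xs) i = xs !! i

-- An array of pairs is stored as a pair of arrays.
instance (PA a, PA b) => PA (a, b) where
  data PArr (a, b) = PAProd (PArr a) (PArr b)
  indexP (PAProd xs ys) i = (indexP xs i, indexP ys i)
```

Indexing an array of pairs then distributes over the components: indexP (PAProd (PAInt [1, 2, 3]) (PAInt [4, 5, 6])) 1 yields (2, 5).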

Mixing Vectorised and Scalar Code

We have two types of modules: (a) modules compiled as before, which we call 'scalar modules', and (b) 'vectorised modules'. Scalar modules export the same code as before. Vectorised modules export additional identifiers.

  • For every variable f :: t, we have in addition
    f^p :: t^v
    f^p = V[[e]]
    
    The code for f is no longer the original scalar code. Instead, it is defined as
    f :: t
    f = unvect f^p
    
  • For every data type T, we have in addition T^v.
  • For every function M.f :: a -> b imported from a scalar module M, we generate and use the following definition instead:
    f :: (a -> b)^v
    f = vect M.f
    

The functions vect and unvect are defined in the same type classes where t^v is defined as an associated type. !!TODO: Try to define these two functions, to be sure we can actually do it.

Moreover, we would like to have toplevel declarations of the form derive PA (T) that create a suitable PA instance for a previously defined (and possibly imported) data type T.

Various Ideas to Avoid Full Blown Vectorisation

We discussed some approaches that would lead to a certain degree of vectorisation but avoid dealing with issues such as arrays of functions.

  • We could have rewrite rules as follows (for a vectorised function):
    mapP f           -> f^
    mapP f^          -> inject f^
    mapP (inject f^) -> inject (mapP f^)
    
    where inject is the flatten/partition combination.
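
One possible reading of inject, sketched on ordinary lists: flatten the nested array, apply the lifted function to the flat data, and partition the result using the original segment lengths. The names partitionBy and incL are hypothetical, and the sketch assumes that the lifted function preserves the length of its argument.

```haskell
-- Assumption: plain lists stand in for parallel arrays [:a:].

-- Split a flat list into segments of the given lengths.
partitionBy :: [Int] -> [a] -> [[a]]
partitionBy []       _  = []
partitionBy (n : ns) xs = seg : partitionBy ns rest
  where
    (seg, rest) = splitAt n xs

-- Lift an already lifted, length-preserving function by one more level:
-- flatten, apply, then restore the original segment structure.
inject :: ([a] -> [b]) -> [[a]] -> [[b]]
inject fL xss = partitionBy (map length xss) (fL (concat xss))

-- Hypothetical lifted function: increment every element.
incL :: [Int] -> [Int]
incL = map (+ 1)
```

For example, inject incL [[1, 2], [], [3]] gives [[2, 3], [], [4]], the same result as mapping incL over each subarray, but using only a single application of incL to flat data.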

!!TODO: What else was there???

Last modified on Mar 19, 2007 7:25:20 AM