wiki:DataParallel/Replicate

Version 2 (modified by chak, 3 years ago) (diff)

--

Preventing space blow-up due to replicate

The problem

The vectorisation transformation lifts scalar computations into vector space. In the course of this lifting, scalar constants are duplicated to fill an array, using the function 'replicateP'. Array computations are lifted in a similar manner, which leads to array constants being replicated to form arrays of arrays, which are represented as a segmented arrays. A simple example is our 'smvm' example code:

smvm :: [:[: (Int, Double) :]:] -> [:Double:] -> [:Double:]
smvm m v = [: sumP [: x * (v !: i) | (i,x) <- row :] | row <- m :]

Here the variable 'v' is constant in the array comprehensions and will be replicated while lifting the expression v !: i. In other words, for every single element in a row, lifting implies the allocation of a separate copy of of the entire array v — and this only to perform a single indexing operation on that copy of v. More precisely, in the lifted code, lifted indexing (which we usually denote by (!^) is applied to a nested array consisting of multiple copies of v; i.e., it is applied to the result of replicateP (length row) v.

This is clearly wasted effort and space. However, the situation is even worse in Ben's pathological example:

treeLookup :: [:Int:] -> [:Int:] -> [:Int:]
treeLookup table xx
  | lengthP xx == 1
  = [: table !: (xx !: 0) :]
  | otherwise
  = let len     = lengthP xx
          half    = len `div` 2
          s1      = sliceP 0    half xx
          s2      = sliceP half half  xx           
      in concatP (mapP (treeLookup table) [: s1, s2 :])

Here table is constant in mapP (treeLookup table) [: s1, s2 :]; hence, the entire table gets duplicated on each level of the recursion, leading to space consumption that is exponential in the depth of the recursion.