Changes between Version 10 and Version 11 of DataParallel/Replicate


Ignore:
Timestamp:
Aug 7, 2011 8:24:22 AM (3 years ago)
Author:
chak
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DataParallel/Replicate

    v10 v11  
    88smvm m v = [: sumP [: x * (v !: i) | (i,x) <- row :] | row <- m :] 
    99}}} 
    10 Here the variable 'v' is constant in the array comprehensions and will be replicated while lifting the expression `v !: i`.   In other words, for every single element in a `row`, lifting implies the allocation of a separate copy of of the entire array `v` — and this only to perform a single indexing operation on that copy of `v`.  More precisely, in the lifted code, lifted indexing (which we usually denote by `(!^)` is applied to a nested array consisting of multiple copies of `v`; i.e., it is applied to the result of `replicateP (length row) v`.   
     10Here the variable 'v' is constant in the array comprehensions and will be replicated while lifting the expression `v !: i`.   In other words, for every single element in a `row`, lifting implies the allocation of a separate copy of of the entire array `v` — and this only to perform a single indexing operation on that copy of `v`.  More precisely, in the lifted code, lifted indexing (which we usually denote by `(!:^)` is applied to a nested array consisting of multiple copies of `v`; i.e., it is applied to the result of `replicateP (length row) v`.   
    1111 
    1212This is clearly wasted effort and space.  However, the situation is even worse in Ben's pathological example: 
     
    5353In practice, segments descriptors store more information than just the segment length.  They at least additionally store the start position of each segment in the data array.  In the conventional representation, an invariant is that the start positions are such that the segments don't overlap.  To represent arrays with repeated segments more efficiently, we propose to relax that invariant.  Specifically, all start positions of a contiguous sequence of repeated segments are the same; i.e., the segment data is stored only once per sequence of repeated segments. 
    5454 
     55Then, we have for `[:[:1, 2, 3:], [:1, 2, 3:], [:1, 2, 3:]:]`, 
     56{{{ 
     57start: [:0, 0, 0:] 
     58len:   [:3, 3, 3:] 
     59data:  [:1, 2, 3:]) 
     60}}} 
     61and for `[:[:1, 2:], [:1, 2:], [:3:], [:3:], [:3:]:]`, 
     62{{{ 
     63start: [:0, 0, 2, 2, 2:] 
     64len:   [:2, 2, 1, 1, 1:] 
     65data:  [:1, 2, 3:]) 
     66}}} 
     67 
     68This is merely a change in the array representation that does not affect vectorisation. 
     69 
     70== Operations on arrays with repeated segments == 
     71 
     72As multiple segments overlap in arrays with repeated segments, array consumers need to be adapted to work correctly in this situation. 
     73 
     74=== Lifted indexing === 
     75 
     76In the `smvm` example, a replicated array is consumed by lifted indexing to extract matching elements of the vector for all non-zero elements of the matrix.  Using just an length array as a segment descriptor without overlapping segments, lifted indexing might be implemented as follows: 
     77{{{ 
     78(as_len, as_data) !:^ is = bpermute ((prescan (+) 0 as_len) +^ is) as_data 
     79}}} 
     80 
     81With overlapping segments, we have 
     82{{{ 
     83(as_start, as_len, as_data) !:^ is = bpermute (as_start +^ is) as_data 
     84}}} 
     85In the case of `smvm`, where the first argument is produced by `replicateP (length row) v`, we have `as_start = replicate (length row) 0` and `as-data = v`.  In other words, lifted indexing draws from a single copy of `v`, which is what we wanted. 
     86 
    5587== Related work == 
    5688