Changes between Version 32 and Version 33 of DataParallel/Replicate


Ignore:
Timestamp:
Aug 16, 2011 12:32:35 PM (3 years ago)
Author:
chak
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DataParallel/Replicate

    v32 v33  
    8585=== Collapse repeated segments === 
    8686 
    87 In practice, segments descriptors store more information than just the segment length.  They at least additionally store the start position of each segment in the data array.  In the conventional representation, an invariant is that the start positions are such that the segments don't overlap.  To represent arrays with repeated segments more efficiently, we propose to relax that invariant.  Specifically, all start positions of a contiguous sequence of repeated segments are the same; i.e., the segment data is stored only once per sequence of repeated segments. 
     87In practice, segments descriptors store more information than just the segment length.  They at least additionally store the start position of each segment in the data array.  In the conventional representation, an invariant is that the start positions are such that the segments don't overlap.  To represent arrays with repeated segments more efficiently, we might relax that invariant.  Specifically, all start positions of a contiguous sequence of repeated segments may be the same; i.e., the segment data is stored only once per sequence of repeated segments. 
    8888 
    8989Then, we have for `[:[:1, 2, 3:], [:1, 2, 3:], [:1, 2, 3:]:]`, 
     
    102102This is merely a change in the array representation that does not affect vectorisation. 
    103103 
    104 === Segment descriptor representation === 
     104'''TODO:''' I am not sure whether we want to talk about this representation at all.  Maybe just go to the one with virtual segments straight away.  I suspect the latter will be less confusing in the paper. 
     105 
     106=== Segment descriptor representation with virtual segments === 
    105107 
    106108Instead, of repeating the start indices in a segment descriptor, we alternatively might want to represent a segmented array with repeated segments by distinguishing its ''physical'' from its ''logical'' (or ''virtual'') representation.  Specifically, instead of representing `[:[:1, 2, 3:], [:1, 2, 3:], [:1, 2, 3:]:]` as 
     
    132134}}} 
    133135 
    134 == Operations on arrays with repeated segments == 
    135  
    136 As multiple segments overlap in arrays with repeated segments, array consumers need to be adapted to work correctly in this situation. 
     136We choose the representation with virtual segments over the one with repeated start indicies for the following reasons: 
     137 * `packP` on a nested array that arose from replication only needs to operate on the virtual segments. 
     138 * Clean separation between the physical representation and the logical. 
     139 * Probably easier to subsequentyly move to Roman's proposal that moves replication information to the top of the representation. 
     140 
     141== Operations on arrays with virtual segments == 
     142 
     143Array consumers need to be adapted to work correctly on virtual segments and we need to be careful to make sure that although index space transformations introduced by vectorisation, such as `packP`, operate with a work complexity in proportion to the number of virtual segments (and not the size of the virtual number of elements).  Moreover, other consumers, such as folds, must operate in work complexity in proportion to the physical size of the segmented array (and not in proportion to the virtual size.) 
    137144 
    138145=== Lifted indexing === 
    139146 
    140 In the `smvm` example, a replicated array is consumed by lifted indexing to extract matching elements of the vector for all non-zero elements of the matrix.  Using just an length array as a segment descriptor without overlapping segments, lifted indexing might be implemented as follows: 
     147In the `smvm` example, a replicated array is consumed by lifted indexing to extract matching elements of the vector for all non-zero elements of the matrix.  Using just an length array as a segment descriptor without virtual segments, lifted indexing might be implemented as follows: 
    141148{{{ 
    142149(as_len, as_data) !:^ is = bpermutePA as_data ((prescanPA (+) 0 as_len) +^ is)