Changes between Version 20 and Version 21 of DataParallel/Regular


Ignore:
Timestamp:
Jan 19, 2010 9:47:28 AM (4 years ago)
Author:
gckeller
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DataParallel/Regular

    v20 v21  
    296296such that they abstract over the exact dimensionality of their argument, and have the type 
    297297{{{ 
    298 f::(A.Shape dim, U.Elt e, U.Elt e') => DArray (dim :*: Int ..... :*: Int) e ->  DArray (dim :*: Int :*: .... :*: Int) e' 
     298f::(A.Shape dim, U.Elt e, U.Elt e') =>  
     299  DArray (dim :*: Int ..... :*: Int) e ->  DArray (dim :*: Int :*: .... :*: Int) e' 
    299300}}} 
    300301and those functions can be trivially mapped since 
     
    311312 
    312313 
    313  
     314== Example 1: Matrix multiplication == 
     315As a simple example, consider matrix-matrix multiplication. We can either implement it by directly manipulating the array 
     316function, or use the operations provided by the DArray library. Let as start with the former, which is more fairly similar to 
     317what we would write using loops over array indices: 
     318{{{ 
     319mmMult1::  
     320  DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double   
     321mmMult1 arr1@(DArray (() :*: m1 :*: n1) _) arr2@(DArray (() :*: m2 :*: n2) _) =  
     322  mapFold (+) 0 arrDP 
     323  where  
     324    arrDP = DArray (():*: m1 :*: n2 :*:n1)  
     325       (\(() :*: i :*: j :*: k) -> (index arr1 (() :*: i :*: k)) * (index arr2 (() :*: k :*: j))) 
     326}}} 
     327In the first step, we create the intermediate three dimensional array which contains the products of all  
     328sums and rows, and in the second step, we collapse each of the rows to it's sum, to obtain the two dimensional 
     329result matrix. It is important to note that the elements of `arrDP` are never all in memory (otherwise, the memory 
     330consumption would be cubic), but each value is consumed immediately by `mapFold`.  
     331 
     332This implementation suffers from the same problem a corresponding C implementation would - since we access one 
     333array row-major, the other column major, the locality is poor. Therefore, first transposing `arr2` and adjusting the 
     334access will actually improve the performance by approximately 40%:  
     335{{{ 
     336mmMult1::  
     337  DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double   
     338mmMult1 arr1@(DArray (() :*: m1 :*: n1) _) arr2@(DArray (() :*: m2 :*: n2) _) =  
     339  mapFold (+) 0 arrDP 
     340  where  
     341    arr2T = forceDArray $ transpose arr2 
     342    arrDP = DArray (():*: m1 :*: n2 :*:n1)  
     343       (\(() :*: i :*: j :*: k) -> (index arr1 (() :*: i :*: k)) * (index arr2T (() :*: j:*: k))) 
     344}}} 
     345However, we do need to force the actual creation of the transposed array, otherwise, the change would have no effect at all. We therefore  
     346use `forceDArray`, which converts it into an array whose array function is a simple indexing operation (see description of `forceDArray` above). This means that the second version requires more memory, but this is offset by improving the locality for each of the multiplications.  
     347 
     348{{{ 
     349-- mmMult:: (Array.RepFun dim, Array.InitShape dim, Array.Shape dim) =>  
     350--   DArray (dim :*: Int :*: Int)  Double -> DArray (dim :*: Int :*: Int)  Double -> DArray (dim :*: Int :*: Int)  Double   
     351mmMult:: 
     352   DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double   
     353mmMult arr1@(DArray (sh :*: m1 :*: n1) fn1) arr2@(DArray (sh' :*: m2 :*: n2) fn2) =  
     354  assert ((m1 == n2) && (sh == sh')) $  
     355    mapFold (+) 0 (arr1Ext * arr2Ext) 
     356--  'fold' doesn't fuse at the moment, so mapFold is significantly faster 
     357--  fold (+) 0 $ zipWith (*) arr1Ext arr2Ext 
     358  where 
     359    arr2T   = forceDArray $ transpose arr2  -- forces evaluation of 'transpose' 
     360    arr1Ext = replicate arr1 (Array.IndexAll (Array.IndexFixed m2 (Array.IndexAll Array.IndexNil))) 
     361    arr2Ext = replicate arr2T 
     362                 (Array.IndexAll (Array.IndexAll (Array.IndexFixed n1 Array.IndexNil))) 
     363 
     364}}} 
    314365 
    315366