Changes between Version 20 and Version 21 of DataParallel/Regular


Ignore:
Timestamp:
Jan 19, 2010 9:47:28 AM (6 years ago)
Author:
gckeller
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DataParallel/Regular

    v20 v21  
    296296such that they abstract over the exact dimensionality of their argument, and have the type
    297297{{{
    298 f::(A.Shape dim, U.Elt e, U.Elt e') => DArray (dim :*: Int ..... :*: Int) e ->  DArray (dim :*: Int :*: .... :*: Int) e'
     298f::(A.Shape dim, U.Elt e, U.Elt e') =>
     299  DArray (dim :*: Int ..... :*: Int) e ->  DArray (dim :*: Int :*: .... :*: Int) e'
    299300}}}
    300301and those functions can be trivially mapped since
     
    311312
    312313
    313 
     314== Example 1: Matrix multiplication ==
     315As a simple example, consider matrix-matrix multiplication. We can either implement it by directly manipulating the array
     316function, or use the operations provided by the DArray library. Let as start with the former, which is more fairly similar to
     317what we would write using loops over array indices:
     318{{{
     319mmMult1::
     320  DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double 
     321mmMult1 arr1@(DArray (() :*: m1 :*: n1) _) arr2@(DArray (() :*: m2 :*: n2) _) =
     322  mapFold (+) 0 arrDP
     323  where
     324    arrDP = DArray (():*: m1 :*: n2 :*:n1)
     325       (\(() :*: i :*: j :*: k) -> (index arr1 (() :*: i :*: k)) * (index arr2 (() :*: k :*: j)))
     326}}}
     327In the first step, we create the intermediate three dimensional array which contains the products of all
     328sums and rows, and in the second step, we collapse each of the rows to it's sum, to obtain the two dimensional
     329result matrix. It is important to note that the elements of `arrDP` are never all in memory (otherwise, the memory
     330consumption would be cubic), but each value is consumed immediately by `mapFold`.
     331
     332This implementation suffers from the same problem a corresponding C implementation would - since we access one
     333array row-major, the other column major, the locality is poor. Therefore, first transposing `arr2` and adjusting the
     334access will actually improve the performance by approximately 40%:
     335{{{
     336mmMult1::
     337  DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double 
     338mmMult1 arr1@(DArray (() :*: m1 :*: n1) _) arr2@(DArray (() :*: m2 :*: n2) _) =
     339  mapFold (+) 0 arrDP
     340  where
     341    arr2T = forceDArray $ transpose arr2
     342    arrDP = DArray (():*: m1 :*: n2 :*:n1)
     343       (\(() :*: i :*: j :*: k) -> (index arr1 (() :*: i :*: k)) * (index arr2T (() :*: j:*: k)))
     344}}}
     345However, we do need to force the actual creation of the transposed array, otherwise, the change would have no effect at all. We therefore
     346use `forceDArray`, which converts it into an array whose array function is a simple indexing operation (see description of `forceDArray` above). This means that the second version requires more memory, but this is offset by improving the locality for each of the multiplications.
     347
     348{{{
     349-- mmMult:: (Array.RepFun dim, Array.InitShape dim, Array.Shape dim) =>
     350--   DArray (dim :*: Int :*: Int)  Double -> DArray (dim :*: Int :*: Int)  Double -> DArray (dim :*: Int :*: Int)  Double 
     351mmMult::
     352   DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double -> DArray (() :*: Int :*: Int)  Double 
     353mmMult arr1@(DArray (sh :*: m1 :*: n1) fn1) arr2@(DArray (sh' :*: m2 :*: n2) fn2) =
     354  assert ((m1 == n2) && (sh == sh')) $
     355    mapFold (+) 0 (arr1Ext * arr2Ext)
     356--  'fold' doesn't fuse at the moment, so mapFold is significantly faster
     357--  fold (+) 0 $ zipWith (*) arr1Ext arr2Ext
     358  where
     359    arr2T   = forceDArray $ transpose arr2  -- forces evaluation of 'transpose'
     360    arr1Ext = replicate arr1 (Array.IndexAll (Array.IndexFixed m2 (Array.IndexAll Array.IndexNil)))
     361    arr2Ext = replicate arr2T
     362                 (Array.IndexAll (Array.IndexAll (Array.IndexFixed n1 Array.IndexNil)))
     363
     364}}}
    314365
    315366