DPH Matrix product memory usage
This report follows up on the Haskell-cafe post "DPH matrix product". I'm filing it here so the developers can decide whether it is a bug and what priority it should have.
With what I believe is a standard implementation of matrix product in DPH, I see excessive use of system memory. At run time, on matrices of size 300*300 the program does finish (although it is very slow), but on 600*600 it consumes GBs of RAM until the process is aborted.
This is the system information:
- Ubuntu 12.04 32-bit
- Intel® Core™2 Duo CPU T5270 @ 1.40GHz × 2
- 2.9 GiB RAM
GHC version:
- GHC 7.4.1
DPH libraries:
- dph-base-0.6.1.1
- (dph-lifted-base-0.6.1.1)
- (dph-lifted-vseg-0.6.1.2)
- (dph-prim-interface-0.6.1.1)
- (dph-prim-par-0.6.1.1)
- (dph-prim-seq-0.6.1.1)
Compilation flags:
I'm using two combinations of flags, taken from different sources. The results are identical in both cases:
- From https://github.com/ghc/packages-dph: -rtsopts -threaded -fllvm -optlo-O3 -Odph -fcpr-off -fno-liberate-case -package dph-lifted-vseg
- From dph-examples: -rtsopts -threaded -fllvm -Odph -package dph-lifted-vseg -fcpr-off -fno-liberate-case -fsimpl-tick-factor=1000
Execution flags:
+RTS -N
Tests:
- Computing the product of two 400*400 matrices takes 6.037993 seconds.
- Computing the product of two 600*600 matrices yields "out of memory (requested 1728053248 bytes)".
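A note on the requested size, offered as a guess rather than a diagnosis: 1728053248 bytes is very close to 600^3 * 8 bytes, which is what a single flat array holding every elementwise product as a Double would need. The sketch below only does that arithmetic. The function names are mine, and the idea that all n^3 products are materialised at once is an assumption that has not been confirmed against the DPH internals.

```haskell
module Main where

-- Size of one unboxed Double in bytes.
bytesPerDouble :: Integer
bytesPerDouble = 8

-- Bytes needed for one n*n matrix of Doubles (input or result).
matrixBytes :: Integer -> Integer
matrixBytes n = n * n * bytesPerDouble

-- Bytes needed if all n*n*n elementwise products were held in one
-- flat array at the same time (ASSUMPTION, not verified in DPH).
flatProductBytes :: Integer -> Integer
flatProductBytes n = n * n * n * bytesPerDouble

main :: IO ()
main = do
  let n         = 600
      requested = 1728053248  -- figure taken from the error message
  putStrLn ("one matrix:               " ++ show (matrixBytes n))
  putStrLn ("all n^3 products:         " ++ show (flatProductBytes n))
  putStrLn ("requested - n^3 estimate: " ++ show (requested - flatProductBytes n))
```

For n = 600 this gives 2880000 bytes for one matrix and 1728000000 bytes for the n^3 estimate, which is 53248 bytes less than the requested amount.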
DPH code:
{-# LANGUAGE ParallelArrays, ParallelListComp #-}
{-# OPTIONS -fvectorise #-}
module DPH_mmult_wrapper (matMult_wrapper, Matrix_wrapper) where
import qualified Prelude
import Data.Array.Parallel
import Data.Array.Parallel.Prelude.Double as D
import Data.Array.Parallel.Prelude.Int as I
type MMultType = Double
type Matrix = [:[:MMultType:]:]
type MVector = [:MMultType:]
type Matrix_wrapper = PArray (PArray MMultType)
-- matMult_wrapper assumes mB is already transposed
{-# NOINLINE matMult_wrapper #-}
matMult_wrapper :: Matrix_wrapper -> Matrix_wrapper -> Matrix_wrapper
matMult_wrapper mA mB = toPArrayP (mapP toPArrayP (matMult (fromNestedPArrayP mA) (fromNestedPArrayP mB)))
matMult :: Matrix -> Matrix -> Matrix
matMult mA mB = mapP (\row -> mapP (\col -> dotp row col) mB) mA
dotp :: MVector -> MVector -> MMultType
dotp row col = D.sumP (zipWithP (D.*) row col)
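For comparison, here is a minimal sketch of the same algorithm on ordinary Haskell lists, without DPH. It is not part of the failing program; it is only meant as a sequential reference for checking results. Like matMult above, it assumes the second matrix is already transposed.

```haskell
module Main where

type Matrix = [[Double]]

-- Dot product of two vectors of equal length.
dotp :: [Double] -> [Double] -> Double
dotp row col = sum (zipWith (*) row col)

-- Matrix product; mB must already be transposed, as in the DPH version.
matMult :: Matrix -> Matrix -> Matrix
matMult mA mB = map (\row -> map (\col -> dotp row col) mB) mA

main :: IO ()
main = do
  let a  = [[1, 2], [3, 4]]
      bT = [[5, 7], [6, 8]]  -- transpose of [[5, 6], [7, 8]]
  print (matMult a bT)       -- prints [[19.0,22.0],[43.0,50.0]]
```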
I'm reporting this because I think it is the kind of problem that the latest design of DPH's internal array representation (the one from the "Work Efficient Higher-Order Vectorisation" paper) was intended to solve.
If there is any information missing, please comment and I will update the report.
Thanks.