wiki:NestedCPR/Akio2017

This page tracks @akio's attempt at reviving @nomeata's work on NestedCPR.

Latest code can be found at https://github.com/takano-akio/ghc/compare/master...nested-cpr .

Design

Changes to CPRResult

  • In the product case (RedProd), keep one DmdResult for each component. This is the main change in the branch.
  • Add NeverReturns contructor. It means that no value is returned from this expression. It's different from ThrowsExn in that it might perform some side effect. Indeed it might be an IO-performing infinite loop. The addition of this constructor is necessary to infer a nested CPR property for a tail- recursive function that does some I/O.

In summary, CPRResult is changed from:

                   NoCPR
                   /    \
              RetProd   RetSum ConTag

to:

                   NoCPR
                   /    \
  RetProd [DmdResult]    RetSum ConTag
                  \     /
                NeverReturns

Changes to DmdResult

  • The new Converges contructor is added to track definite convergence.
               Dunno CPRResult
               /         \
           ThrowsExn      Converges CPRResult (new)
               /
        Diverges

This information is used to tell when it is safe to perform a nested CPR worker-wrapper transformation. Unpacking a nested component in the return value is safe only when that component definitely converges.

Changes to the demand analyzer

  • Infer nested CPR property when possible.
  • Slightly strenghten the strictness analysis so that in the following code, both foo and bar get the strictness S(SS) on their first argument. Previously only foo got S(SS), where bar got S.
{-# LANGUAGE BangPatterns #-}
data Complex a = !a :+ !a

foo :: Complex Double -> Int -> Complex Double
foo !x 0 = x
foo (a :+ b) _ = b :+ a

bar :: Complex Double -> Int -> Complex Double
bar x 1 = x
bar (a :+ b) _ = b :+ a

Changes to the worker-wrapper transformer

  • Apply nested CPR transformation. For example, if a function that returns (# State# RealWorld, (Int, Int, Int) #) has the CPR information m(t, tm(tm(t), m(t), t)), then the worker function would return the type (# State# RealWorld, Int#, Int, Int #), unboxing the triple and one of the Ints.

Examples

simple.hs

This is a simple recursive function with an easy-to-spot nested CPR property.

Status: ok

module Foo where
f :: Int -> (Int, Int)
f 0 = (1, 2)
f n
  | even n = f (div n 2)
  | otherwise = case f (n - 1) of
    (a, b) -> (a - 1, b)

strictness.hs

This one is trickier in that the analysis has to use the nested strictness of foo on p to give p a nested CPR property. Inspired by nofib/imaginary/x2n1.

Status: ok

{-# LANGUAGE BangPatterns #-}
module Foo where

foo :: Int -> (Int, Int) -> (Int, Int)
foo n p
  | even (n + uncurry (+) p), n /= 0 = foo (n - 1) p
  | n == 0 = (1, 2)
  | otherwise = p

strict_field.hs

This one needs a correct handling of strict constructor fields. Inspired by nofib/imaginary/x2n1.

Status: ok

{-# LANGUAGE BangPatterns #-}
module Foo where
data C a = C !a !a

pow :: C Double -> Int -> C Double
pow !_ 0 = C 0 1
pow !c 1 = c
pow c n
  | even n = let d = pow c (div n 2) in mul d d
  | otherwise = mul c (pow c (n - 1))

mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)

strict_field1.hs

This is similar to strict_field.hs, but needs a more aggressive worker-wrapper.

Status: ok

Changing CPR analysis alone wouldn't help here. We need to give the function a better strictness as well.

module Foo (pow) where

data C a = C !a !a

pow :: C Double -> Int -> C Double
pow x y
  | even y    = pow (x `mul` x) (y `quot` 2)
  | y == 1    = x
  | otherwise = pow (x `mul` x) ((y - 1) `quot` 2) `mul` x

mul :: C Double -> C Double -> C Double
mul (C a b) (C d e) = C (a*d-b*e) (a*e+b*d)

peek.hs

This example involves reading a tuple of Ints from memory.

Status: needs -fcpr-depth=4 or higher.

{-# LANGUAGE BangPatterns #-}
module Foo (peek4) where

import Foreign.Storable
import Foreign.Ptr

peek4 :: Ptr Int -> Ptr Int -> (Ptr Int -> IO (Ptr Int, Ptr Int)) -> IO (Ptr Int, Ptr Int, (Int, Int, Int, Int))
peek4 ptr end req
  | end `minusPtr` ptr >= 32 = do
      val <- (,,,)
          <$> peekByteOff ptr 0
          <*> peekByteOff ptr 8
          <*> peekByteOff ptr 16
          <*> peekByteOff ptr 24
      let !ptr' = ptr `plusPtr` 32
      return (ptr', end, val)
  | otherwise = do
      (ptr', end') <- req ptr
      peek4 ptr' end' req

beam-word-poke.hs

This is a real-world example taken from https://github.com/tsurucapital/beamable/blob/master/src/Data/Beamable/Internal.hs#L159. It serializes a Word using a variable-length encoding.

Status: ok

module Foo(beamWordPoke) where

import Data.Bits
import Data.Monoid
import Data.Word
import Foreign.Ptr
import Foreign.Storable

newtype Poke = Poke (Ptr Word8 -> IO (Ptr Word8))

instance Monoid Poke where
  mempty = Poke return
  mappend (Poke a) (Poke b) = Poke $ \ptr -> a ptr >>= b

beamWordPoke :: Word -> Poke
beamWordPoke n
    | next == 0 = pokeWord8 firstSeptet
    | otherwise = pokeWord8 (firstSeptet .|. 0x80) <> beamWordPoke next
    where
        firstSeptet :: Word8
        firstSeptet = fromIntegral $ n .&. 0x7F
        next = n `shiftR` 7

pokeWord8 :: Word8 -> Poke
pokeWord8 w = Poke $ \p -> do poke p w; return $! p `plusPtr` 1
Last modified 8 weeks ago Last modified on Oct 2, 2017 6:54:40 AM