INLINE and manually inlining produce different code
Mikolaj reported that he was seeing significantly different code generated in the case of an INLINE
pragma versus manually inlining. I haven't looked into what the cause it and this isn't necessarily problematic; this is just a reminder to look into what is happening.
Edit, by Mikolaj: here is a minimal example:
-- ghc --make Main.hs -O1; ./Main +RTS -s -RTS
seqFrame2 :: [Int] -> IO ()
{-# NOINLINE seqFrame2 #-}
seqFrame2 l = do
let crux = attrFromInt
-- Total time 2.052s ( 2.072s elapsed)
-- but the following version is many times slower:
-- let crux = attrFromIntINLINE
-- Total time 7.896s ( 7.929s elapsed)
mapM_ (\a -> crux a `seq` return ()) l
main :: IO ()
main = seqFrame2 $ replicate 100000000 0
data Attr = Attr !Int --- the bang is essential
attrFromInt :: Int -> Attr
{-# NOINLINE attrFromInt #-}
attrFromInt w = Attr (w + (2 ^ (8 :: Int)))
fgFromInt :: Int -> Int
{-# INLINE fgFromInt #-} -- removing this INLINE makes it many times faster
-- just like the manually inlined version
-- and NOINLINE lands in between
fgFromInt w = w + (2 ^ (8 :: Int))
attrFromIntINLINE :: Int -> Attr
{-# NOINLINE attrFromIntINLINE #-}
attrFromIntINLINE w = Attr (fgFromInt w)
My guess is the inlining forced with INLINE is just broken in GHC 8.0.1 --- it omits some optimizations that ghc would normally do on manually inlined code, such as floating out common subexpressions.