Opened 6 years ago

Last modified 3 months ago

#3458 new bug

Allocation where none should happen

Reported by: guest Owned by:
Priority: lowest Milestone: 7.12.1
Component: Compiler Version: 6.10.4
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #1216, #5945, #6040 Differential Revisions:

Description (last modified by simonpj)

These two functions, according to profiling, do a lot of allocation:

gen d r n m s p
    | r == ll   = do
        pokeElemOff p n 0x0a
        gen d 0     (n+1) (m+1) s p
    | n == m    = do
        pokeElemOff p n 0x0a
        return (s, if r == 0 then m else m+1)
    | otherwise = do
        let t = next s
        pokeElemOff p n (pick d t)
        gen d (r+1) (n+1) m     t p

------------------------------------------------------------------------

pick (c, p) r = loop 0 where
    loop i = if r < unsafeAt p i
              then fromIntegral $ unsafeAt c i :: Word8
              else loop (i+1)

Core for pick:

[GlobalId]
[Arity 3
 NoCafRefs
 Str: DmdType LLL]
$w$spick_r3kC =
  \ (ww_s33o :: GHC.Prim.ByteArray#)
    (ww1_s33v :: GHC.Prim.ByteArray#)
    (ww2_s33A :: GHC.Prim.Word#) ->
    letrec {
      $wloop_s38I :: GHC.Prim.Int# -> GHC.Prim.Word#
      [Arity 1
       Str: DmdType L]
      $wloop_s38I =
        \ (ww3_s339 :: GHC.Prim.Int#) ->
          __scc {pick main:Main !}
          case GHC.Prim.ltWord#
                 ww2_s33A (GHC.Prim.indexWord32Array# ww1_s33v ww3_s339)
          of wild_X3O {
            GHC.Bool.False -> $wloop_s38I (GHC.Prim.+# ww3_s339 1);
            GHC.Bool.True ->
              GHC.Prim.narrow8Word# (GHC.Prim.indexWord32Array# ww_s33o ww3_s339)
          }; } in
    case __scc {pick main:Main}
         case $wloop_s38I 0 of ww3_s33d { __DEFAULT ->
         GHC.Word.W8# ww3_s33d
         }
    of ww3_s33D { GHC.Word.W8# ww4_s33E ->
    ww4_s33E
    }

Core for gen (long):

Rec {
$s$wa_r3mi :: GHC.Prim.State# GHC.Prim.RealWorld
              -> GHC.Prim.Addr#
              -> GHC.Prim.Word#
              -> GHC.Prim.Int#
              -> GHC.Prim.Int#
              -> GHC.Prim.Int#
              -> GHC.Prim.ByteArray#
              -> GHC.Types.Int
              -> GHC.Types.Int
              -> GHC.Types.Int
              -> GHC.Prim.ByteArray#
              -> GHC.Types.Int
              -> GHC.Types.Int
              -> GHC.Types.Int
              -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                    (GHC.Word.Word32, GHC.Types.Int) #)
[GlobalId]
[Arity 14
 NoCafRefs]
$s$wa_r3mi =
  \ (sc_s3es :: GHC.Prim.State# GHC.Prim.RealWorld)
    (sc1_s3et :: GHC.Prim.Addr#)
    (sc2_s3eu :: GHC.Prim.Word#)
    (sc3_s3ev :: GHC.Prim.Int#)
    (sc4_s3ew :: GHC.Prim.Int#)
    (sc5_s3ex :: GHC.Prim.Int#)
    (sc6_s3ey :: GHC.Prim.ByteArray#)
    (sc7_s3ez :: GHC.Types.Int)
    (sc8_s3eA :: GHC.Types.Int)
    (sc9_s3eB :: GHC.Types.Int)
    (sc10_s3eC :: GHC.Prim.ByteArray#)
    (sc11_s3eD :: GHC.Types.Int)
    (sc12_s3eE :: GHC.Types.Int)
    (sc13_s3eF :: GHC.Types.Int) ->
    let {
      m_s39b :: GHC.Types.Int
      []
      m_s39b = GHC.Types.I# sc3_s3ev } in
    ((__scc {gen main:Main !}
      case sc5_s3ex of wild_B1 {
        __DEFAULT ->
          case GHC.Prim.==# sc4_s3ew sc3_s3ev of wild1_X3F {
            GHC.Bool.False ->
              (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
                 let {
                   ww_s33e :: GHC.Prim.Word#
                   []
                   ww_s33e =
                     GHC.Prim.remWord#
                       (GHC.Prim.narrow32Word#
                          (GHC.Prim.plusWord#
                             (GHC.Prim.narrow32Word# (GHC.Prim.timesWord# __word 3877 sc2_s3eu))
                             __word 29573))
                       __word 139968 } in
                 case $w$spick_r3k8 sc10_s3eC sc6_s3ey ww_s33e
                 of ww1_s33i { __DEFAULT ->
                 case GHC.Prim.writeWord8OffAddr#
                        @ GHC.Prim.RealWorld sc1_s3et sc4_s3ew ww1_s33i eta_a2vm
                 of s21_a2wV { __DEFAULT ->
                 $s$wa_r3mi
                   s21_a2wV
                   sc1_s3et
                   ww_s33e
                   sc3_s3ev
                   (GHC.Prim.+# sc4_s3ew 1)
                   (GHC.Prim.+# wild_B1 1)
                   sc6_s3ey
                   sc7_s3ez
                   sc8_s3eA
                   sc9_s3eB
                   sc10_s3eC
                   sc11_s3eD
                   sc12_s3eE
                   sc13_s3eF
                 }
                 })
              `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                      :: GHC.Prim.State# GHC.Prim.RealWorld
                         -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                               (GHC.Word.Word32, GHC.Types.Int) #)
                           ~
                         GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int));
            GHC.Bool.True ->
              (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
                 case GHC.Prim.writeWord8OffAddr#
                        @ GHC.Prim.RealWorld sc1_s3et sc4_s3ew __word 10 eta_a2vm
                 of s21_a2wV { __DEFAULT ->
                 (# s21_a2wV,
                    (GHC.Word.W32# sc2_s3eu,
                     case wild_B1 of wild2_X4o {
                       __DEFAULT -> GHC.Types.I# (GHC.Prim.+# sc3_s3ev 1); 0 -> m_s39b
                     }) #)
                 })
              `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                      :: GHC.Prim.State# GHC.Prim.RealWorld
                         -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                               (GHC.Word.Word32, GHC.Types.Int) #)
                           ~
                         GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
          };
        60 ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld sc1_s3et sc4_s3ew __word 10 eta_a2vm
             of s21_a2wV { __DEFAULT ->
             $s$wa1_r3mk
               s21_a2wV
               sc1_s3et
               sc2_s3eu
               (GHC.Prim.+# sc3_s3ev 1)
               (GHC.Prim.+# sc4_s3ew 1)
               sc6_s3ey
               sc7_s3ez
               sc8_s3eA
               sc9_s3eB
               sc10_s3eC
               sc11_s3eD
               sc12_s3eE
               sc13_s3eF
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
      })
     `cast` ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int)
             :: GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int)
                  ~
                GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                      (GHC.Word.Word32, GHC.Types.Int) #)))
      sc_s3es
$s$wa1_r3mk :: GHC.Prim.State# GHC.Prim.RealWorld
               -> GHC.Prim.Addr#
               -> GHC.Prim.Word#
               -> GHC.Prim.Int#
               -> GHC.Prim.Int#
               -> GHC.Prim.ByteArray#
               -> GHC.Types.Int
               -> GHC.Types.Int
               -> GHC.Types.Int
               -> GHC.Prim.ByteArray#
               -> GHC.Types.Int
               -> GHC.Types.Int
               -> GHC.Types.Int
               -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                     (GHC.Word.Word32, GHC.Types.Int) #)
[GlobalId]
[Arity 13
 NoCafRefs]
$s$wa1_r3mk =
  \ (sc_s3fH :: GHC.Prim.State# GHC.Prim.RealWorld)
    (sc1_s3fI :: GHC.Prim.Addr#)
    (sc2_s3fJ :: GHC.Prim.Word#)
    (sc3_s3fK :: GHC.Prim.Int#)
    (sc4_s3fL :: GHC.Prim.Int#)
    (sc5_s3fM :: GHC.Prim.ByteArray#)
    (sc6_s3fN :: GHC.Types.Int)
    (sc7_s3fO :: GHC.Types.Int)
    (sc8_s3fP :: GHC.Types.Int)
    (sc9_s3fQ :: GHC.Prim.ByteArray#)
    (sc10_s3fR :: GHC.Types.Int)
    (sc11_s3fS :: GHC.Types.Int)
    (sc12_s3fT :: GHC.Types.Int) ->
    let {
      m_s39b :: GHC.Types.Int
      []
      m_s39b = GHC.Types.I# sc3_s3fK } in
    ((__scc {gen main:Main !}
      case GHC.Prim.==# sc4_s3fL sc3_s3fK of wild_X3F {
        GHC.Bool.False ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             let {
               ww_s33e :: GHC.Prim.Word#
               []
               ww_s33e =
                 GHC.Prim.remWord#
                   (GHC.Prim.narrow32Word#
                      (GHC.Prim.plusWord#
                         (GHC.Prim.narrow32Word# (GHC.Prim.timesWord# __word 3877 sc2_s3fJ))
                         __word 29573))
                   __word 139968 } in
             case $w$spick_r3k8 sc9_s3fQ sc5_s3fM ww_s33e
             of ww1_s33i { __DEFAULT ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld sc1_s3fI sc4_s3fL ww1_s33i eta_a2vm
             of s21_a2wV { __DEFAULT ->
             $s$wa_r3mi
               s21_a2wV
               sc1_s3fI
               ww_s33e
               sc3_s3fK
               (GHC.Prim.+# sc4_s3fL 1)
               1
               sc5_s3fM
               sc6_s3fN
               sc7_s3fO
               sc8_s3fP
               sc9_s3fQ
               sc10_s3fR
               sc11_s3fS
               sc12_s3fT
             }
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int));
        GHC.Bool.True ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld sc1_s3fI sc4_s3fL __word 10 eta_a2vm
             of s21_a2wV { __DEFAULT ->
             (# s21_a2wV, (GHC.Word.W32# sc2_s3fJ, m_s39b) #)
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
      })
     `cast` ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int)
             :: GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int)
                  ~
                GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                      (GHC.Word.Word32, GHC.Types.Int) #)))
      sc_s3fH
end Rec }

$s$wa2_r3mm :: GHC.Prim.State# GHC.Prim.RealWorld
               -> GHC.Prim.Addr#
               -> GHC.Word.Word32
               -> GHC.Prim.Int#
               -> GHC.Prim.Int#
               -> (Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32,
                   Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32)
               -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                     (GHC.Word.Word32, GHC.Types.Int) #)
[GlobalId]
[Arity 6
 NoCafRefs]
$s$wa2_r3mm =
  \ (sc_s3eH :: GHC.Prim.State# GHC.Prim.RealWorld)
    (sc1_s3eI :: GHC.Prim.Addr#)
    (sc2_s3eJ :: GHC.Word.Word32)
    (sc3_s3eK :: GHC.Prim.Int#)
    (sc4_s3eL :: GHC.Prim.Int#)
    (sc5_s3eM :: (Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32,
                  Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32)) ->
    let {
      m_s39b :: GHC.Types.Int
      []
      m_s39b = GHC.Types.I# sc3_s3eK } in
    ((__scc {gen main:Main !}
      case GHC.Prim.==# sc4_s3eL sc3_s3eK of wild_X3F {
        GHC.Bool.False ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             case sc5_s3eM of w_X34x { (ww_s32X, ww1_s334) ->
             case ww_s32X
             of ww2_X34F
             { Data.Array.Base.UArray ww3_s32Z ww4_s330 ww5_s331 ww6_s332 ->
             case ww1_s334
             of ww7_X34X
             { Data.Array.Base.UArray ww8_s336 ww9_s337 ww10_s338 ww11_s339 ->
             case __scc {next main:Main}
                  case sc2_s3eJ of wild1_a2Bw { GHC.Word.W32# y#_a2By ->
                  GHC.Word.W32#
                    (GHC.Prim.remWord#
                       (GHC.Prim.narrow32Word#
                          (GHC.Prim.plusWord#
                             (GHC.Prim.narrow32Word# (GHC.Prim.timesWord# __word 3877 y#_a2By))
                             __word 29573))
                       __word 139968)
                  }
             of w1_X35g { GHC.Word.W32# ww12_s33e ->
             case $w$spick_r3k8 ww6_s332 ww11_s339 ww12_s33e
             of ww13_s33i { __DEFAULT ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld sc1_s3eI sc4_s3eL ww13_s33i eta_a2vm
             of s21_a2wV { __DEFAULT ->
             $s$wa_r3mi
               s21_a2wV
               sc1_s3eI
               ww12_s33e
               sc3_s3eK
               (GHC.Prim.+# sc4_s3eL 1)
               1
               ww11_s339
               ww10_s338
               ww9_s337
               ww8_s336
               ww6_s332
               ww5_s331
               ww4_s330
               ww3_s32Z
             }
             }
             }
             }
             }
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int));
        GHC.Bool.True ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld sc1_s3eI sc4_s3eL __word 10 eta_a2vm
             of s21_a2wV { __DEFAULT ->
             (# s21_a2wV, (sc2_s3eJ, m_s39b) #)
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
      })
     `cast` ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int)
             :: GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int)
                  ~
                GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                      (GHC.Word.Word32, GHC.Types.Int) #)))
      sc_s3eH

$wa1_r3mo :: (Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32,
              Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32)
             -> GHC.Prim.Int#
             -> GHC.Prim.Int#
             -> GHC.Prim.Int#
             -> GHC.Word.Word32
             -> GHC.Prim.Addr#
             -> GHC.Prim.State# GHC.Prim.RealWorld
             -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                   (GHC.Word.Word32, GHC.Types.Int) #)
[GlobalId]
[Arity 7
 NoCafRefs
 Str: DmdType LLLLLLL]
$wa1_r3mo =
  \ (w_s33r :: (Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32,
                Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32))
    (ww_s33u :: GHC.Prim.Int#)
    (ww1_s33y :: GHC.Prim.Int#)
    (ww2_s33C :: GHC.Prim.Int#)
    (w1_s33E :: GHC.Word.Word32)
    (ww3_s33H :: GHC.Prim.Addr#)
    (w2_s33J :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    let {
      m_s39b :: GHC.Types.Int
      []
      m_s39b = GHC.Types.I# ww2_s33C } in
    ((__scc {gen main:Main !}
      case ww_s33u of wild_B1 {
        __DEFAULT ->
          case GHC.Prim.==# ww1_s33y ww2_s33C of wild1_X3F {
            GHC.Bool.False ->
              (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
                 case w_s33r of w3_X34x { (ww4_s32X, ww5_s334) ->
                 case ww4_s32X
                 of ww6_X34F
                 { Data.Array.Base.UArray ww7_s32Z ww8_s330 ww9_s331 ww10_s332 ->
                 case ww5_s334
                 of ww11_X34X
                 { Data.Array.Base.UArray ww12_s336 ww13_s337 ww14_s338 ww15_s339 ->
                 case __scc {next main:Main}
                      case w1_s33E of wild11_a2Bw { GHC.Word.W32# y#_a2By ->
                      GHC.Word.W32#
                        (GHC.Prim.remWord#
                           (GHC.Prim.narrow32Word#
                              (GHC.Prim.plusWord#
                                 (GHC.Prim.narrow32Word# (GHC.Prim.timesWord# __word 3877 y#_a2By))
                                 __word 29573))
                           __word 139968)
                      }
                 of w4_X35g { GHC.Word.W32# ww16_s33e ->
                 case $w$spick_r3k8 ww10_s332 ww15_s339 ww16_s33e
                 of ww17_s33i { __DEFAULT ->
                 case GHC.Prim.writeWord8OffAddr#
                        @ GHC.Prim.RealWorld ww3_s33H ww1_s33y ww17_s33i eta_a2vm
                 of s21_a2wV { __DEFAULT ->
                 $s$wa_r3mi
                   s21_a2wV
                   ww3_s33H
                   ww16_s33e
                   ww2_s33C
                   (GHC.Prim.+# ww1_s33y 1)
                   (GHC.Prim.+# wild_B1 1)
                   ww15_s339
                   ww14_s338
                   ww13_s337
                   ww12_s336
                   ww10_s332
                   ww9_s331
                   ww8_s330
                   ww7_s32Z
                 }
                 }
                 }
                 }
                 }
                 })
              `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                      :: GHC.Prim.State# GHC.Prim.RealWorld
                         -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                               (GHC.Word.Word32, GHC.Types.Int) #)
                           ~
                         GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int));
            GHC.Bool.True ->
              (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
                 case GHC.Prim.writeWord8OffAddr#
                        @ GHC.Prim.RealWorld ww3_s33H ww1_s33y __word 10 eta_a2vm
                 of s21_a2wV { __DEFAULT ->
                 (# s21_a2wV,
                    (w1_s33E,
                     case wild_B1 of wild2_X4o {
                       __DEFAULT -> GHC.Types.I# (GHC.Prim.+# ww2_s33C 1); 0 -> m_s39b
                     }) #)
                 })
              `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                      :: GHC.Prim.State# GHC.Prim.RealWorld
                         -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                               (GHC.Word.Word32, GHC.Types.Int) #)
                           ~
                         GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
          };
        60 ->
          (\ (eta_a2vm :: GHC.Prim.State# GHC.Prim.RealWorld) ->
             case GHC.Prim.writeWord8OffAddr#
                    @ GHC.Prim.RealWorld ww3_s33H ww1_s33y __word 10 eta_a2vm
             of s21_a2wV { __DEFAULT ->
             $s$wa2_r3mm
               s21_a2wV
               ww3_s33H
               w1_s33E
               (GHC.Prim.+# ww2_s33C 1)
               (GHC.Prim.+# ww1_s33y 1)
               w_s33r
             })
          `cast` (sym ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int))
                  :: GHC.Prim.State# GHC.Prim.RealWorld
                     -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                           (GHC.Word.Word32, GHC.Types.Int) #)
                       ~
                     GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int))
      })
     `cast` ((GHC.IOBase.:CoIO) (GHC.Word.Word32, GHC.Types.Int)
             :: GHC.IOBase.IO (GHC.Word.Word32, GHC.Types.Int)
                  ~
                GHC.Prim.State# GHC.Prim.RealWorld
                -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                      (GHC.Word.Word32, GHC.Types.Int) #)))
      w2_s33J

a2_r3mq :: (Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32,
            Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32)
           -> GHC.Types.Int
           -> GHC.Types.Int
           -> GHC.Types.Int
           -> GHC.Word.Word32
           -> GHC.Ptr.Ptr GHC.Word.Word8
           -> GHC.Prim.State# GHC.Prim.RealWorld
           -> (# GHC.Prim.State# GHC.Prim.RealWorld,
                 (GHC.Word.Word32, GHC.Types.Int) #)
[GlobalId]
[Arity 7
 NoCafRefs
 Str: DmdType LU(L)U(L)U(L)LU(L)L]
a2_r3mq =
  __inline_me (\ (w_s33r :: (Data.Array.Base.UArray
                               GHC.Types.Int GHC.Word.Word32,
                             Data.Array.Base.UArray GHC.Types.Int GHC.Word.Word32))
                 (w1_s33s :: GHC.Types.Int)
                 (w2_s33w :: GHC.Types.Int)
                 (w3_s33A :: GHC.Types.Int)
                 (w4_s33E :: GHC.Word.Word32)
                 (w5_s33F :: GHC.Ptr.Ptr GHC.Word.Word8)
                 (w6_s33J :: GHC.Prim.State# GHC.Prim.RealWorld) ->
                 case w1_s33s of w7_X35h { GHC.Types.I# ww_s33u ->
                 case w2_s33w of w8_X35q { GHC.Types.I# ww1_s33y ->
                 case w3_s33A of w9_X35z { GHC.Types.I# ww2_s33C ->
                 case w5_s33F of w10_X35J { GHC.Ptr.Ptr ww3_s33H ->
                 $wa1_r3mo w_s33r ww_s33u ww1_s33y ww2_s33C w4_s33E ww3_s33H w6_s33J
                 }
                 }
                 }
                 })

Change History (20)

comment:1 Changed 6 years ago by simonpj

  • Description modified (diff)
  • difficulty set to Unknown

What are ll and next? It's easier if you give a reproducible test case. Thanks

Simon

comment:2 Changed 6 years ago by guest

ll = 60,

next :: Word32 -> Word32
next s = (ia*s + ic) `rem` im

ia = 3877
ic = 29573
im = 139968

The whole program (changed to use IOUArray everywhere now, but the problem remains):

{-# OPTIONS -O2 -funbox-strict-fields -fexcess-precision -fvia-C -optc-O3 -optc-ffast-math -optc-fomit-frame-pointer -optc-march=native -optc-mfpmath=sse -optc-msse3 #-}

------------------------------------------------------------------------

---
---  The Computer Language Benchmarks Game
---
---    http://shootout.alioth.debian.org
---
---            Fasta Benchmark
---
---         Program by Rohan Lean
---

------------------------------------------------------------------------

import Control.Arrow
import Control.Concurrent
import Data.Array.Base
import Data.Array.IO
import Data.Array.Unboxed
import Data.ByteString.Internal
import Data.Word
import System
import System.IO

------------------------------------------------------------------------

main = do n <- readIO . head =<< getArgs

          putStrLn ">ONE Homo sapiens alu"
          write_alu (2*n)

          putStrLn ">TWO IUB ambiguity codes"
          s <- write iub (3*n) 42

          putStrLn ">THREE Homo sapiens frequency"
          write hom (5*n) s


------------------------------------------------------------------------

ll = 60 -- line length

------------------------------------------------------------------------

write_alu n = loop n =<< newListArray (1,bs) ul where
    cc = length alu_string `lcm` ll
    bs = cc + quot cc ll
    un = \s -> (take ll s) ++ [0x0a] ++ un (drop ll s)
    ul = un $ cycle $ map c2w alu_string
    loop n b
        | cc <= n   = do
              hPutArray stdout b bs
              loop (n-cc) b
        | otherwise = do
              hPutArray stdout b (n + quot n ll)
              if rem n ll /= 0
                then putChar '\n'
                else return ()

------------------------------------------------------------------------

---
---  Constants for the linear congruential PRNG
---

ia = 3877
ic = 29573
im = 139968

------------------------------------------------------------------------

next :: Word32 -> Word32
next s = (ia*s + ic) `rem` im

skip n s = foldr id s $ replicate n next

------------------------------------------------------------------------

tn = 1     -- number of working threads

lc = 250   -- threads prepare that many lines

cc = lc*ll -- thus many characters

bs = cc+lc -- buffersize

------------------------------------------------------------------------

write d n s = do
    go_1 <- newMVar ()
    done <- newEmptyMVar
    spawn tn (convert d) n s go_1 go_1 done

------------------------------------------------------------------------

spawn 1 d n s go_k go_1 done = do
    a <- newArray (1,bs) 0x0a
    forkIO $ writer d n a s go_k go_1 done
    takeMVar done

spawn t d n s go_k go_1 done = do
    go_next <- newEmptyMVar
    a <- newArray (1,bs) 0x0a
    forkIO $ writer d n a s go_k go_next done
    spawn (t-1) d (max (n-cc) 0) (skip cc s) go_next go_1 done

------------------------------------------------------------------------

writer d 0 a s go go_next done = killThread =<< myThreadId
writer d n a s go go_next done = do
    (t,br) <- gen d 0 0 cr s a
    takeMVar go
    hPutArray stdout a br
    putMVar go_next ()
    if n-cr == 0 then putMVar done t
                 else return ()
    let u = skip (cc*(tn-1)) t
    writer d n' a u go go_next done
    where
        cr = min n cc
        n' = max 0 (n-cc*tn)

------------------------------------------------------------------------

gen d r n m s a
    | r == ll   = gen d 0     (n+1) (m+1) s a
    | n == m    = do
        unsafeWrite a n 0x0a
        return (s, if r == 0 then m else m+1)
    | otherwise = do
        let t = next s
        unsafeWrite a n (pick d t)
        gen d (r+1) (n+1) m     t a

------------------------------------------------------------------------

pick (c,p) r = loop 0 where
    loop i = if r < unsafeAt p i
               then fromIntegral $ unsafeAt c i :: Word8
               else loop (i+1)

------------------------------------------------------------------------

convert :: [(Char, Float)] -> ((UArray Int Word32), (UArray Int Word32))
convert t = (a c, a p)
    where
        a s   = listArray (1, fromIntegral $ length t) s
        (c,p) = map fromIntegral *** map (ceiling . (* fromIntegral im))
              $ map c2w          *** scanl1 (+)
              $ unzip t

------------------------------------------------------------------------

alu_string = "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG\
             \GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA\
             \CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT\
             \ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA\
             \GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG\
             \AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC\
             \AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"

------------------------------------------------------------------------

iub = [('a',0.27),('c',0.12),('g',0.12),('t',0.27),('B',0.02)
      ,('D',0.02),('H',0.02),('K',0.02),('M',0.02),('N',0.02)
      ,('R',0.02),('S',0.02),('V',0.02),('W',0.02),('Y',0.02)]

hom = [('a',0.3029549426680),('c',0.1979883004921)
      ,('g',0.1975473066391),('t',0.3015094502008)]

comment:3 Changed 6 years ago by guest

  • Version changed from 6.10.1 to 6.10.4

A few more notes:
I could reproduce the bug with 6.10.4.
When profiling, there is roughly ten times more allocation happening, which gets attributed to the program. Nevertheless, even when running a normal build the program allocates around 4.6GB here, and I don't see where all that allocation comes from.

comment:4 Changed 6 years ago by igloo

  • Milestone set to 6.12 branch

comment:5 Changed 5 years ago by simonpj

I've had a look at this. I ran it with

./runST 100000 > /dev/null

Interesting. The allocation you are concerned about arises from the local 'loop' function in 'pick'. GHC allocates a function closure for it, once per call of 'pick', which isn't really necessary.

There are two ways to stop this happening. One way is to make 'loop' a top level function like this

pick (c,p) r = myloop c p r 0

myloop c p r i = if r < unsafeAt p i
                 then fromIntegral $ unsafeAt c i :: Word8
                 else myloop c p r (i+1)

This reduces allocation from 35Mbytes to 3Mbytes, by getting rid of those loop closures.

But this is not very cool. In fact 'loop' is totally tail-recursive, and should never be heap allocated. It's very nearly a let-no-escape thing but not quite. Here is a smaller example:

f x = let g y = if y then x else g y
      in g x

If you compile with -ddump-stg you'll see that g gets the "let-no-escape" property, which means that it doesn't get a heap closure allocated for it.

But if you make this little change:

f x = let g y = if y then x else g y
      in case g x of
           True -> False
           False -> True

then g doesn't get the let-no-escape property. And for good reason: you can't just adjust the stack pointer and jump to g.

But you could do so if the 'let' was floated inwards, just before core-to-stg, thus:

f x = case let g y = if y then x else g y
           in g x 
       of
           True -> False
           False -> True

Now it has the let-no-escape property again.

I'd never really thought of that. Food for thought here.

(These remarks are intended mainly for me and other over-interested parties. As a workaround to get your performance up, try the lambda-lifting thing I show first.)

Simon

comment:6 Changed 5 years ago by simonpj

  • Type of failure set to None/Unknown

I found that #1216 is another example of the same phenomenon.

Simon

comment:7 Changed 5 years ago by igloo

  • Milestone changed from 6.12 branch to 6.12.3

comment:8 Changed 5 years ago by igloo

  • Milestone changed from 6.12.3 to 6.14.1
  • Priority changed from normal to low

comment:9 Changed 4 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:10 Changed 4 years ago by batterseapower

Is there ever any reason to float lets out of a case scrutinee (as long as you are careful that this doesn't disable case-of-known-constructor)?

comment:11 Changed 4 years ago by simonpj

In general, GHC floats lets out of every strict context (eg strict function arguments, as well as case scrutinees) to attempt to make things like case-of-known-constructor fire more easily.

This ticket is about quite a low-level question... I think it's best tackled by either

  • Late lambda lifting, or
  • Late float-in

where by "late" I mean just before CorePrep or something... ie transformations are all done, and we're just trying to put the program in the best form for code generation.

Simon

comment:12 Changed 4 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:13 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:14 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from low to lowest

comment:15 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:16 Changed 15 months ago by nomeata

Just for the record: The code

f x = let g y = if y then x else g y
      in case g x of
           True -> False
           False -> True

in comment:5 would also be improved by the common context transformation proposed here: http://www.haskell.org/pipermail/ghc-devs/2013-December/003481.html

comment:17 Changed 9 months ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:18 Changed 4 months ago by thomie

  • Architecture changed from x86_64 (amd64) to Unknown/Multiple
  • Operating System changed from Linux to Unknown/Multiple
  • Type of failure changed from None/Unknown to Runtime performance bug

comment:19 Changed 3 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:20 Changed 3 months ago by thoughtpolice

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

Note: See TracTickets for help on using tickets.