Opened 9 years ago

Last modified 19 months ago

#2439 new bug

Missed optimisation with dictionaries and loops

Reported by: rl Owned by: simonpj
Priority: lowest Milestone:
Component: Compiler Version: 6.9
Keywords: Cc: carter.schonwald@…, ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

{-# LANGUAGE BangPatterns #-}
module Foo (sum') where

foldl' :: (a -> b -> a) -> a -> [b] -> a
{-# INLINE foldl' #-}
foldl' f !z xs = loop z xs
  where
    loop !z [] = z
    loop !z (x:xs) = loop (f z x) xs

sum' :: Num a => [a] -> a
sum' xs = foldl' (+) 0 xs

This is the code before LiberateCase:

Foo.sum' =
  \ (@ a_a9T) ($dNum_aa1 [ALWAYS Just L] :: GHC.Num.Num a_a9T) ->
    let {
      lit_scm [ALWAYS Just L] :: a_a9T
      [Str: DmdType]
      lit_scm =
        case $dNum_aa1
        of tpl_B1 [ALWAYS Just A]
        { GHC.Num.:DNum tpl_B2 [ALWAYS Just A]
                        tpl_B3 [ALWAYS Just A]
                        tpl_B4 [ALWAYS Just A]
                        tpl_B5 [ALWAYS Just A]
                        tpl_B6 [ALWAYS Just A]
                        tpl_B7 [ALWAYS Just A]
                        tpl_B8 [ALWAYS Just A]
                        tpl_B9 [ALWAYS Just A]
                        tpl_Ba [ALWAYS Just C(S)] ->
        tpl_Ba lvl_sbH
        } } in
    letrec {
      loop_sck [ALWAYS LoopBreaker Nothing] :: a_a9T -> [a_a9T] -> a_a9T
      [Arity 2
       Str: DmdType SS]
      loop_sck =
        \ (z_a6Y :: a_a9T) (ds_db7 :: [a_a9T]) ->
          case z_a6Y of z_X7h [ALWAYS Just L] { __DEFAULT ->
          case ds_db7 of wild_B1 [ALWAYS Just A] {
            [] -> z_a6Y;
            : x_a72 [ALWAYS Just L] xs_a74 [ALWAYS Just S] ->
              case $dNum_aa1
              of tpl_Xl [ALWAYS Just A]
              { GHC.Num.:DNum tpl_B2 [ALWAYS Just A]
                              tpl_B3 [ALWAYS Just A]
                              tpl_B4 [ALWAYS Just C(C(S))]
                              tpl_B5 [ALWAYS Just A]
                              tpl_B6 [ALWAYS Just A]
                              tpl_B7 [ALWAYS Just A]
                              tpl_B8 [ALWAYS Just A]
                              tpl_B9 [ALWAYS Just A]
                              tpl_Ba [ALWAYS Just A] ->
              loop_sck (tpl_B4 z_a6Y x_a72) xs_a74
              }
          }
          }; } in
    \ (xs_a76 :: [a_a9T]) -> loop_sck lit_scm xs_a76

Note that the Num dictionary is scrutinised in the loop even though sum' is actually strict in the dictionary (by virtue of being strict in lit_scm) and it would make sense to take it apart before entering the loop. LiberateCase does nail this but only if the loop is small enough and at the expense of code size.

Attachments (1)

T2439.hs (262 bytes) - added by morabbin 5 years ago.
Exhibits problems described in #2439

Download all attachments as: .zip

Change History (30)

comment:1 Changed 9 years ago by simonpj

difficulty: Unknown

Interesting example. I've thought about it a bit, and I don't see an obvious, general transformation that would help. The loop is not strict in the dictionary; it's just this particular call that is.

A long-standing thing I've wanted to try is to make all dictionary arguments strict, and see where that takes us. After all, evaluating a dictionary should always terminate. I'll leave this ticket open to help me remember to try that.

Simon

comment:2 Changed 9 years ago by igloo

Milestone: 6.12 branch
Owner: set to simonpj

comment:3 Changed 9 years ago by simonmar

Architecture: UnknownUnknown/Multiple

comment:4 Changed 9 years ago by simonmar

Operating System: UnknownUnknown/Multiple

comment:5 Changed 8 years ago by simonmar

Type of failure: Runtime performance bug

comment:6 Changed 8 years ago by rl

I'm not sure how this translates to the new inliner. This is code that is generated now:

Foo.sum' =
  \ (@ a_aj6) ($dNum_aje :: GHC.Num.Num a_aj6) ->
    let {
      lit_skO [Dmd=Just L] :: a_aj6
      [LclId, Str=DmdType]
      lit_skO = GHC.Num.fromInteger @ a_aj6 $dNum_aje Foo.sum'1 } in
    letrec {
      loop_skM [Occ=LoopBreaker] :: a_aj6 -> [a_aj6] -> a_aj6
      [LclId, Arity=2, Str=DmdType SS]
      loop_skM =
        \ (z_afp :: a_aj6) (ds_djs :: [a_aj6]) ->
          case z_afp of z1_XfG { __DEFAULT ->
          case ds_djs of _ {
            [] -> z1_XfG;
            : x_afr xs_afs ->
              loop_skM (GHC.Num.+ @ a_aj6 $dNum_aje z1_XfG x_afr) xs_afs
          }
          }; } in
    \ (xs_afu :: [a_aj6]) -> loop_skM lit_skO xs_afu

Instead of scrutinising the dictionary, we are now calling the method selector in every iteration. Is this even worse than before?

comment:7 Changed 8 years ago by simonpj

I think the underling issue hasn't changed. The only difference is that before we inlined the method selector and now we don't. (Deliberately -- little is gained by doing so.) So it's more or less the same with new and old inliner.

I think that strict dictionaries are still the answer here.

Simon

comment:8 Changed 8 years ago by igloo

Milestone: 6.12 branch6.12.3

comment:9 Changed 8 years ago by igloo

Milestone: 6.12.36.14.1
Priority: normallow

comment:10 Changed 7 years ago by simonpj

See #4191 for more discussion.

In particular

  • Can -fdicts-strict be a dynamic flag?

comment:11 Changed 7 years ago by igloo

Milestone: 7.0.17.0.2

comment:12 Changed 7 years ago by igloo

Milestone: 7.0.27.2.1

comment:13 Changed 6 years ago by igloo

Milestone: 7.2.17.4.1

comment:14 Changed 6 years ago by igloo

Milestone: 7.4.17.6.1
Priority: lowlowest

comment:15 Changed 5 years ago by igloo

Milestone: 7.6.17.6.2

Changed 5 years ago by morabbin

Attachment: T2439.hs added

Exhibits problems described in #2439

comment:16 Changed 5 years ago by morabbin

For the original example, -fdicts-strict seems to make no difference now (i.e., it seems the "right" code is now produced even without -fdicts-strict):

Foo.sum' :: forall a_aeH. GHC.Num.Num a_aeH => [a_aeH] -> a_aeH
[GblId, Arity=2, Caf=NoCafRefs]
Foo.sum' =
  \ (@ a_c) ($dNum_afi :: GHC.Num.Num a_c) (xs_afb :: [a_c]) ->
    let {
      f_af2 :: a_c -> a_c -> a_c
      [LclId]
      f_af2 = GHC.Num.+ @ a_c $dNum_afi } in
    case GHC.Num.fromInteger @ a_c $dNum_afi (__integer 0)
    of z_Xf9 { __DEFAULT ->
    letrec {
      loop_aff [Occ=LoopBreaker] :: a_c -> [a_c] -> a_c
      [LclId, Arity=2]
      loop_aff =
        \ (z1_af6 :: a_c) (ds_dfG :: [a_c]) ->
          case z1_af6 of z2_Xfg { __DEFAULT ->
          case ds_dfG of _ {
            [] -> z2_Xfg;
            : x_af8 xs1_af9 -> loop_aff (f_af2 z2_Xfg x_af8) xs1_af9
          }
          }; } in
    loop_aff z_Xf9 xs_afb
    }

comment:17 Changed 5 years ago by simonpj

The -fdicts-strict flag exists but it has had no love for years, and I doubt it works right.

One obstacle is that at the moment we implement a single-method type class with a newtype rather than a data type. We don't want to make that strict, lest we force the method rather than the dictionary.

I rather think we sould remove the special case, so that all dictionaries are data types; then we could be uniformly strict. But we should do some bemchmarking to check that nothing slows down.

Just awaiting someone with a bit of time and benchmarking bandwidth...

Simon

comment:18 Changed 5 years ago by carter

Cc: carter.schonwald@… added

@simon changing the single method type class encoding would break edwardk's reflection library http://hackage.haskell.org/package/reflection (which exploits that newtype encoding for single method type classes)

Is there a natural example where forcing that single method to whnf (if you mean that form of strictness) would be bad?

comment:19 Changed 5 years ago by simonpj

Can you explain eaxtly how making single-method classes be a data type would break edward's library?

And example where forcing a single method would be bad:

class C a where
  foo :: a

instance C Int where
  foo = undefined

bar :: C a => Int -> Int
bar x = x

main = print (bar (3 :: Int))

comment:20 Changed 3 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:21 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:22 Changed 3 years ago by thoughtpolice

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:23 Changed 2 years ago by ekmett

Cc: ekmett added

@simonpj:

Right now the reflection library (and a few other less savory libraries of mine) rely on the internals of a dictionary with one member and that member being the same thing so I can unsafeCoerce to build one.

An example outside of the reflection library where we need to be able to build a dictionary out of whole cloth:

data SNat (n :: Nat) where
  Snat :: KnownNat n => SNat n

Even if in a context we know (KnownNat n, KnownNat m), we still can't derive KnownNat (n + m) with the current Nat solver. Yet having an operator

(+) :: SNat n -> SNat m -> SNat (n + m) 

for adding such SNat's at the value level makes sense and things like that are important for many (if not most) applications of having a type level natural number type.

I wind up having to backfill these things in by constructing dictionaries by hand.

In theory, if I had to, I could start to rely on, say, it being the same as a one member data type instead if we decided to make that transition, so long as I have something available to me that has the right form that I could unsafeCoerce into a dictionary.

Regarding performance impact: it'd introduce an extra pointer-dereference everywhere for dictionaries like KnownNat or Reifies, which currently just carry an Integer or a value in newtype form, but now would have an extra reference involved. It is not terribly significant, assuming light usage, but it is not free.

comment:24 Changed 2 years ago by simonpj

You don't give an actual example, and my brain is too small to reconstruct the code you have in mind.

Still, I do agree that wrapping a single value in a data type carries an indirection cost. We could make GHC strict in non-single-method dictionaries; indeed that's the way it's supposed to work right now. But someone needs to actually try it out, benchmark etc.

In short, I'm in no hurry to make this change.

Meanwhile, it's a crime that you have to fake this stuff up. The more concrete your examples of what you can't do type-securely, the more likely someone is to fix them. At the moment I am clueless.

Simon

comment:25 Changed 2 years ago by ekmett

Here's a worked version of the SNat thing I was talking about:

{-# LANGUAGE RankNTypes, TypeOperators, GADTs, DataKinds #-}
module SNat where

import GHC.TypeLits
import Unsafe.Coerce

data SNat n where
  SNat :: KnownNat n => SNat n

snatVal :: SNat n -> Integer
snatVal n@SNat = natVal n

instance Show (SNat n) where
  showsPrec d n = showsPrec d (snatVal n)

newtype Magic n = Magic (KnownNat n => SNat n)

add :: SNat n -> SNat m -> SNat (n + m)
add n m = unsafeCoerce (Magic SNat) (snatVal n + snatVal m)

mul :: SNat n -> SNat m -> SNat (n * m)
mul n m = unsafeCoerce (Magic SNat) (snatVal n * snatVal m)

With that at the ghci prompt we can now look at SNat's

ghci> SNat :: SNat 32
32

and I can work around the limitations of the nat solver, by helping it out as needed at the value level:

ghci> add (SNat :: SNat 1) (SNat :: SNat 2)
3

Then if I need the fabricated KnownNat instance for the sum I can just open up my SNat constructor and bring it into scope, like the Show instance just did.

Here I had to make up my own SNat type as the one that is provided by GHC is actually not exported.

add and mul are evil and unsafe, but here they do precisely the right thing. Their results are correct, even if the methodology is suspect.

We get them by coercing Magic n into a function and passing it the value that is the representation of KnownNat.

The core that gets generated is just that of explicit dictionary passing, same as with reflection. With reflection I relied on the quantifier when reifying to ensure things were generative.

Here I rely on the fact that I'm working with singleton values, but the mechanism used to cheat the system is the same.

The current reflection code looks similar, just using a different Magic type and some quantifier and Proxy noise to permit use in more situation.

class Reifies s a | s -> a where
  reflect :: proxy s -> a

newtype Magic a r = Magic (forall (s :: *). Reifies s a => Proxy s -> r)

-- | Reify a value at the type level, to be recovered with 'reflect'.
reify :: forall a r. a -> (forall (s :: *). Reifies s a => Proxy s -> r) -> r
reify a k = unsafeCoerce (Magic k :: Magic a r) (const a) Proxy

In a world that was modified to make all dictionaries live inside a data type so they could be forced without consequence then I'd be changing reify to something more like:

data Dictionary a = Dictionary a

reify :: forall a r. a -> (forall (s :: *). Reifies s a => Proxy s -> r) -> r
reify a k = unsafeCoerce (Magic k :: Magic a r) (Dictionary (const a)) Proxy

assuming that the dictionary representation and the representation of a data type with exactly one field (like Dictionary) were the same.

I could patch around it if needed, but the proposal here seems odd to me.

comment:26 Changed 2 years ago by simonpj

I can't say I understand what is going on here, but in reify/reflect it seems that you want something akin to a local instance declaration. You want to write:

  reify (x :: a) (\ (p :: Proxy s) ->
                  ...In here we have (Reifies s a)...
                 )

And you want to supply the local instance of (Reifies s a) yourself.

Isn't this just what implicit parameters are for? They give you local instance declarations, in effect.

comment:27 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:28 Changed 23 months ago by thomie

Milestone: 8.0.1

comment:29 in reply to:  26 Changed 19 months ago by dfeuer

Replying to simonpj:

I can't say I understand what is going on here, but in reify/reflect it seems that you want something akin to a local instance declaration. You want to write:

  reify (x :: a) (\ (p :: Proxy s) ->
                  ...In here we have (Reifies s a)...
                 )

And you want to supply the local instance of (Reifies s a) yourself.

Isn't this just what implicit parameters are for? They give you local instance declarations, in effect.

The reflection package was largely motivated by Functional Pearl: Implicit Configurations -- or, Type Classes Reflect the Values of Types, by Oleg Kiselyov and Chung-chieh Shan, although their implementation is entirely different. Section 6.2 of the paper explains in depth why implicit parameters are insufficient.

Note: See TracTickets for help on using tickets.