Opened 6 years ago

Last modified 10 months ago

#5775 new bug

Inconsistency in demand analysis

Reported by: rl Owned by:
Priority: high Milestone: 8.4.1
Component: Compiler Version: 7.5
Keywords: Cc: pho@…, v.dijk.bas@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

A small program:

{-# LANGUAGE MagicHash, UnboxedTuples #-}
module U where
import GHC.Prim
import GHC.Types

idx :: Addr# -> Int -> Int
{-# INLINE idx #-}
idx a (I# i) = case readIntOffAddr# a i realWorld# of (# _, y #) -> I# y

f :: Int -> Int -> Int
{-# INLINE f #-}
f x y = y + x

foo :: Addr# -> Int -> Int
foo a n = n `seq` loop (idx a 0) 1
    loop x i = case i >= n of
      False -> loop (f x (idx a i)) (i+1)
      True  -> x

GHC infers the demand LU(L) for loop, only unboxes the second argument, ultimately generates a loop of type Int -> Int# -> Int and always allocates a thunk for the first argument:

      $wloop_si9 [Occ=LoopBreaker] :: Int -> Int# -> Int
      [LclId, Arity=2, Str=DmdType LL]
      $wloop_si9 =
        \ (w1_shU :: Int) (ww1_shX :: Int#) ->
          case >=# ww1_shX ww_si5 of _ {
            False ->
                (case readIntOffAddr# @ RealWorld w_si2 ww1_shX realWorld#
                 of _ { (# _, y_a9S #) ->
                 case w1_shU of _ { I# y1_ahb -> I# (+# y_a9S y1_ahb) }
                (+# ww1_shX 1);
            True -> w1_shU
          }; }

But if I change the pragma on f from INLINE to NOINLINE, loop gets the demand U(L)U(L)m and GHC manages to unbox both arguments as well as the result, producing a nice tight loop:

      $wloop_sii [Occ=LoopBreaker] :: Int# -> Int# -> Int#
      [LclId, Arity=2, Str=DmdType LL]
      $wloop_sii =
        \ (ww1_shW :: Int#) (ww2_si0 :: Int#) ->
          case >=# ww2_si0 ww_sib of _ {
            False ->
              case readIntOffAddr# @ RealWorld w_si8 ww2_si0 realWorld#
              of _ { (# _, y1_Xac #) ->
              case f (I# ww1_shW) (I# y1_Xac) of _ { I# ww3_Xin ->
              $wloop_sii ww3_Xin (+# ww2_si0 1)
            True -> ww1_shW
          }; }

It would be nice if this happened in both cases.

Change History (18)

comment:1 Changed 6 years ago by rl

Type of failure: None/UnknownRuntime performance bug

comment:2 Changed 6 years ago by igloo

difficulty: Unknown
Milestone: 7.6.1

comment:3 Changed 6 years ago by igloo


comment:4 Changed 6 years ago by simonpj

OK I understand what is happening here. Consider this:

f x = do { writeMutVar v 3
         ; if x then ... else .. }
foo xs = catch (f (head xs))
               (\_ -> readMutVar v)

Question: can I use call-value for f? Is f strict in x?

Well, when you call f, it'll certainly evalute x (I'm going to ignore the state-token bit here; it's not the point.) But

  • if you use call-by-value, the the call foo [] will throw an exception before executing the writeMutVar.
  • if you use cally-by-need, the call foo [] will throw an exception only after executing the writeMutVar.

The difference is observable.

It's even more stark if in instead of writeMutVar you have throwIO, so that execution should not proceed beyond the throwIO.

Tickets #148 and #1592 concerned exactly this point. The fix in #148 (nine years ago!) added a grotesque HACK in the demand analyser, commented thus:

	-- There's a hack here for I/O operations.  Consider
	-- 	case foo x s of { (# s, r #) -> y }
	-- Is this strict in 'y'.  Normally yes, but what if 'foo' is an I/O
	-- operation that simply terminates the program (not in an erroneous way)?
	-- In that case we should not evaluate y before the call to 'foo'.
	-- Hackish solution: spot the IO-like situation and add a virtual branch,
	-- as if we had
	-- 	case foo x s of 
	--	   (# s, r #) -> y 
	--	   other      -> return ()
	-- So the 'y' isn't necessarily going to be evaluated
	-- A more complete example (Trac #148, #1592) where this shows up is:
	--	do { let len = <expensive> ;
	--	   ; when (...) (exitWith ExitSuccess)
	--	   ; print len }

Now, Roman has discovered something else.

  • Not only is the hack unsavory,
  • not only does it make loop lazy when it should be strict,
  • but in addition it is incredibly fragile.

By not-inlining f the hack doesn't fire, and loop gets stricter! This is simply unacceptable.

I don't really know what the right thing to do is. I'm very inclined simply to remove the hack and see whether anyone complains. But I'm open to suggestions.

comment:5 Changed 6 years ago by rl

Hmm, isn't it only fragile in the presence of unsafePerformIO and friends (which is what idx in my example does)? If so, does it really matter? If I use unsafePerformIO, I'm on my own!

This whole example is really an artefact of Storable living in IO rather than in ST. It seems that is causing even more problems than I realised.

comment:6 Changed 6 years ago by simonpj

Maybe fragility only happens with unsafePerformIO. I'm not certain.

comment:7 Changed 6 years ago by igloo

Should bindIO use lazy, something like:

bindIO (IO m) k = IO $ \ s -> case m s of (# new_s, a #) -> unIO (lazy (k a)) new_s

or would that be too pessimistic?

comment:8 Changed 6 years ago by simonmar

If the fragility of the hack is only exposed by unsafePerformIO, then I'm in favour of keeping the hack. I think it's important that evaluation not commute with side effects in IO, otherwise I'm afraid we'll start seeing lots of strange effects (Simon's example of exceptions floating outside of catch is a great example, and I'm sure there will be lots more).

I think the Eval monad might be relying on this hack too, but I'll have to check.

comment:9 Changed 6 years ago by PHO

Cc: pho@… added

comment:10 Changed 6 years ago by basvandijk

Cc: v.dijk.bas@… added

comment:11 Changed 5 years ago by igloo


comment:12 Changed 5 years ago by igloo


comment:13 Changed 3 years ago by thoughtpolice


Moving to 7.10.1.

comment:14 Changed 3 years ago by thoughtpolice


Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:15 Changed 2 years ago by thoughtpolice


Milestone renamed

comment:16 Changed 22 months ago by bgamari

Description: modified (diff)
Owner: set to bgamari
Priority: normalhigh

For the record this is still reproducible with GHC 7.10.3.

It sounds like this ought to be revisited (if for no other reason than to ensure that the fragility really is only exposed with unsafePerformIO).

comment:17 Changed 10 months ago by bgamari

Owner: bgamari deleted

I will not have time to look further into this.

comment:18 Changed 10 months ago by bgamari


Bumping off to 8.4 due to lack of owner.

Note: See TracTickets for help on using tickets.