Opened 9 years ago

Last modified 21 months ago

#3034 new bug

divInt# floated into a position which leads to low arity

Reported by: batterseapower Owned by:
Priority: lowest Milestone:
Component: Compiler Version: 6.10.1
Keywords: Cc: twhitehead@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Tyson Whitehead saw this in the Core output of one of his programs compiled using the MTL StateT:

$wdigit_s1GR [ALWAYS LoopBreaker Nothing] :: GHC.Prim.Int#
                                           -> GHC.Types.Int
                                           -> Control.Monad.State.Strict.StateT
                                                GHC.Types.Int
                                                (Control.Monad.Error.ErrorT
                                                   GHC.Base.String
                                                   Control.Monad.Identity.Identity)
                                                (GHC.Types.Int, GHC.Types.Int)
[Arity 1
Str: DmdType L]
$wdigit_s1GR =
\ (ww_X1H6 :: GHC.Prim.Int#) ->
  let {
    lvl_s1H5 [ALWAYS Just D(T)] :: GHC.Types.Int
    [Str: DmdType]
    lvl_s1H5 =
      case GHC.Prim.-# 2147483647 ww_X1H6 of wild2_a1xs [ALWAYS Just L] {
        __DEFAULT ->
          case GHC.Base.divInt# wild2_a1xs 10
          of wild21_a1xt [ALWAYS Just L] { __DEFAULT ->
          GHC.Types.I# wild21_a1xt
          };
        (-2147483648) -> lvl_s1Ha
      } } in
  (\ (eta_X1nK :: GHC.Types.Int) (eta_s1Dl :: GHC.Types.Int) ->
     case eta_s1Dl
     of y_XrP [ALWAYS Just A] { GHC.Types.I# ipv_s19d [ALWAYS Just L] ->
     case GHC.Prim.<=# ipv_s19d 214748363
     of wild_a19h [ALWAYS Dead Just A] {
       GHC.Bool.False ->
         case lvl_s1H5
         of wild1_X1zB [ALWAYS Just A]
         { GHC.Types.I# y_X1zG [ALWAYS Just L] ->
         case GHC.Prim.<=# ipv_s19d y_X1zG
         of wild_X1z [ALWAYS Dead Just A] {
           GHC.Bool.False ->
             a_s1Hk
             `cast` (right

This REALLY SHOULD have arity 3 because that allows:

  • More worker/wrapper
  • Less sharing of trivial partial applications elsewhere in his program

Here is my reply to him, explaining why it all happens:

Yes - GHC wants to share the work of (maxBound-x)`div`10 between
several partial applications of "digit". This is usually a good idea,
but in this case it sucks because it's resulted in a massively
increased arity. IMHO GHC should fix this by:
* Marking divInt# INLINE in the base library. This would result in
your code would just containing uses of quotInt#
* Making some operations cheap even if they may fail
(PrimOp.primpOpIsCheap should change). Though this might mean that we
turn non-terminating programs into terminating ones (such operations
get pushed inside lambdas) but this is consistent with our treatment
of lambdas generally.

Actually, your divInt# call wouldn't even usually be floated out to
between two lambdas, but at the time FloatOut runs there is something
in between the \x lambda and the lambdas from the state monad - the
monadic bind operator! So FloatOut feels free to move the computation
for "x" up even though that >>= will go away as soon as we run the
simplifier. What a disaster!

So one of FloatOut and primOpIsCheap probably needs to be fixed.

I've attached a program that can reproduce this issue.

Attachments (1)

Parse.hs (2.2 KB) - added by batterseapower 9 years ago.

Download all attachments as: .zip

Change History (23)

Changed 9 years ago by batterseapower

Attachment: Parse.hs added

comment:1 Changed 9 years ago by batterseapower

Cc: twhitehead@… added

comment:2 Changed 9 years ago by twhitehead

For some reason, using quot instead of div produces the desired code. This is despite quotInt# not being cheap according to ghci 6.10.1

Prelude> PrimOp.primOpIsCheap PrimOp.IntQuotOp False

Maybe it has something to do with there being a primitive IntQuotOp but not a corresponding IntDivOp (tidy core gives GHC.Prim.quotInt# versus GHC.Base.divInt#)

comment:3 Changed 9 years ago by twhitehead

I should avoiding the division altogether through quotRem generates really bad code as well

where digit :: Int -> Int -> ParseInt Int Int

digit !x = do !y <- lift get

( if y < ym
y==ym && x<=xm

then lift (put $ y*10+x) >> s2 else throwError "integer overflow" )

where (ym,xm) = maxBoundquotRem10

Putting separate "ym = maxBoundquot10" and "xm = maxBoundrem10" bindings gives good code.

Examining the assembler for a seperate routine like

demo :: Int -> Int -> Int demo x y = q+r

where q = quot x y

r = rem x y

shows this isn't a really great option in general because the GHC does it through two identical idivs on the x86_64. I am guessing the underlying issue is much the same due to no IntQuotRem prim op.

comment:4 Changed 9 years ago by igloo

difficulty: Unknown
Milestone: 6.12 branch

comment:5 Changed 8 years ago by simonmar

Type of failure: Runtime performance bug

comment:6 Changed 7 years ago by igloo

Milestone: 6.12 branch6.12.3

comment:7 Changed 7 years ago by igloo

Milestone: 6.12.36.14.1
Priority: normallow

comment:8 Changed 7 years ago by igloo

Milestone: 7.0.17.0.2

comment:9 Changed 7 years ago by igloo

Milestone: 7.0.27.2.1

comment:10 Changed 6 years ago by igloo

Milestone: 7.2.17.4.1

comment:11 Changed 6 years ago by igloo

Milestone: 7.4.17.6.1
Priority: lowlowest

comment:12 Changed 5 years ago by igloo

Milestone: 7.6.17.6.2

comment:13 Changed 4 years ago by simonpj

Might be worth trying making primOpIsCheap be true more often. Perhaps always true. Perhaps only if not out-of-line. Or perhaps a new property.

But we think it can be true of primops that can fail (like division), especially in lazy contexts like this one.

Simon

comment:14 Changed 4 years ago by nomeata

Status: newpatch

Changing primOpIsCheap to always True validates, and has no allocation change in nofib. I guess I should be measuring runtime numbers, but for that I’ll need the machine to be unloaded, maybe later today....

Alright, numbers are in:

            Min          -0.0%     -0.0%     -7.1%     -6.5%     -1.8%
            Max          +0.0%     +0.0%     +2.1%     +1.5%    +19.8%
 Geometric Mean          -0.0%     -0.0%     -0.6%     -0.5%     +0.2%

so it seems to be worth it.

Not pushing quite yet: Assuming we make the cheapness configurable in primops.txt.pp, what primops should not be cheap? To me all of them look cheap (e.g. arithmetic) or are side-effecting (like copyArray#, in which case the primOpIsCheap information hopefully has no effect. So it seems to me that we do not need to a flag in primops.txt.pp just yet; maybe something will show up later.

(Marking as for review without a patch because the patch is trivial; it is this question that need to be reviewed.)

comment:15 Changed 4 years ago by simonmar

For an example where we don't want to duplicate primops, see #5623

comment:16 Changed 4 years ago by simonpj

And we don't have a reproducible test case. So let's just park this.

Simon

comment:17 Changed 3 years ago by thoughtpolice

Milestone: 7.6.27.10.1
Status: patchinfoneeded

Moving out of patch (and to 7.10.1 anyway).

comment:18 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:19 Changed 3 years ago by thoughtpolice

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:20 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:21 Changed 2 years ago by thomie

Status: infoneedednew

comment:22 Changed 21 months ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.