Opened 7 years ago

Closed 20 months ago

#5654 closed bug (fixed)

Profiling semantics bug

Reported by: simonmar Owned by:
Priority: low Milestone: 8.2.1
Component: Profiling Version: 7.2.1
Keywords: Cc: ezyang@…, scpmw
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: profiling/should_run/scc004
Blocked By: Blocking:
Related Tickets: #10007 Differential Rev(s):
Wiki Page:

Description

GHC doesn't quite implement the new cost-centre stack semantics correctly. e.g.

{-# NOINLINE f #-}
f :: Int -> Int
f = {-# SCC f #-} g

{-# NOINLINE g #-}
g :: Int -> Int
g x = {-# SCC g #-} x + 1

main = print (f 3)

What we want is

 CAF        Main                    106           0    0.0    0.7     0.0   21.7
  f         Main                    201           1    0.0    0.0     0.0    0.0
  main      Main                    200           1    0.0   20.9     0.0   21.0
   f        Main                    202           0    0.0    0.0     0.0    0.1
    g       Main                    203           1    0.0    0.1     0.0    0.1

but we get

 CAF        Main                    106           0    0.0    0.7     0.0   21.6
  f         Main                    201           1    0.0    0.0     0.0    0.0
  main      Main                    200           1    0.0   20.9     0.0   20.9
   g        Main                    202           1    0.0    0.0     0.0    0.0

The original flat CC profiler used to use IND_PERM indirections to handle this kind of thing, but we aren't doing that now.

Change History (24)

comment:1 Changed 7 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:2 Changed 6 years ago by igloo

Milestone: 7.6.17.6.2

comment:3 Changed 6 years ago by ezyang

Cc: ezyang@… added

comment:4 Changed 4 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:5 Changed 4 years ago by thomie

difficulty: Unknown
Status: newinfoneeded

The results with HEAD are both different from "what we want" and "what we get". What are we aiming for precisely?

ghc-7.9.20141125 -prof -fforce-recomp test.hs; ./test +RTS -p

  MAIN        MAIN                     43           0    0.0   22.2     0.0  100.0
   CAF        Main                     85           0    0.0    0.1     0.0    0.1
    g         Main                     87           1    0.0    0.0     0.0    0.0
    f         Main                     86           1    0.0    0.0     0.0    0.0

And with -O2:

  MAIN        MAIN                     43           0    0.0   22.2     0.0  100.0
   CAF        Main                     85           0    0.0    0.1     0.0    0.1
    g         Main                     87           1    0.0    0.0     0.0    0.0
    f         Main                     86           1    0.0    0.0     0.0    0.0

comment:6 Changed 4 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:7 Changed 4 years ago by simonmar

Cc: scpmw added
Status: infoneedednew

-fprof-auto is needed to get the main SCC too.

comment:8 Changed 3 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:9 Changed 3 years ago by bgamari

comment:10 Changed 3 years ago by nomeata

In Phab:D1618, which removes IND_PERM, simonmar wrote that the functionality removed there might be needed to fix this.

comment:11 Changed 3 years ago by thomie

Milestone: 8.0.1

comment:12 Changed 3 years ago by Simon Marlow <marlowsd@…>

In 85daac5/ghc:

Fix cost-centre-stack bug when creating new PAP (#5654)

See comment in `AutoApply.h`.  This partly fixes #5654.  New test
added, and renamed the old test to match the ticket number.

comment:13 Changed 3 years ago by thomie

Commit that nomeata mentioned in comment:10 (Phab:D1821 actually):

commit f42db1574935b088cfc13cca7c935990002651dc

Author: Joachim Breitner <mail@joachim-breitner.de>
Date:   Sat Jan 23 13:12:10 2016 +0100

    Remove unused IND_PERM
    
    it seems that this closure type has not been in use since 5d52d9, so all
    this is dead and untested code. This removes it. Some of the code might
    be useful for a counting indirection as described in #10613, so when
    implementing that, have a look at what this commit removes.
    
    Test Plan: validate on harbormaster
    
    Reviewers: austin, bgamari, simonmar
    
    Reviewed By: simonmar
    
    Subscribers: thomie
    
    Differential Revision: https://phabricator.haskell.org/D1821

comment:14 Changed 3 years ago by thomie

Simon: in 85daac593c498f581d46f44982ee5dcf1001f611, you defined two new tests (T5654b-O0 and T5654b-O0), but didn't add .hs files for them.

comment:15 Changed 3 years ago by simonmar

Fixed, sorry about that.

comment:16 Changed 2 years ago by Ben Gamari <ben@…>

In 394231b3/ghc:

Fix cost-centre-stacks bug (#5654)

This fixes some cases of wrong stacks being generated by the profiler.
For background and details on the fix see
`Note [Evaluating functions with profiling]` in `rts/Apply.cmm`.

This does have an impact on allocations for some programs when
profiling.  nofib results:

```
   k-nucleotide          +0.0%     +8.8%    +11.0%    +11.0%      0.0%
         puzzle          +0.0%    +12.5%     0.244     0.246      0.0%
      typecheck           0.0%     +8.7%    +16.1%    +16.2%      0.0%
------------------------------------------------------------------------
--------
            Min          -0.0%     -0.0%    -34.4%    -35.5%    -25.0%
            Max          +0.0%    +12.5%    +48.9%    +49.4%    +10.6%
 Geometric Mean          +0.0%     +0.6%     +2.0%     +1.8%     -0.3%

```

But runtimes don't seem to be affected much, and the examples I looked
at were completely legitimate.  For example, in puzzle we have this:

```
position :: ItemType -> StateType ->  BankType
position Bono = bonoPos
position Edge = edgePos
position Larry = larryPos
position Adam = adamPos
```

where the identifiers on the rhs are all record selectors.  Previously
the profiler gave a stack that looked like

```
  position
  bonoPos
  ...
```

i.e. `bonoPos` was at the same level of the call stack as `position`,
but now it looks like

```
  position
   bonoPos
   ...
```

I used the normaliser from the testsuite to diff the profiling output
from other nofib programs and they all looked better.

Test Plan:
* the broken test passes
* validate
* compiled and ran all of nofib, measured perf, diff'd several .prof
files

Reviewers: niteria, erikd, austin, scpmw, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2804

GHC Trac Issues: #5654, #10007

comment:17 Changed 2 years ago by bgamari

Milestone: 8.2.1
Resolution: fixed
Status: newclosed

comment:18 Changed 2 years ago by simonmar

Owner: simonmar deleted
Resolution: fixed
Status: closednew

I uncovered a bug in the patch, triggered by using GHCi with profiling. Fix incoming.

comment:19 Changed 2 years ago by Ben Gamari <ben@…>

In 2a02040b/ghc:

Fix bug in previous fix for #5654

I forgot to account for BCOs, which have a different layout from
functions.  This caused crashes when using profiling with GHCi (via
-fexternal-interpreter -prof), which unfortunately is not tested at all
by validate, even when profiling is enabled.  I'm going to add some
testing that would have caught this in a separate patch.

Test Plan:
```
cd nofib/spectral/puzzle && make NoFibWithGHCi=YES
EXTRA_RUNTEST_OPTS='-fexternal-interpreter -prof'
```
New testsuite tests coming in a separate diff.

Reviewers: niteria, austin, erikd, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2868

GHC Trac Issues: #5654

comment:20 Changed 2 years ago by Ben Gamari <ben@…>

In 90cfa849/ghc:

Run some tests with -fexternal-interpreter -prof

We don't have any other tests for this, except one Template Haskell
test.  This would have caught the bug I just fixed in D2868, at least
when validating with profiling on.

Test Plan: Ran tests

Reviewers: niteria, austin, erikd, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2869

GHC Trac Issues: #5654

comment:21 Changed 2 years ago by bgamari

Resolution: fixed
Status: newclosed

comment:22 Changed 23 months ago by Simon Marlow <marlowsd@…>

In 3a18baf/ghc:

More fixes for #5654

* In stg_ap_0_fast, if we're evaluating a thunk, the thunk might
  evaluate to a function in which case we may have to adjust its CCS.

* The interpreter has its own implementation of stg_ap_0_fast, so we
  have to do the same shenanigans with creating empty PAPs and copying
  PAPs there.

* GHCi creates Cost Centres as children of CCS_MAIN, which enterFunCCS()
  wrongly assumed to imply that they were CAFs.  Now we use the is_caf
  flag for this, which we have to correctly initialise when we create a
  Cost Centre in GHCi.

comment:23 Changed 20 months ago by bgamari

Status: closedmerge

comment:24 Changed 20 months ago by bgamari

Status: mergeclosed

Reverted reverting commits on ghc-8.2 and merged fix to #13433.

Note: See TracTickets for help on using tickets.