Opened 7 years ago

Last modified 8 months ago

#4960 new task

Better inlining test in CoreUnfold

Reported by: simonpj Owned by:
Priority: low Milestone:
Component: Compiler Version: 7.0.1
Keywords: newcomer, Inlining Cc: bgamari, bredelings
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Consider this

f x = let 
        $j y = case x of { True -> e1; False -> e2 }
      in
      case x of 
        True -> ...$j y1...
        False -> ...$j y2...

If a function that scrutinises its argument is applied to a constructor, it becomes much keener to inline. But in this example the function scrutinises a free variable that is evaluated to a known constructor at the call site. At the moment this is ignored, and $j may well not be inlined in situations where it would be jolly good to do so.

This shows up in test perf/should_run/MethSharing where the join points created early only inline because exprIsDupable is just generous enough to do so. If you reduce CoreUtils.dupAppSize by 1, the join point doesn't inline any more. But it should.

The solution is fairly easy: compute discounts on free varaibles as well as arguments. I'm making a ticket because I don't want to execute on this today.

Change History (16)

comment:1 Changed 7 years ago by igloo

Milestone: 7.4.1

comment:2 Changed 6 years ago by simonpj

See #3781, #3755, which would find this optimisation useful.

comment:3 Changed 6 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:4 Changed 5 years ago by igloo

Milestone: 7.6.17.6.2

comment:5 Changed 3 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:6 Changed 3 years ago by thomie

difficulty: Unknown
Type: bugtask
Type of failure: None/UnknownRuntime performance bug

comment:7 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:8 Changed 3 years ago by bgamari

Cc: bgamari added

comment:9 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:10 Changed 23 months ago by thomie

Milestone: 8.0.1

comment:11 Changed 17 months ago by mpickering

Keywords: newcomer added

Seems well specified and unimplemented currently.

comment:12 Changed 17 months ago by mpickering

Keywords: Inlining added

comment:13 Changed 11 months ago by alexbiehl

I don't know if this is that well defined. I have tried to implement this. Please take a look at https://github.com/alexbiehl/ghc/commit/fd9608ea93fc2389907b82c3fe540805d986c28e. Is this what you were talking about @simonpj?

comment:14 Changed 11 months ago by alexbiehl

I have refined the above patch in https://github.com/alexbiehl/ghc/commit/e29f88b5d952f2f40f68e2bb49f051b6684d2686 and ran ./validate --testsuite-only --fast on it

It result in longer compile time (about 27-35%) for T1969 and T3294 I have no clue why. But both haddock.base and haddock.Cabal show ~30% less allocations and `T9203' even 57%. I think this is nice. I hope I am on the right track here.

Unexpected stat failures:
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/compiler/T1969.run         T1969 [stat not good enough] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/compiler/T5631.run         T5631 [stat too good] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/compiler/T3294.run         T3294 [stat not good enough] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/haddock/haddock.base.run   haddock.base [stat too good] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/haddock/haddock.Cabal.run  haddock.Cabal [stat too good] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/should_run/T5205.run       T5205 [stat too good] (normal)
   /var/folders/7n/mkhl20_55n95s7ytnm9b11lw0000gp/T/ghctest-jl0wiuid/test   spaces/./perf/should_run/T9203.run       T9203 [stat too good] (normal)

peak_megabytes_allocated value is too high:
    Expected    T1969(normal) peak_megabytes_allocated: 68 +/-20%
    Lower bound T1969(normal) peak_megabytes_allocated: 54 
    Upper bound T1969(normal) peak_megabytes_allocated: 82 
    Actual      T1969(normal) peak_megabytes_allocated: 87 
    Deviation   T1969(normal) peak_megabytes_allocated: 27.9 %
max_bytes_used value is too high:
    Expected    T1969(normal) max_bytes_used: 17285216 +/-15%
    Lower bound T1969(normal) max_bytes_used: 14692433 
    Upper bound T1969(normal) max_bytes_used: 19877999 
    Actual      T1969(normal) max_bytes_used: 23225104 
    Deviation   T1969(normal) max_bytes_used:     34.4 %
*** unexpected stat test failure for T1969(normal)

max_bytes_used value is too high:
    Expected    T3294(normal) max_bytes_used: 52992688 +/-20%
    Lower bound T3294(normal) max_bytes_used: 42394150 
    Upper bound T3294(normal) max_bytes_used: 63591226 
    Actual      T3294(normal) max_bytes_used: 64107552 
    Deviation   T3294(normal) max_bytes_used:     21.0 %
*** unexpected stat test failure for T3294(normal)

bytes allocated value is too low:
(If this is because you have improved GHC, please
update the test so that GHC doesn't regress again)
    Expected    T5631(normal) bytes allocated: 1077429456 +/-5%
    Lower bound T5631(normal) bytes allocated: 1023557983 
    Upper bound T5631(normal) bytes allocated: 1131300929 
    Actual      T5631(normal) bytes allocated: 1006121416 
    Deviation   T5631(normal) bytes allocated:       -6.6 %
*** unexpected stat test failure for T5631(normal)

bytes allocated value is too low:
(If this is because you have improved GHC, please
update the test so that GHC doesn't regress again)
    Expected    haddock.base(normal) bytes allocated: 32855223200 +/-5%
    Lower bound haddock.base(normal) bytes allocated: 31212462040 
    Upper bound haddock.base(normal) bytes allocated: 34497984360 
    Actual      haddock.base(normal) bytes allocated: 22651400376 
    Deviation   haddock.base(normal) bytes allocated:       -31.1 %
*** unexpected stat test failure for haddock.base(normal)

	bytes allocated value is too low:
	(If this is because you have improved GHC, please
	update the test so that GHC doesn't regress again)
	    Expected    haddock.Cabal(normal) bytes allocated: 25478853176 +/-5%
	    Lower bound haddock.Cabal(normal) bytes allocated: 24204910517 
	    Upper bound haddock.Cabal(normal) bytes allocated: 26752795835 
	    Actual      haddock.Cabal(normal) bytes allocated: 17772058640 
	    Deviation   haddock.Cabal(normal) bytes allocated:       -30.2 %

bytes allocated value is too low:
(If this is because you have improved GHC, please
update the test so that GHC doesn't regress again)
    Expected    T5205(normal) bytes allocated: 56208 +/-5%
    Lower bound T5205(normal) bytes allocated: 53397 
    Upper bound T5205(normal) bytes allocated: 59019 
    Actual      T5205(normal) bytes allocated: 53360 
    Deviation   T5205(normal) bytes allocated:  -5.1 %
*** unexpected stat test failure for T5205(normal)

bytes allocated value is too low:
(If this is because you have improved GHC, please
update the test so that GHC doesn't regress again)
    Expected    T9203(normal) bytes allocated:  95451192 +/-5%
    Lower bound T9203(normal) bytes allocated:  90678632 
    Upper bound T9203(normal) bytes allocated: 100223752 
    Actual      T9203(normal) bytes allocated:  40617752 
    Deviation   T9203(normal) bytes allocated:     -57.4 %
*** unexpected stat test failure for T9203(normal)

Last edited 11 months ago by alexbiehl (previous) (diff)

comment:15 Changed 11 months ago by mpickering

Perhaps you could push a branch so that gipeda can pick it up?

comment:16 Changed 8 months ago by bredelings

Cc: bredelings added
Note: See TracTickets for help on using tickets.