A program which benefits from a late specialisation pass


Interesting! Do you have any insight into why it benefits from late specialisation? In general that's unusual.

The code in question first uses a type class to generate an overloaded function. The overloaded function is not immediately apparent: it is defined in terms of combinators which must be inlined first. Only after that do we get calls to fmap next to a dictionary, which can then be specialised.
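As a rough illustration of the shape described above, here is a much simplified sketch. It is not the generic code from the report: the class `HasInts`, its method `ints`, and `incrementAll` are hypothetical names, and the instances are written by hand rather than derived through generic combinators. The point is only that the method is overloaded in an `Applicative`, so its `fmap` and `<*>` calls stay abstract until a use site supplies a concrete dictionary.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical class: the method is a traversal, so its result type is
-- constrained by an Applicative that the caller chooses.
class HasInts s where
  ints :: Applicative f => (Int -> f Int) -> s -> f s

instance HasInts Int where
  ints f = f

instance HasInts a => HasInts [a] where
  ints f = traverse (ints f)

instance (HasInts a, HasInts b) => HasInts (a, b) where
  -- These (<$>) and (<*>) calls go through the Applicative dictionary
  -- unless the method is specialised to a known 'f'.
  ints f (a, b) = (,) <$> ints f a <*> ints f b

-- Use site where both the structure type and the Applicative (Identity)
-- are known, so in principle everything here could be specialised.
incrementAll :: [(Int, [Int])] -> [(Int, [Int])]
incrementAll = runIdentity . ints (Identity . (+ 1))
```

Loading this in GHCi and evaluating `incrementAll [(1, [2, 3])]` exercises all three instances.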

Diffing the Core output immediately shows where the difference is. In the bad version there are lots of calls to fmap which are not eliminated, because the function that contains them is not specialised.

1486,1497c1514,1560
< -- RHS size: {terms: 13, types: 12, coercions: 13, joins: 0/0}
< $s$fGHasTypeskaK1_$cgtypes_$s$dHastypes'
< $s$fGHasTypeskaK1_$cgtypes_$s$dHastypes'
<   = \ eta_B2 eta1_B1 ->
<       case eta1_B1 of {
<         [] -> [] `cast` <Co:4>;
<         : g1_ab8Q g2_ab8R ->
<           (: ((eta_B2 g1_ab8Q) `cast` <Co:2>)
<              (($s$fGHasTypeskaK1_$cgtypes_$s$dHastypes' eta_B2 g2_ab8R)
<               `cast` <Co:3>))
<           `cast` <Co:4>
<       }
---
> -- RHS size: {terms: 63, types: 791, coercions: 308, joins: 0/0}
> $s$fGHasTypeskaK1_$cgtypes1
> $s$fGHasTypeskaK1_$cgtypes1
>   = \ @ f_a5xv $dApplicative_a5xx eta_B2 eta1_B1 ->
>       fmap
>         ($p1Applicative $dApplicative_a5xx)
>         $fGeneric[]_$cto
>         ((fmap
>             ($p1Applicative $dApplicative_a5xx)
>             ($s$fGHasTypeskaK1_$cgtypes8 `cast` <Co:121>)
>             (case eta1_B1 of {
>                [] ->
>                  fmap
>                    ($p1Applicative $dApplicative_a5xx)
>                    L1
>                    (fmap
>                       ($p1Applicative $dApplicative_a5xx)
>                       ($s$fGHasTypeskaK1_$cgtypes7 `cast` <Co:22>)
>                       (pure $dApplicative_a5xx U1));
>                : g1_abai g2_abaj ->
>                  fmap
>                    ($p1Applicative $dApplicative_a5xx)
>                    R1
>                    (fmap
>                       ($p1Applicative $dApplicative_a5xx)
>                       ($s$fGHasTypeskaK1_$cgtypes6 `cast` <Co:78>)
>                       (<*>
>                          $dApplicative_a5xx
>                          (fmap
>                             ($p1Applicative $dApplicative_a5xx)
>                             :*:
>                             (fmap
>                                ($p1Applicative $dApplicative_a5xx)
>                                ($s$fGHasTypeskaK1_$cgtypes5 `cast` <Co:28>)
>                                (fmap
>                                   ($p1Applicative $dApplicative_a5xx)
>                                   ($s$fGHasTypeskaK1_$cgtypes4 `cast` <Co:11>)
>                                   (eta_B2 g1_abai))))
>                          (fmap
>                             ($p1Applicative $dApplicative_a5xx)
>                             ($s$fGHasTypeskaK1_$cgtypes3 `cast` <Co:28>)
>                             (fmap
>                                ($p1Applicative $dApplicative_a5xx)
>                                ($s$fGHasTypeskaK1_$cgtypes2 `cast` <Co:13>)
>                                ($s$fGHasTypeskaK1_$cgtypes1 $dApplicative_a5xx eta_B2 g2_abaj)))))
>              }))
>          `cast` <Co:7>)

The odd thing is that this function still exists:

> $s$fGHasTypeskaK1_$cgtypes1
>   = \ @ f_a5xv $dApplicative_a5xx eta_B2 eta1_B1 ->

It has a dictionary argument, so it would usually have been specialised earlier. Looking at it, it could originally have been a function of type

foo :: forall a. C a => forall b. D b => blah

Now, I think the specialiser might specialise only one "layer" of a function like that at a time. And that might be fixable, if that's the problem.
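To make the "layers" concrete, here is a minimal sketch of a function whose type has that nested shape. The names `showBoth` and `example` are hypothetical, and `Show` merely stands in for the classes `C` and `D`. Whether the specialiser really handles only one such layer at a time is the conjecture above, not something this sketch demonstrates.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical function with two layers of dictionary arguments, mirroring
--   foo :: forall a. C a => forall b. D b => blah
-- In Core this takes: type a, Show a dictionary, type b, Show b dictionary,
-- and only then the value arguments.
showBoth :: forall a. Show a => forall b. Show b => a -> b -> String
showBoth x y = show x ++ " " ++ show y

-- A call that fixes both layers at once: a ~ Int and b ~ Bool.
example :: String
example = showBoth (1 :: Int) True
```

Class method selectors with their own constraints have the same shape: the class dictionary comes first, followed by the method's own type variables and dictionaries.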

I don't know if this is related. I asked Johan Tibell the other day why unordered-containers marks almost everything INLINE instead of INLINABLE. He replied that when an INLINE function calls an INLINABLE one, we end up calling to specialize. He also indicated that he'd opened a ticket about this long ago; I don't know which one.
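For readers unfamiliar with the two pragmas, the combination being discussed looks roughly like the sketch below. The functions `insertSorted` and `sortByInsertion` are hypothetical and are not taken from unordered-containers; the sketch only shows an INLINE function calling an INLINABLE one and makes no claim about how GHC actually treats it.

```haskell
-- Hypothetical helper. INLINABLE keeps the unfolding in the interface file
-- so that other modules can specialise it to a concrete Ord dictionary.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted x [] = [x]
insertSorted x (y : ys)
  | x <= y = x : y : ys
  | otherwise = y : insertSorted x ys
{-# INLINABLE insertSorted #-}

-- Hypothetical wrapper marked INLINE that calls the INLINABLE helper.
sortByInsertion :: Ord a => [a] -> [a]
sortByInsertion = foldr insertSorted []
{-# INLINE sortByInsertion #-}
```

Comparing the Core for a monomorphic call such as `sortByInsertion [3, 1, 2 :: Int]` with and without the pragmas is one way to check whether the helper gets specialised.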

Trac metadata

Trac field	Value
CC	- → dfeuer


mentioned in commit afad5561

We think this is the patch

commit afad5561d88f04744c398ef0640d846db6262aa0
Author: Matthew Pickering <matthewtpickering@gmail.com>
Date:   Mon Mar 19 13:29:14 2018 -0400

    Add -flate-specialise which runs a later specialisation pass
    
    Runs another specialisation pass towards the end of the optimisation
    pipeline. This can catch specialisation opportunities which arose from
    the previous specialisation pass or other inlining.
    
    You might want to use this if you are you have a type class method
    which returns a constrained type. For example, a type class where one
    of the methods implements a traversal.
    
    It is not enabled by default or any optimisation level. Only by
    manually enabling the flag `-flate-specialise`.
    
    Reviewers: bgamari
    
    Reviewed By: bgamari
    
    Subscribers: rwbarton, thomie, carter
    
    Differential Revision: https://phabricator.haskell.org/D4457
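A minimal sketch of how the flag might be switched on for a single module, assuming a GHC version that includes the commit above. The class `Walk`, the `Tree` type, and `leaves` are hypothetical; they only illustrate the "type class method which returns a constrained type" case that the commit message mentions. The flag is unlikely to matter unless optimisation is also enabled (for example with -O2 on the command line).

```haskell
{-# OPTIONS_GHC -flate-specialise #-}

import Data.Functor.Const (Const (..))

data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Hypothetical class whose method implements a traversal: the result type
-- is constrained by an Applicative chosen at the use site.
class Walk t where
  walk :: Applicative f => (a -> f b) -> t a -> f (t b)

instance Walk Tree where
  walk _ Leaf = pure Leaf
  walk f (Node l x r) = Node <$> walk f l <*> f x <*> walk f r

-- Use site with a known Applicative (Const [a]): collects the elements
-- in order, left subtree first.
leaves :: Tree a -> [a]
leaves = getConst . walk (\x -> Const [x])
```

Dumping the simplifier output (-ddump-simpl) with and without the flag is the way to see whether it changes anything for a given program.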

Using 8.10 the program given as motivation does not seem to benefit from -flate-specialise.

There has been a refactor of the specialiser somewhat recently that might have affected this.

BTW, the description of this option at

https://downloads.haskell.org/ghc/latest/docs/users_guide/using-optimisation.html#ghc-flag--flate-specialise

contains a typo: "if you are you have".

Trac field	Value
Version	8.2.2
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
