Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#3027 closed bug (fixed)

Specialisation rules fail because dictionary projections do not match

Reported by: malcolm.wallace@…
Owned by:
Priority: normal
Milestone:
Component: Compiler
Version: 6.8.2
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:


Here is an apparent bug in GHC's specialisation rules. The rewrite rule generated by a SPECIALISE pragma seems to pattern-match on the exact dictionary arguments as well as on the types, but the compiler has not necessarily resolved those dictionaries fully by the time the rule is supposed to fire.

First, the source code we want to specialise:

    {-# SPECIALISE hedgehog :: Float -> Vector3 Float
                                     -> [Cell_8 (Coord3 Float)]
                                     -> [Cell_8 (Vector3 Float)]
                                     -> [(Coord3 Float, Coord3 Float)] #-}
    hedgehog  :: ( Fractional a, Cell cell vert, Eq vert
                 , Geom coord, Geom vector, Embed vector coord ) =>
                 a -> vector a
                   -> [cell (coord a)]
                   -> [cell (vector a)]
                   -> [(coord a, coord a)]

The core + interface generated for this module contains the rule:

 "SPEC Hedgehog.hedgehog" ALWAYS forall
  Hedgehog.hedgehog @ GHC.Float.Float
                    @ RectGrid.Cell_8
                    @ RectGrid.MyVertex
                    @ Geometries.Coord3
                    @ Geometries.Vector3

But in a different module (Viewer.hs), here is what the usage site looks like just before the specialisation rules are supposed to fire:

hedgehog_aWr =
    @ GHC.Float.Float
    @ RectGrid.Cell_8
    @ RectGrid.MyVertex
    @ Geometries.Coord3
    @ Geometries.Vector3
    (Dataset.$p2Embed @ Geometries.Vector3 @ Geometries.Coord3 Geometries.$f1)
    (Dataset.$p1Embed @ Geometries.Vector3 @ Geometries.Coord3 Geometries.$f1)

Notice how there are a couple of dictionary projection functions still sitting there, so although some of the dictionaries match, not all do, and the rule does not fire. However, later the worker-wrapper transformation is able to resolve those outstanding dictionaries, giving eventually:

hedgehog_r2at =
    @ GHC.Float.Float
    @ RectGrid.Cell_8
    @ RectGrid.MyVertex
    @ Geometries.Coord3
    @ Geometries.Vector3

So I'm left calling the worker for the polymorphic version of the function, rather than the specialised monomorphic code I wanted. Given how many dictionaries are involved, and that this is the inner loop of the program, I'm hoping there is a big performance win waiting for me, if only I can get that specialised code to run!
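
The mismatch described above can be illustrated with a deliberately tiny model. This is only a sketch under stated assumptions, not GHC's rule matcher: the `Term` type, `ruleMatches`, `simplify`, and the dictionary names are all hypothetical, and the instance dictionary is modelled as a record whose superclass fields are directly visible. The point it tries to show is that a rule comparing dictionary arguments syntactically fails while a superclass projection is still unevaluated, and succeeds once the projection has been reduced.

```haskell
module Main where

-- A toy term language with just enough structure to model dictionary arguments.
data Term
  = Dict String     -- a known top-level dictionary (name is hypothetical)
  | Record [Term]   -- an instance dictionary whose superclass fields are visible
  | Proj Int Term   -- superclass selection, in the spirit of $p1Embed / $p2Embed
  deriving (Eq, Show)

-- In this model a rule left-hand side is the exact list of dictionary
-- arguments, and matching is plain syntactic equality.
ruleMatches :: [Term] -> [Term] -> Bool
ruleMatches lhs args = lhs == args

-- Reduce a projection when the dictionary underneath is a visible record;
-- otherwise leave the projection in place.
simplify :: Term -> Term
simplify (Proj i d) =
  case simplify d of
    Record fields | i >= 1 && i <= length fields -> fields !! (i - 1)
    d'                                           -> Proj i d'
simplify (Record fields) = Record (map simplify fields)
simplify t = t

-- Hypothetical Embed dictionary holding its two superclass dictionaries.
embedDict :: Term
embedDict = Record [Dict "geomVector3", Dict "geomCoord3"]

-- What the specialisation rule expects to see.
ruleLhs :: [Term]
ruleLhs = [Dict "geomCoord3", Dict "geomVector3"]

-- What the call site contains before the projections are resolved.
callArgs :: [Term]
callArgs = [Proj 2 embedDict, Proj 1 embedDict]

main :: IO ()
main = do
  print (ruleMatches ruleLhs callArgs)                 -- projections still present
  print (ruleMatches ruleLhs (map simplify callArgs))  -- projections reduced first
```

In this model the first check fails and the second succeeds, which mirrors the ordering problem reported here: the dictionaries only reach their final form after the point at which the rule was tried.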

A code archive is attached, to help you reproduce the behaviour. I have cut down the code considerably already, but it is still spread over 5 modules: I was unable to cut it down much further without the bug disappearing (probably through inlining or something).

Classes are defined in Dataset.hs, instances in Geometries.hs. The code I want to specialise is in Hedgehog.hs, and the usage site is in Viewer.hs (the main program).
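
For readers without the archive, the shape of code involved might be sketched in a single module as follows. Everything here is a hypothetical stand-in (`Vec2`, `Pt2`, `scaleBy`, `displace`, `spike`, `viaEmbed`); it is not the code from the attachment, and the assumption that `Embed` has the two `Geom` constraints as superclasses is inferred only from the `$p1Embed` and `$p2Embed` selectors in the core shown above. Whether the SPECIALISE rule fires for a given call depends on the GHC version and optimisation settings.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Hypothetical geometry types, standing in for Vector3 and Coord3.
newtype Vec2 a = Vec2 (a, a) deriving (Eq, Show)
newtype Pt2 a  = Pt2 (a, a)  deriving (Eq, Show)

class Geom f where
  scaleBy :: Num a => a -> f a -> f a

-- Assumed shape: an Embed dictionary carries both Geom dictionaries
-- as superclass fields.
class (Geom v, Geom c) => Embed v c where
  displace :: Num a => v a -> c a -> c a

instance Geom Vec2 where
  scaleBy k (Vec2 (x, y)) = Vec2 (k * x, k * y)

instance Geom Pt2 where
  scaleBy k (Pt2 (x, y)) = Pt2 (k * x, k * y)

instance Embed Vec2 Pt2 where
  displace (Vec2 (dx, dy)) (Pt2 (x, y)) = Pt2 (x + dx, y + dy)

-- Overloaded function with both Geom and Embed constraints, as in the
-- ticket's signature, plus a monomorphic specialisation request.
spike :: (Fractional a, Geom c, Geom v, Embed v c)
      => a -> v a -> c a -> (c a, c a)
spike k v p = (p, displace (scaleBy k v) p)
{-# SPECIALISE spike :: Float -> Vec2 Float -> Pt2 Float
                     -> (Pt2 Float, Pt2 Float) #-}

-- A caller that only has an Embed dictionary in scope: the two Geom
-- dictionaries passed to spike are obtained by superclass selection.
viaEmbed :: (Fractional a, Embed v c) => a -> v a -> c a -> (c a, c a)
viaEmbed = spike

main :: IO ()
main = print (viaEmbed (0.5 :: Float) (Vec2 (2, 4)) (Pt2 (1, 1)))
```

Compiling a module like this with -O2 -ddump-simpl is one way to look at how the dictionary arguments at the call to `spike` are built.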

Attachments (1)

specialisation-bug.tar.gz (5.4 KB) - added by malcolm.wallace@… 9 years ago.
code archive to illustrate specialiser bug


Change History (10)

Changed 9 years ago by malcolm.wallace@…

Attachment: specialisation-bug.tar.gz added

code archive to illustrate specialiser bug

comment:1 Changed 9 years ago by malcolm.wallace@…

Resolution: fixed
Status: new → closed

This bug appears to have been fixed between 6.8.2 and 6.10.1. (Although sadly the specialised code in 6.10.1 actually runs slower in some cases than the unspecialised code in 6.8.2.)

comment:2 Changed 9 years ago by simonpj

difficulty: Unknown

I hate hearing that 6.10.1 goes slower than 6.8.2. If you can get any insight into why, that'd be really great.


comment:3 Changed 9 years ago by malcolm.wallace@…

Unfortunately performance tuning remains something of a black art for me. Here are timing figures for the real code in question, with attempted (but failed) specialisation in 6.8.2, and actual specialisation in 6.10.1:

ghc-6.8.2: 11.88s

ghc-6.10.1: 14.75s

Later, having profiled this same code, and hit two hotspots with a small algorithmic improvement, and some INLINE pragmas, I now get the following comparison:

ghc-6.8.2: 2.71s

ghc-6.10.1: 2.92s

Big improvement overall, which I'm very happy with, but 6.8.2 still wins. Make a tiny change to the outer loop of this version, basically to remove a filter and thus do more work:

ghc-6.8.2: 16.18s

ghc-6.10.1: 11.55s

and suddenly 6.10.1 overtakes 6.8.2. So the newer ghc swings between roughly 25% slower and roughly 25% faster on very slight variations of the same program.
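
The relative differences can be worked out directly from the three pairs of timings quoted in this comment. The sketch below uses only those figures; the labels and the helper names `percentChange` and `roundTo1` are introduced here for illustration. A positive result means the 6.10.1 build took longer than the 6.8.2 build, a negative one means it was quicker.

```haskell
module Main where

-- Percentage change in run time going from the old build to the new one.
-- Positive: the new build is slower. Negative: the new build is faster.
percentChange :: Double -> Double -> Double
percentChange old new = (new - old) / old * 100

-- Round to one decimal place for display.
roundTo1 :: Double -> Double
roundTo1 x = fromIntegral (round (x * 10) :: Integer) / 10

-- (label, ghc-6.8.2 seconds, ghc-6.10.1 seconds), figures from the comment.
timings :: [(String, Double, Double)]
timings =
  [ ("original code",  11.88, 14.75)
  , ("after tuning",    2.71,  2.92)
  , ("filter removed", 16.18, 11.55)
  ]

main :: IO ()
main = mapM_ report timings
  where
    report (label, t682, t6101) =
      putStrLn (label ++ ": " ++ show (roundTo1 (percentChange t682 t6101)) ++ "%")
```

On these figures 6.10.1 comes out about 24% slower on the original code, about 8% slower after tuning, and about 29% faster once the filter is removed.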

comment:4 Changed 9 years ago by simonpj

OK. Could you dump a reproducible version of this into the ticket?



comment:5 Changed 9 years ago by simonpj

How do I run it to show the performance differences? I found 'main' in Viewer, but could not run:

time ./viewer-6.10
Geometry []
viewer-6.10: Viewer.hs:(128,0)-(129,29): Non-exhaustive patterns in function cell_size_2D

real	0m0.086s
user	0m0.068s
sys	0m0.018s

comment:6 Changed 9 years ago by malcolm.wallace@…

The currently-attached archive is a cut-down version of the real code, intended only to show the bug in the core output. The full code that can be used to illustrate the performance differences requires several extra packages (OpenGL, smallcheck, polyparse), plus some large datasets. Sorry I have not recently had the time to collect all these materials together so that you can reproduce it.

comment:7 Changed 9 years ago by simonpj

OK, but what bug in the core output?!! Can you give instructions to reproduce?

Is this the bug that no longer exists in 6.10?


comment:8 Changed 9 years ago by malcolm.wallace@…

Yes, the code in the attached archive "specialisation-bug.tar.gz" is for the originally stated bug, which has been fixed in 6.10.1. To reproduce it, first compile everything with ghc Viewer.hs --make -O2, then recompile just the module containing the usage site (Viewer.hs), with -O2 -ddump-simpl or -dverbose-core2core, to see the lack of specialisation (in ghc-6.8.2) or the presence of the desired specialisation (in ghc-6.10.1).

However, you asked for more information, because the code produced by 6.10.1 is often significantly slower, despite the better job of specialisation. I have not yet gathered all the pieces you would require to reproduce this performance regression, which is likely to be a separate bug/issue altogether.

comment:9 Changed 9 years ago by simonpj

Milestone: _|_

Oh I see. I'm not going to investigate 6.8.2, because we won't make a new release on that branch.

If and when you have time to characterise the performance lossage wrt 6.8.2 I'd be v happy to investigate. I hate things going slower.

But no rush. I'll leave the ticket open (although you might want to close it and open a new one when you get to this).

