Opened 2 years ago

Closed 2 years ago

#12367 closed bug (fixed)

Commit adding instances to GHC.Generics regression compiler performance

Reported by: bgamari Owned by: bgamari
Priority: normal Milestone: 8.0.2
Component: Compiler Version: 8.0.1
Keywords: Generics Cc: RyanGlScott, bgamari, ezyang, simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D2411
Wiki Page:

Description (last modified by bgamari)

Recently 673efccb3b348e9daf23d9e65460691bbea8586e added a number of new instances for types defined in GHC.Generics. Unfortunately, it seems that this has regressed nofib compilation allocations (and also, it seems, compile time) by between 0 and 20% (on average about 5%).

Change History (23)

comment:1 Changed 2 years ago by bgamari

Description: modified (diff)
Test name Absolute change relative change
compile-allocs/MyList 20238384 0.208624775040415
compile-allocs/MyList 20281496 0.206217801180845
compile-allocs/Prog 20334192 0.202959317331712
compile-allocs/Prog 20344656 0.20053198007393
compile-allocs/Basics 22115840 0.185615422473695
compile-allocs/Basics 22149816 0.183089910175923
compile-allocs/Assemble_loadvec 20347184 0.166966893079396
compile-allocs/Digraph 21606168 0.166243308385695
compile-allocs/Assemble_loadvec 20357600 0.165346296105217
compile-allocs/Shows 20296024 0.165204453781781
compile-allocs/Digraph 21659944 0.165084805640568
compile-allocs/Shows 20338616 0.163719820616552
compile-allocs/Tol_cal 20737568 0.161829271665394
compile-allocs/Queue 20579112 0.161181557853612
compile-allocs/Tol_cal 20731568 0.160175008827888
compile-allocs/Queue 20600368 0.159599282181737
compile-allocs/Vtslib 20374904 0.155144898631996
compile-allocs/Match 20975896 0.154183839207216
compile-allocs/Vtslib 20368168 0.153618623974637
compile-allocs/Match 21000088 0.152948131340001
compile-allocs/Preds 20369168 0.150245373920468
compile-allocs/Preds 20382840 0.14899133906394
compile-allocs/BinConv 21814376 0.147565217236013
compile-allocs/BinConv 21843560 0.146507544982022
compile-allocs/Shapes 20121512 0.144706659929922

comment:2 Changed 2 years ago by bgamari

To characterize this further I chose a representative module from nofib (the Psfuns module of the reptile test) and compiled it with 6319a8cf79cc1f1e25220113149ab48e5083321b (the parent of the change) and 673efccb3b348e9daf23d9e65460691bbea8586e built with ticky enabled.

The largest absolute changes are summarized below.

Change alloc A alloc B name
+2898040.0 13460520 16358560 $wa5 (ghc-8.1:Encoding)
+2247360.0 6593640 8841000 (ghc-8.1:Unique.mkUnique)
+1976688.0 8486408 10463096 a4 (ghc-8.1:IOEnv)
+1734288.0 5736672 7470960 (ghc-8.1:FastString.uniq)
+1632784.0 5415504 7048288 $cgetUnique2 (ghc-8.1:Unique)
+1258016.0 3014864 4272880 (ghc-8.1:Binary.getByte1)
+1229472.0 4354032 5583504 $ccompare1 (ghc-8.1:Module)
+1210656.0 3661152 4871808 $ccompare (ghc-8.1:Module)
+1083360.0 2529840 3613200 $cget6 (ghc-8.1:IfaceType)
+994560.0 2327440 3322000 $wa1 (ghc-8.1:BinIface.)
+960600.0 2005200 2965800 $cget5 (ghc-8.1:IfaceType)
+946824.0 2020656 2967480 $wa3 (ghc-8.1:BinIface)
+558992.0 1389872 1948864 $wa8 (ghc-8.1:Binary.)
+558200.0 1193720 1751920 $cget1 (ghc-8.1:IfaceType)
+551320.0 1149720 1701040 $cget2 (ghc-8.1:IfaceType)
+473088.0 1268960 1742048 (ghc-8.1:TysWiredIn.isBuiltInOcc_maybe)
+318136.0 1234352 1552488 a5 (ghc-8.1:IOEnv)
+305280.0 4939520 5244800 (ghc-8.1:UniqFM.lookupUFM)
+297216.0 691920 989136 (ghc-8.1:UniqFM.lookupUFM_Directly)
+286200.0 889680 1175880 $wa17 (ghc-8.1:Binary.)
+253952.0 560000 813952 a18 (ghc-8.1:BinIface)
+217992.0 1167832 1385824 $wa (ghc-8.1:FastString.)
+187792.0 595672 783464 (ghc-8.1:IfaceEnv.lookupOrig)
+173720.0 566160 739880 (ghc-8.1:FastString.unpackFS)
+173504.0 397760 571264 $wa11 (ghc-8.1:Binary.)
+170560.0 694272 864832 a (ghc-8.1:IOEnv)
+148992.0 3369216 3518208 a23 (ghc-8.1:UniqFM)
+137376.0 456528 593904 (ghc-8.1:Unique.mkVarOccUnique)
+134320.0 2260808 2395128 (ghc-8.1:DynFlags.dopt)
+133448.0 413840 547288 $wxs (ghc-8.1:Binary)
+117936.0 363664 481600 (ghc-8.1:IfaceEnv.extendNameCache)
+117544.0 401520 519064 (ghc-8.1:Name.mkExternalName)
+114240.0 265632 379872 $fBinary(,)1 (ghc-8.1:Binary.)
+110992.0 243824 354816 $wxs (ghc-8.1:BinIface)
+108480.0 222640 331120 $cget (ghc-8.1:TyCoRep)
+108480.0 222640 331120 $cget4 (ghc-8.1:IfaceType)
+106920.0 473448 580368 $wa83 (ghc-8.1:Binary)
+105888.0 229248 335136 (ghc-8.1:TcRnMonad.forkM)
+104232.0 340752 444984 $wa2 (ghc-8.1:Encoding.)
+103200.0 385296 488496 (ghc-8.1:UniqSupply.takeUniqFromSupply)
+98576.0 263312 361888 $wa6 (ghc-8.1:Binary.)
+93600.0 197040 290640 $cget1 (ghc-8.1:OccName)

It's perhaps not surprising that the changes are largely coming from the interface file parsing: the size of GHC.Generics' interface file grows from 436kByte to 613kByte as a result of this change.

What is surprising is that Psfuns, which makes no use of Generics, is affected. The reason for this is that Data.Foldable and other modules imported by Prelude import GHC.Generics in order to provide instances for the generics types. This is apparently the case due to an import cycle with Data.Monoid,

Module imports form a cycle:
         module ‘GHC.Generics’ (libraries/base/GHC/Generics.hs)
        imports ‘Data.Foldable’ (libraries/base/Data/Foldable.hs)
  which imports ‘Data.Monoid’ (libraries/base/Data/Monoid.hs)
  which imports ‘GHC.Generics’ (libraries/base/GHC/Generics.hs)

Which arises due to,

  • GHC.Generics imports Data.Foldable: to provide Foldable instances for Generics' types
  • Data.Foldable imports Data.Monoid: since various types (e.g. Min and Endo) are used in defining the default definitions of Foldable
  • Data.Monoid imports GHC.Generics: to provide Monoid instances for Generics' types

It seems to me that we should find a way to keep GHC.Generics's instances isolated in GHC.Generics. Making all users pay for these instances regardless of whether they use the instances (or generics at all) is quite bad.

Last edited 2 years ago by bgamari (previous) (diff)

comment:3 Changed 2 years ago by bgamari

It turns out that the problem is actually a fair bit worse than described above. While Foldable imports GHC.Generics in order to provide instances for types in the latter (which should be fairly easy to fix with some reorganization), the other modules import GHC.Generics merely to derive Generic instances for types they themselves define. One example of this is Data.Monoid, which defines a number of newtypes resembling,

newtype All = All { getAll :: Bool }
        deriving (Eq, Ord, Read, Show, Bounded, Generic)

It seems to me like we may want to consider moving the data types defined in GHC.Generics to a new GHC.Generic.Internal module. They could then be re-exported in GHC.Generics, which could also derive instances for these types. This would also allow us to move the Foldable instances for the generics types to GHC.Generics, since the latter could likely import Data.Foldable without fear of cycles. Moreover, users like Data.Monoid which only needs the types, not the instances, could import GHC.Generics.Internal.

Really, though, this arguably leaves a large portion of the problem unsolved: users who use GHC.Generics (which is increasingly popular) still need to pay the full cost of importing it.

Last edited 2 years ago by bgamari (previous) (diff)

comment:4 Changed 2 years ago by bgamari

Cc: RyanGlScott added

Ccing RyanGlScott, who authored 673efccb3b348e9daf23d9e65460691bbea8586e.

comment:5 Changed 2 years ago by bgamari

I should note that the problem of adding instances to core libraries degrading compiler performance is a rather general theme. It can also be found in 4e6bcc2c8134f9c1ba7d715b3206130f23c529fb, which adds a number of instances to Data.Monoid and regresses compiler allocations by 5 to 10%.

comment:6 Changed 2 years ago by RyanGlScott

Keywords: Generics added

comment:7 Changed 2 years ago by bgamari

In our call yesterday Simon expressed skepticism that the performance regression is entirely due to a larger interface file. More like, he said, is that an interface is now being loaded which wasn't previously necessary. Indeed this is the case: interface files for both GHC.Generics and GHC.Ptr are now loaded while compiling Data.Foldable but were not previously,

$ grep 'Reading interface' dump-if.log
Reading interface for Prelude; reason: Prelude is directly imported                
Reading interface for Geomfuns;                                                    
Reading interface for Auxprogfuns;                                                 
Reading interface for GHC.Base;                                                    
Reading interface for GHC.Float;                                                   
Reading interface for GHC.Types;                                                   
Reading interface for Data.Foldable;                                               
Reading interface for GHC.List;                                                    
Reading interface for GHC.Show;                                                    
Reading interface for GHC.Prim;                                                    
Reading interface for GHC.Classes; reason: Need decl for Eq                        
Reading interface for GHC.Num; reason: Need decl for Num                           
Reading interface for GHC.Stack.Types; reason: Need decl for SrcLoc                
Reading interface for GHC.Integer.Type;                                            
Reading interface for GHC.Generics; reason: Need decl for V1                       
Reading interface for GHC.Ptr; reason: Need decl for Ptr                           
Reading interface for Data.Monoid; reason: Need decl for Sum                       
Reading interface for Data.Proxy; reason: Need decl for Proxy                      
Reading interface for Data.Either; reason: Need decl for Either                    
Reading interface for GHC.Arr; reason: Need decl for Array                         
Reading interface for GHC.Tuple;                                                   
Reading interface for GHC.Enum; reason: Need decl for enumFromTo                   
Reading interface for GHC.CString;                                                 

comment:8 Changed 2 years ago by bgamari

It's not entirely clear how to proceed from here. The problem here is that we want to provide Generic instances for many types in base, while at the same time providing, e.g., Foldable and Monoid instances for the generic representation types.

As far as I can tell we have a few options,

  1. Keep the status quo and accept the fact that much of base pulls is GHC.Generics
  2. Move the data types in GHC.Generics to a new GHC.Generics.Internal module, placing the the instances in GHC.Generics. This would mean that GHC.Generics would be full of orphans but that fewer modules within base would need to import the full bulk of the instances. That being said, this does nothing to help users who use GHC.Generics but none of the instances.
  3. Wire-in the generics representation types to hopefully avoid the need to pull in the interface file at all when deriving Generic (assuming that GHC in fact is capable to avoid pulling in interface files when loading only wired-in declarations).
  4. Add hs-boot files for Foldable, Traversable, et al., allowing us to move the instances for the GHC.Generics types to GHC.Generics
  5. Something else?

comment:9 Changed 2 years ago by simonpj

We can't move to a solution until we know WHY GHC.Ptr and GHC.Generics are being read. Why do we need V1 and Ptr when we didn't before.

comment:10 Changed 2 years ago by bgamari

Simonpj, GHC.Generics is read because Data.Foldable provides a Foldable instance for GHC.Generics.V1. GHC.Ptr is read because GHC.Generics needs GHC.Ptr.Ptr to define instances for representing Addr# fields.

comment:11 Changed 2 years ago by bgamari

Cc: bgamari added

comment:12 Changed 2 years ago by bgamari

So it appears that something is indeed calling for the V1 dictionary and is therefore forcing the load of the GHC.Generics interface file,

...
end stage interact with inerts }                                       
runStage top-level reactions {                                         
  workitem   =  [W] $dFoldable_a2pp :: Foldable [] (CDictCan)          
doTopReact [W] $dFoldable_a2pp :: Foldable [] (CDictCan)               
matchClassInst pred = Foldable []                                      
Starting fork { Dict fun $fFoldable[]                                  
...
} ending fork Dict fun $fFoldable[]                                    
Starting fork { Dict fun $fFoldableV1                                                                        
Starting fork { Declaration for $fFoldableV1                                                                 
Loading decl for $fFoldableV1                                                                                
updating EPS_                                                                                                
Need decl for V1                                                                                             
Considering whether to load GHC.Generics {- SYSTEM -}                                                        
Reading interface for GHC.Generics; reason: Need decl for V1                                                 
...

Now to work out what.

comment:13 Changed 2 years ago by bgamari

This minimal program is enough to trigger GHC.Generics to be loaded with 673efccb3b348e9daf23d9e65460691bbea8586e yet not without,

module OhNoTooManyInterfaces where                               
                                              
concat' :: [[a]] -> [a]                  
concat' = concat
Last edited 2 years ago by bgamari (previous) (diff)

comment:14 Changed 2 years ago by bgamari

I've made a bit of progress tracking down the cause of GHC.Generics being loaded when looking for a Foldable [] instance. It looks like the cause is the evaluation of the mod binding in IfaceEnv.instIsVisible. The RHS of this binding pulls on is_dfun to get its idName (which it only need the Module from).

Last edited 2 years ago by bgamari (previous) (diff)

comment:15 Changed 2 years ago by simonpj

Cc: ezyang added

Great catch! This is all Edward's fault :-).

commit 4c834fdddf4d44d12039da4d6a2c63a660975b95
Author: Edward Z. Yang <ezyang@cs.stanford.edu>
Date:   Mon Nov 17 21:23:52 2014 -0800

The is_dfun field of a ClsInst should be lazy, only pulled on if the instance is actually needed. See the forkM in TcIface.tcIfaceInst.

But this instIsVisible stuff is pulling on the is_dfun which forces lots of stuff to get loaded that is entirely unnecessary.

Solution: add an is_mod :: Module field to ClsInst, which gives the Module for the ClsInst. Invariant: it's the same Module as that for the is_dfun. And use that is_mod field in instIsVisible.

comment:16 Changed 2 years ago by bgamari

Differential Rev(s): Phab:D2411
Status: newpatch

Here's a quick attempt at fixing this.

comment:17 Changed 2 years ago by bgamari

One thing I also noticed while debugging this is that the relevantBindings logic in TcErrors appears to pull in GHC.Generics when generating an error, even when no relevant bindings are shown in the resulting message (and even with -fmax-relevant-binds=0). Judging from the comments surrounding relevantBindings, it sounds like this is expected behavior but I thought I'd mention it anyways.

comment:18 Changed 2 years ago by simonpj

Patch looks good. I have commented it.

I don't understand why relevantBindings would pull in anything. What "surrounding comments"? Do you know why it pulls GHC.Generics in? Worth understanding...

comment:19 Changed 2 years ago by simonmar

Cc: simonmar added

comment:20 Changed 2 years ago by Ben Gamari <ben@…>

In ed480981/ghc:

InstEnv: Ensure that instance visibility check is lazy

Previously instIsVisible had completely broken the laziness of
lookupInstEnv' since it would examine is_dfun_name to check the name of
the defining module (to know whether it is an interactive module). This
resulted in the visibility check drawing in an interface file
unnecessarily. This contributed to the unnecessary regression in
compiler allocations reported in #12367.

Test Plan: Validate, check nofib changes

Reviewers: simonpj, ezyang, austin

Reviewed By: ezyang

Subscribers: thomie, ezyang

Differential Revision: https://phabricator.haskell.org/D2411

GHC Trac Issues: #12367

comment:21 Changed 2 years ago by bgamari

The commit in comment:20 should hopefully make a significant dent in allocations.

comment:22 Changed 2 years ago by bgamari

comment:23 Changed 2 years ago by bgamari

Resolution: fixed
Status: patchclosed

It seems that the underlying issue here (the lack of laziness in the instance visibility check resulting in unnecessary typechecking) has been solved. Closing.

Note: See TracTickets for help on using tickets.