Opened 2 years ago

Closed 23 months ago

#6099 closed bug (fixed)

filepath library a lot bigger in 7.4.2 RC 1

Reported by: igloo Owned by: pcapriotti
Priority: highest Milestone: 7.4.2
Component: Compiler Version: 7.4.2-rc1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

filepath is a lot bigger in 7.4.2 RC 1 than in 7.4.1, e.g.:

Change Size changed:
    "ghc-7.4.1/libraries/filepath/dist-install/build/libHSfilepath-1.3.0.0.a"
    "ghc-7.4.1.20120508/libraries/filepath/dist-install/build/libHSfilepath-1.3.0.0.a"
    388984  ->  760936

This probably isn't a show-stopper, but let's at least try to take a look at what's going on before release.

Attachments (2)

6099.patch (1.4 KB) - added by pcapriotti 2 years ago.
T6099.hs (1.8 KB) - added by pcapriotti 2 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 2 years ago by pcapriotti

  • Owner set to pcapriotti

comment:2 Changed 2 years ago by igloo

  • Version changed from 7.4.1 to 7.4.2-rc1

Changed 2 years ago by pcapriotti

comment:3 Changed 2 years ago by pcapriotti

The difference with 7.4.1 is that the readDrive* functions in System.FilePath.Internal are inlined, increasing code size.

The attached patch brings the size down to that of the binary compiled with 7.4.1.

comment:4 follow-up: Changed 2 years ago by simonpj

Wait. Are you saying that GHC was automatically inlining

readDriveLetter :: String -> Maybe (FilePath, FilePath) 
readDriveLetter (x:':':y:xs) | isLetter x && isPathSeparator y 
                             = Just $ addSlash [x,':'] (y:xs) 
readDriveLetter (x:':':xs) | isLetter x = Just ([x,':'], xs) 
readDriveLetter _ = Nothing 

and that this inlining (plus three other fns) led to doubling the size of the binary? Wow. Maybe we should be more careful about inlining.

Do you know why it makes such a big difference? Are there a lot of calls to readDriveLetter?

Can you make a small test case that demonstrates the unwanted inlining? (I assume inlining is happening that does not give a big pay-off.)

I don't want people to have to add pragmas to suppress code bloat that gives no useful performance benefit.

Simon

Changed 2 years ago by pcapriotti

comment:5 in reply to: ↑ 4 Changed 2 years ago by pcapriotti

Replying to simonpj:

and that this inlining (plus three other fns) led to doubling the size of the binary?

Actually, the whole splitDrive function is being inlined (and it wasn't in 7.4.1).

Do you know why it makes such a big difference? Are there a lot of calls to readDriveLetter?

There are quite a few calls to splitDrive, which is rather large.

Can you make a small test case that demonstrates the unwanted inlining? (I assume inlining is happening that does not give a big pay-off.)

I attached a slimmed-down version of System.FilePath.Internal which shows the doubling in binary size.

I don't want people to have to add pragmas to suppress code bloat that gives no useful performance benefit.

Sure, the patch wasn't meant as a solution, just a demonstration of the problem.

comment:6 Changed 23 months ago by simonpj@…

commit 4fa3f16ddb9fa8e5d59bde5354918a39e0430a74

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Mon May 28 17:33:42 2012 +0100

    Be less aggressive about the result discount
    
    This patch fixes Trac #6099 by reducing the result discount in CoreUnfold.conSize.
    See Note [Constructor size and result discount] in CoreUnfold.
    
    The existing version is definitely too aggressive. Simon M found it an
    "unambiguous win" but it is definitely what led to the bloat. In a function
    with a lot of case branches, all returning a constructor, the discount could
    grow arbitrarily large.
    
    I also had to increase the -funfolding-creation-threshold from 450 to 750,
    otherwise some functions that should inline simply never get an unfolding.
    (The massive result discount was allow the unfolding to appear before.)
    
    The nofib results are these, picking a handful of outliers to show.
    
            Program           Size    Allocs   Runtime   Elapsed  TotalMem
    --------------------------------------------------------------------------------
             fulsom          -0.5%     -1.6%     -2.8%     -2.6%    +31.1%
           maillist          -0.2%     -0.0%      0.09      0.09     -3.7%
             mandel          -0.4%     +6.6%      0.12      0.12     +0.0%
           nucleic2          -0.2%    +18.5%      0.11      0.11     +0.0%
            parstof          -0.4%     +4.0%      0.00      0.00     +0.0%
    --------------------------------------------------------------------------------
                Min          -0.9%     -1.6%    -19.7%    -19.7%     -3.7%
                Max          +0.3%    +18.5%     +2.7%     +2.7%    +31.1%
     Geometric Mean          -0.3%     +0.4%     -3.0%     -3.0%     +0.2%
    
    Turns out that nucleic2 has a function
      Main.$wabsolute_pos =
        \ (ww_s4oj :: Types.Tfo) (ww1_s4oo :: Types.FloatT)
          (ww2_s4op :: Types.FloatT) (ww3_s4oq :: Types.FloatT) ->
          case ww_s4oj
          of _
          { Types.Tfo a_a1sS b_a1sT c_a1sU d_a1sV e_a1sW f_a1sX g_a1sY h_a1sZ i_a1t0 tx_a1t1 ty_a1t2 tz_a1t3 ->
          (# case ww1_s4oo of _ { GHC.Types.F# x_a2sO ->
             case a_a1sS of _ { GHC.Types.F# y_a2sS ->
             case ww2_s4op of _ { GHC.Types.F# x1_X2y9 ->
             case d_a1sV of _ { GHC.Types.F# y1_X2yh ->
             case ww3_s4oq of _ { GHC.Types.F# x2_X2yj ->
             case g_a1sY of _ { GHC.Types.F# y2_X2yr ->
             case tx_a1t1 of _ { GHC.Types.F# y3_X2yn ->
             GHC.Types.F#
               (GHC.Prim.plusFloat#
                  (GHC.Prim.plusFloat#
                     (GHC.Prim.plusFloat#
                        (GHC.Prim.timesFloat# x_a2sO y_a2sS)
                        (GHC.Prim.timesFloat# x1_X2y9 y1_X2yh))
                     (GHC.Prim.timesFloat# x2_X2yj y2_X2yr))
                  y3_X2yn)
             } } }}}}},
    
            <similar>,
            <similar> )
    
    This is pretty big, but inlining it does get rid of that F# allocation.
    But we'll also get rid of it with deep CPR: Trac #2289. For now we just
    accept the change.

 compiler/coreSyn/CoreUnfold.lhs |   73 ++++++++++++++++++++++-----------------
 compiler/main/StaticFlags.hs    |    7 +++-
 2 files changed, 47 insertions(+), 33 deletions(-)

comment:7 Changed 23 months ago by simonpj

Paolo, can you check and close?

comment:8 Changed 23 months ago by pcapriotti

  • Resolution set to fixed
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.