Opened 3 years ago

Closed 2 years ago

#4474 closed bug (fixed)

3 ways to write a function (unexpected performance difference and regression)

Reported by: claus Owned by: igloo
Priority: normal Milestone: 7.4.1
Component: Compiler Version: 7.1
Keywords: Cc: fischer@…, michal.terepeta@…
Operating System: Windows Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Difficulty:
Test Case: T4474a T4474b T4474c Blocked By:
Blocking: Related Tickets:

Description

Consider the attached comparison.hs, a small performance test for difference lists. I was somewhat surprised by the performance differences between flatListCons and flatDList, and I tried to reproduce the issue by writing 3 equivalent versions of flatListCons.

flatListCons t = flat t []
  where
  flat (Leaf n)   ns = n:ns
  flat (Fork a b) ns = flat a (flat b ns) 

flatListCons2 t = flat t []
  where
  flat (Leaf n)   = \ns -> n:ns
  flat (Fork a b) = \ns -> flat a (flat b ns) 

flatListCons3 t = flat t []
  where
  flat (Leaf n)   = (n:)
  flat (Fork a b) = flat a . flat b

Not only does GHC not give equal performance for the 3 versions (factor of 2), but GHC-6.12.3 and GHC-7.1.20101022 differ in optimizations. With GHC-6.12.3, flatListCons and flatListCons2 are fast, flatListCons3 is slow. With GHC-7.1.20101022, only flatListCons is fast, flatListCons2 and flatListCons3 are slow:

$ time ./comparison-6.12.3.exe c 26
67108864

real    0m4.574s
user    0m0.015s
sys     0m0.358s

$ time ./comparison-6.12.3.exe c2 26
67108864

real    0m4.442s
user    0m0.046s
sys     0m0.312s

$ time ./comparison-6.12.3.exe c3 26
67108864

real    0m10.694s
user    0m0.046s
sys     0m0.328s

$ time ./comparison-7.1.20101022.exe c 26
67108864

real    0m4.473s
user    0m0.046s
sys     0m0.327s

$ time ./comparison-7.1.20101022.exe c2 26
67108864

real    0m10.791s
user    0m0.046s
sys     0m0.327s

$ time ./comparison-7.1.20101022.exe c3 26
67108864

real    0m10.729s
user    0m0.078s
sys     0m0.280s

Ideally, I'd like all three versions (as well as flatDList) to be equally fast.

Attachments (1)

comparison.hs (1.3 KB) - added by claus 3 years ago.
simple difference lists

Download all attachments as: .zip

Change History (6)

Changed 3 years ago by claus

simple difference lists

comment:1 Changed 3 years ago by sebf

  • Cc fischer@… added

comment:2 Changed 3 years ago by michalt

  • Cc michal.terepeta@… added

comment:3 Changed 3 years ago by igloo

  • Milestone set to 7.2.1

Thanks for the examples.

comment:4 Changed 3 years ago by simonpj

  • Owner set to igloo

Right this is fixed by

Tue Dec 21 16:58:00 GMT 2010  simonpj@microsoft.com
  * Add a simple arity analyser
  
  I've wanted to do this for ages, but never gotten around to
  it.  The main notes are in Note [Arity analysis] in SimplUtils.
  
  The motivating example for arity analysis is this:
  
    f = \x. let g = f (x+1)
            in \y. ...g...
  
  What arity does f have?  Really it should have arity 2, but a naive
  look at the RHS won't see that.  You need a fixpoint analysis which
  says it has arity "infinity" the first time round.
  
  This makes things more robust to the way in which you write code.  For
  example, see Trac #4474 which is fixed by this change.
  
  Not a huge difference, but worth while:
  
          Program           Size    Allocs   Runtime   Elapsed
  --------------------------------------------------------------------------------
              Min          -0.4%     -2.2%    -10.0%    -10.0%
              Max          +2.7%     +0.3%     +7.1%     +6.9%
   Geometric Mean          -0.3%     -0.2%     -2.1%     -2.2%
  
  I don't really believe the runtime numbers, because the machine was
  busy, but the bottom line is that not much changes, and what does
  change reliably (allocation and size) is in the right direction.

    M ./compiler/coreSyn/CoreArity.lhs -51 +35
    M ./compiler/coreSyn/CoreUtils.lhs -5 +6
    M ./compiler/simplCore/SimplUtils.lhs -1 +94

Ian, since Clause has given us a nice timeable program, could you turn it into a performance test? (NB the "n" is the naive way, which should be slow.)

Simon

comment:5 Changed 2 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed
  • Test Case set to T4474a T4474b T4474c

Tests added.

Note: See TracTickets for help on using tickets.