GHC: Ticket #2143: Yhc's sort is faster than GHC's
http://ghc.haskell.org/trac/ghc/ticket/2143
<p>
The sort code in the Yhc libraries is faster than GHC. In some cases its asymptotically better. Some benchmarks have shown a doubling in performance. I know why this is, but will need to make sure the code is as good as it can be, check all the benchmarks agree etc.
</p>
<p>
Original work by Ian: <a class="ext-link" href="http://www.haskell.org/pipermail/glasgow-haskell-users/2002-May/003376.html"><span class="icon"></span>http://www.haskell.org/pipermail/glasgow-haskell-users/2002-May/003376.html</a>
</p>
<p>
I will track this down and aim for a libraries proposal in a month or so.
</p>
en-usGHChttp://ghc.haskell.org/trac/ghc/chrome/site/ghc_logo.png
http://ghc.haskell.org/trac/ghc/ticket/2143
Trac 1.0.1iglooFri, 14 Mar 2008 21:27:51 GMTdifficulty, milestone set
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:1
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:1
<ul>
<li><strong>difficulty</strong>
set to <em>Unknown</em>
</li>
<li><strong>milestone</strong>
set to <em>Not GHC</em>
</li>
</ul>
TicketsimonmarTue, 30 Sep 2008 15:37:29 GMTarchitecture changed
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:2
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:2
<ul>
<li><strong>architecture</strong>
changed from <em>Unknown</em> to <em>Unknown/Multiple</em>
</li>
</ul>
TicketsimonmarTue, 30 Sep 2008 15:51:43 GMTos changed
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:3
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:3
<ul>
<li><strong>os</strong>
changed from <em>Unknown</em> to <em>Unknown/Multiple</em>
</li>
</ul>
Ticketdaniel.is.fischerThu, 19 Nov 2009 02:56:12 GMTfailure set
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:4
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:4
<ul>
<li><strong>failure</strong>
set to <em>None/Unknown</em>
</li>
</ul>
<p>
The Yhc code doesn't seem to be in the library yet. If it's really faster, shouldn't it be included even if there's still room for improvement?
</p>
<p>
I'll run a few benchmarks in the next days.
</p>
Ticketdaniel.is.fischerSun, 22 Nov 2009 22:13:59 GMT
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:5
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:5
<p>
For pseudo-random lists, my measurements showed the Yhc code consistently faster, though not very much (3-12%). For (rev)sorted or almost (rev)sorted lists, Yhc's code is up to 10 times faster (according to my measurements).
</p>
<p>
Any chance to get the Yhc code into GHC?
</p>
Ticketmalcolm.wallace@…Mon, 23 Nov 2009 01:44:47 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/2143
http://ghc.haskell.org/trac/ghc/ticket/2143
<ul>
<li><strong>attachment</strong>
set to <em>SortBy.hs</em>
</li>
</ul>
<p>
A stable O(n log n) sorting algorithm, by Thomas Nordin, 2002
</p>
Ticketmalcolm.wallace@…Mon, 23 Nov 2009 01:48:19 GMT
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:6
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:6
<p>
The attachment is the sorting algorithm from Yhc (actually, nhc98), as originally contributed by Thomas Nordin.
</p>
TicketNeilMitchellMon, 23 Nov 2009 10:08:18 GMT
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:7
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:7
<p>
If memory serves, Lennart said he was the originally author of this code, so the path probably originates with HBC. The code in Yhc was taken unmodified from nhc98.
</p>
TicketsimonpjMon, 23 Nov 2009 10:47:15 GMTowner changed; cc set
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:8
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:8
<ul>
<li><strong>cc</strong>
<em>lennart@…</em> added
</li>
<li><strong>owner</strong>
changed from <em>NeilMitchell</em> to <em>igloo</em>
</li>
</ul>
<p>
OK, well assuming Lennart is happy (I've added him to cc), let's just adopt this. Ian can you action? Although the milestone says "not ghc", these sorting routines are in the base libraries which we ship with GHC, aren't they?
</p>
<p>
Simon
</p>
TicketguestWed, 23 Dec 2009 21:02:35 GMTcc changed
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:9
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:9
<ul>
<li><strong>cc</strong>
<em>gwern0@…</em> added
</li>
</ul>
TicketguestThu, 24 Dec 2009 02:37:33 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/2143
http://ghc.haskell.org/trac/ghc/ticket/2143
<ul>
<li><strong>attachment</strong>
set to <em>sort.txt</em>
</li>
</ul>
<p>
The terminal output of my criterion run
</p>
TicketguestThu, 24 Dec 2009 02:38:17 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/2143
http://ghc.haskell.org/trac/ghc/ticket/2143
<ul>
<li><strong>attachment</strong>
set to <em>sort.hs</em>
</li>
</ul>
<p>
The module running Criterion on GHC & YHC sorts
</p>
TicketguestThu, 24 Dec 2009 02:44:01 GMT
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:10
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:10
<p>
I put together a little module tonight which uses Criterion 0.2 (0.4 won't install under 6.10.2) to benchmark GHC & YHC sorts using ascending ints, descending ints, random ints, and the Shakesperean corpus (<a class="ext-link" href="http://www.gutenberg.org/etext/100"><span class="icon"></span>http://www.gutenberg.org/etext/100</a>). (Besides the attachment, you can see <a class="ext-link" href="http://hpaste.org/fastcgi/hpaste.fcgi/view?id=14873#a14875"><span class="icon"></span>http://hpaste.org/fastcgi/hpaste.fcgi/view?id=14873#a14875</a> )
</p>
<p>
YHC, as measured by mean, won on all of them, even random and Shakespeare. The margins were not all 2x, but the narrowest YHC victory was Shakespeare - YHC averaged 25.3s while GHC puffed along at 29.5s.
</p>
<p>
Caveats: this was my first time using Criterion, no doubt there are major inefficiencies and bad style etc.
</p>
<p>
But I doubt any of my errors were sufficient to upend the results. Given the margins in a core function which is used all over the place (I grepped through all the repos I've downloaded; 'sort' is used a *lot*), that the replacement is correct and faster in all cases, I think not making the change as soon as possible is a mistake. This is something that should have been done yesterday, as the expression goes.
</p>
TicketguestThu, 24 Dec 2009 13:18:10 GMTattachment set
http://ghc.haskell.org/trac/ghc/ticket/2143
http://ghc.haskell.org/trac/ghc/ticket/2143
<ul>
<li><strong>attachment</strong>
set to <em>sort-reverse.txt</em>
</li>
</ul>
<p>
a run with YHC functions benchmarked before GHC; YHC still wins, showing that other run wasn't due to GHC being penalized by going first
</p>
Ticketmalcolm.wallace@…Thu, 24 Dec 2009 15:25:13 GMTstatus changed; resolution set
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:11
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:11
<ul>
<li><strong>status</strong>
changed from <em>new</em> to <em>closed</em>
</li>
<li><strong>resolution</strong>
set to <em>fixed</em>
</li>
</ul>
<p>
Pushed the faster version to the base library.
It should probably be merged into ghc-6.12.2.
</p>
TicketsimonmarWed, 30 Dec 2009 09:48:48 GMT
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:12
http://ghc.haskell.org/trac/ghc/ticket/2143#comment:12
<p>
Thanks Malcolm.
</p>
Ticket