Version 18 (modified by simonpj, 3 years ago)

GHC Status October 2010

GHC is humming along. We are currently deep into the release cycle for GHC 7.0. We have finally bumped the major version number, because GHC 7.0 has quite a bit of new stuff:

  • As long promised, Simon PJ and Dimitrios have spent a good chunk of the summer doing a complete rewrite of the constraint solver in the type inference engine. Because of GHC's myriad type-system extensions, especially GADTs and type families, the old engine had begun to resemble the final stages of a game of Jenga. It was a delicately-balanced pile of blocks that lived in constant danger of complete collapse, and had become extremely difficult to modify (or even to understand). The new inference engine is much more modular and robust; it is described in detail in our paper http://haskell.org/haskellwiki/Simonpj/Talk:OutsideIn OutsideIn. A blog post describes some consequential changes to let generalisation [LetGen].

As a result we have closed dozens of open type inference bugs, especially related to GADTs and type families.

  • There is a new, robust implementation of INLINE pragmas that behaves much more intuitively. GHC now captures the original RHS of an INLINE function, and keeps it more-or-less pristine, ready to inline at call sites. Separately, the original RHS is optimised in the usual way. Suppose you say:
    {-# INLINE f #-}
    f x = ...blah...
    
    g1 y = f y + 1
    g2 ys = map f ys
    
    Here, f will be inlined into g1 as you'd expect, but obviously not into g2 (since it's not applied to anything). However f's right hand side will be optimised (separately from the copy retained for inlining) so that the call from g2 runs optimised code.
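
    For readers who want to try this, here is a minimal compilable sketch of the same shape. The body of f is a hypothetical stand-in for "...blah...", and the type signatures and main are additions for illustration only. Compiling with optimisation and a Core dump (for example ghc -O2 -ddump-simpl) should let you inspect what happens at each call site; the exact Core you see will depend on the GHC version.

```haskell
-- Hypothetical body standing in for "...blah..." in the text above.
{-# INLINE f #-}
f :: Int -> Int
f x = x * x + 1

-- Saturated call: f is applied to an argument, so it is a candidate for inlining.
g1 :: Int -> Int
g1 y = f y + 1

-- f is passed unapplied, so this call site refers to f itself rather than
-- to an inlined copy of its right-hand side.
g2 :: [Int] -> [Int]
g2 ys = map f ys

main :: IO ()
main = do
  print (g1 3)
  print (g2 [1, 2, 3])
```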

There's a raft of other small changes to the optimisation pipeline too. The net effect can be dramatic: Bryan O'Sullivan reports some five-fold (!) improvements in his text-equality functions, and concludes "The difference between 6.12 and 7 is so dramatic, there's a strong temptation for me to say 'wait for 7!' to people who report weaker than desired performance." http://www.serpentine.com/blog/2010/10/19/a-brief-tale-of-faster-equality/ Bryan

  • David Terei implemented a new back end for GHC using LLVM. In certain situations using the LLVM backend can give fairly substantial performance improvements to your code, particularly if you're using the Vector libraries, DPH or making heavy use of fusion. In the general case it should give performance as good as, or slightly better than, GHC's native code generator and C backend. You can use it through the '-fllvm' compiler flag. More details of the backend can be found in David's and Manuel Chakravarty's Haskell Symposium paper http://www.cse.unsw.edu.au/~davidt/downloads/ghc-llvm-hs10.pdf Llvm.
  • Bryan O’Sullivan and Johan Tibell implemented a new, highly-concurrent I/O manager. GHC now supports over a hundred thousand open I/O connections. The new I/O manager defines a separate backend per operating system, using the most efficient system calls for that particular operating system (e.g. epoll on Linux). This means that GHC can now be used to implement servers that make use of e.g. HTTP long polling, where the server needs to handle a large number of open idle connections.
  • Simon M did a lot of work on the runtime system. In particular he substantially improved the way that thunks are handled in concurrent programs. Simon: a little more info?
  • Simon M designed and implemented a new API for asynchronous exceptions. Simon: describe
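
The description of the asynchronous-exceptions API is still to be filled in above. As a placeholder illustration, and assuming the API meant is the mask function exported by Control.Exception in the base library that accompanies GHC 7.0, the sketch below shows the usual pattern: asynchronous exceptions are masked while a resource is in an inconsistent state, and the supplied restore function re-enables them around the user's code. The helper name modifyMVarSafely is hypothetical.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Exception (mask, onException)

-- Hypothetical helper (not a library function): update the contents of an
-- MVar so that an exception raised by the update cannot leave the MVar empty.
modifyMVarSafely :: MVar a -> (a -> IO a) -> IO ()
modifyMVarSafely var update =
  mask $ \restore -> do
    old <- takeMVar var
    -- Asynchronous exceptions are re-enabled only while 'update' runs;
    -- if it throws, the old value is put back before the exception propagates.
    new <- restore (update old) `onException` putMVar var old
    putMVar var new

main :: IO ()
main = do
  counter <- newMVar (0 :: Int)
  modifyMVarSafely counter (return . (+ 1))
  readMVar counter >>= print
```

Passing restore explicitly means the update runs in whatever masking state the caller was in, rather than being unconditionally unmasked.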

We are fortunate to have a growing team of people willing to roll up their sleeves and help us with GHC. Amongst those who have got involved recently are:

  • Daniel Fischer, who worked on improving the performance of the numeric libraries
  • Milan Straka, who did great work improving the performance of the widely-used containers package [Containers]
  • Greg Wright, who is leading a strike team to make GHC work better on Macs
  • Evan Laforge, who has taken on some of the long-standing issues with the Mac installer
  • Sam Anklesaria, who implemented rebindable syntax for conditionals
  • ..who else..?

At GHC HQ we are having way too much fun; if you wait for us to do something you may have to wait a long time. So don't wait; join in!

Language developments, especially types

GHC continues to act as an incubator for interesting new language developments. Here's a selection that we know about.

  • Pedro Magalhaes is implementing the derivable type classes mechanism described in his 2010 Haskell Symposium paper [Derivable]. I plan for this to replace GHC's current derivable-type-class mechanism, which has a poor power-to-weight ratio and is little used.
  • Stephanie Weirich and Steve Zdancewic had a great sabbatical year at Cambridge. One of the things we worked on, with Brent Yorgey who came as an intern, was to close the embarrassing hole in the type system concerning newtype deriving (see Trac bug #1496). I have delayed fixing it until I could figure out a Decent Solution, but now we know; see our 2011 POPL paper [Newtype]. Brent is working on some infrastructural changes to GHC's Core language, and then we'll be ready to tackle the main issue.
  • Next after that is a mechanism for promoting types to become kinds, and data constructors to become types, so that you can do typed functional programming at the type level. Conor McBride's SHE prototype is the inspiration here http://personal.cis.strath.ac.uk/~conor/pub/she/ SHE. Currently, type-level programming in GHC is, embarrassingly, essentially untyped.
  • Iavor Diatchki plans to add numeric types, so that you can have a type like Bus 8, and do simple arithmetic at the type level. You can encode this stuff, but it's easier to use and more powerful to do it directly.
  • David Mazieres at Stanford wants to implement Safe Haskell, a flag for GHC that will guarantee that your program does not use unsafePerformIO, foreign calls, RULES, and other stuff. This is part of his project to ... David pls fill in.
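
To make the remark about numeric types concrete, here is a rough sketch of the kind of encoding that is possible today, using Peano numerals built from empty data types. All names (Z, S, Nat, natVal, Eight, mkBus, busWidth) are hypothetical and chosen for illustration; this is not the planned design. Note that nothing stops you writing a nonsensical type such as S Bool, which is the sense in which type-level programming is currently untyped.

```haskell
{-# LANGUAGE EmptyDataDecls, ScopedTypeVariables #-}

-- Peano numerals at the type level (hypothetical encoding).
data Z
data S n

-- Reflect a type-level numeral back to a value.
-- The argument is only a type proxy and is never evaluated.
class Nat n where
  natVal :: n -> Int

instance Nat Z where
  natVal _ = 0

instance Nat n => Nat (S n) where
  natVal _ = 1 + natVal (undefined :: n)

-- Writing "8" by hand is the awkward part of the encoding.
type Eight = S (S (S (S (S (S (S (S Z)))))))

-- A bus whose width is tracked in its type, standing in for "Bus 8".
newtype Bus n = Bus [Bool]

-- Smart constructor: succeeds only if the list length matches the width.
mkBus :: forall n. Nat n => [Bool] -> Maybe (Bus n)
mkBus bits
  | length bits == natVal (undefined :: n) = Just (Bus bits)
  | otherwise = Nothing

busWidth :: forall n. Nat n => Bus n -> Int
busWidth _ = natVal (undefined :: n)

main :: IO ()
main = do
  print (fmap busWidth (mkBus (replicate 8 False) :: Maybe (Bus Eight)))
  print (fmap busWidth (mkBus [True] :: Maybe (Bus Eight)))
```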

7.0 also has support for the Haskell 2010 standard, and the libraries that it specifies.

Packages and the runtime system

Simon M

  • Independent parallel garbage collection [Simon M]
  • Better package management (esp wrt profiling)
  • Glorious new back end -fuse-new-codegen

The Parallel Haskell Project

Duncan to write

Data Parallel Haskell

Since the last report, we have continued to improve support for nested parallel divide-and-conquer algorithms. We started with http://darcs.haskell.org/packages/dph/dph-examples/spectral/QuickHull/dph/QuickHullVect.hs QuickHull and are now working on an implementation of the http://darcs.haskell.org/packages/dph/dph-examples/real/BarnesHut/Solver/NestedBH/Solver.hs Barnes-Hut n-body algorithm. The latter is not only significantly more complex, but also requires the vectorisation of recursive tree data-structures, going well beyond the capabilities of conventional parallel-array languages. In time for the stable branch of GHC 7.0, we replaced the old, per-core sequential array infrastructure (which was part of the sub-package dph-prim-seq) by the http://hackage.haskell.org/package/vector vector package; vector started its life as a next-generation spin-off of dph-prim-seq, but now enjoys significant popularity independent of DPH.
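
To give a feel for the divide-and-conquer structure involved, here is a sequential sketch of QuickHull over ordinary Haskell lists. It is not the DPH source linked above: it uses plain lists rather than parallel arrays, the function names are chosen for illustration, and degenerate inputs (such as a single point) are not handled specially. In a nested data-parallel setting, both the filtering and the two recursive calls could in principle proceed in parallel.

```haskell
import Data.List (maximumBy, minimumBy, sort)
import Data.Ord (comparing)

type Point = (Double, Double)
type Line  = (Point, Point)

-- Cross product: positive when the point lies to the left of the directed
-- line, and larger the further the point is from it.
distance :: Point -> Line -> Double
distance (xo, yo) ((x1, y1), (x2, y2)) =
  (x1 - xo) * (y2 - yo) - (y1 - yo) * (x2 - xo)

-- Hull points on the left of the line, beginning with p1.
-- (p2 is contributed by the neighbouring call.)
hsplit :: [Point] -> Line -> [Point]
hsplit points line@(p1, p2)
  | null packed = [p1]
  | otherwise   = concat [hsplit packed ends | ends <- [(p1, pm), (pm, p2)]]
  where
    dists  = [distance p line | p <- points]
    packed = [p | (p, d) <- zip points dists, d > 0]
    pm     = snd (maximumBy (comparing fst) (zip dists points))

-- Split on the line between the leftmost and rightmost points,
-- then solve each side recursively.
quickHull :: [Point] -> [Point]
quickHull [] = []
quickHull points =
  concat [hsplit points ends | ends <- [(minx, maxx), (maxx, minx)]]
  where
    minx = minimumBy (comparing fst) points
    maxx = maximumBy (comparing fst) points

main :: IO ()
main = print (sort (quickHull [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]))
```

For the unit square with one interior point, the result is the four corners; the interior point is discarded.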

The new handling of INLINE pragmas as well as other changes to the Simplifier improved the stability of DPH optimisations (and in particular, array stream fusion) substantially. However, the current candidate for GHC 7.0.1 still contains some performance regressions that affect the DPH and http://hackage.haskell.org/package/repa Repa libraries. To avoid holding up the 7.0.1 release, we decided to push fixing these regressions to GHC 7.0.2. More precisely, we are planning a release of DPH and Repa that is suitable for use with GHC 7.0 for the end of the year, to coincide with the release of GHC 7.0.2. From GHC 7.0 onwards, the library component of DPH will be shipped separately from GHC itself and will be available to download and install from Hackage as for other libraries.

To catch DPH performance regressions more quickly in the future, Ben Lippmeier implemented a performance regression testsuite that we run nightly on the HEAD. The results can be enjoyed on the GHC developer mailing list.

Sadly, Roman Leshchinskiy has given up his full-time engagement with DPH to advance the use of Haskell in the financial industry. We are looking forward to collaborating remotely with him.

Installers

The GHC installers have also received some attention for this release.

The Windows installer includes a much more up-to-date copy of the MinGW system, which in particular fixes a couple of issues on Windows 7. Thanks to Claus Reinke, the installer also allows more control over the registry associations etc.

Meanwhile, the Mac OS X installer has received some attention from Evan Laforge. Most notably, it is now possible to install different versions of GHC side-by-side.

Bibliography

  • [Derivable] "A generic deriving mechanism for Haskell", Magalhães, Dijkstra, Jeuring and Löh, Haskell Symposium 2010, www.dreixel.net/research/pdf/gdmh_nocolor.pdf.