Version 1 (modified by simonpj, 9 years ago)


Suggestions for projects related to GHC

Here are some suggestions for projects related to GHC that could be undertaken by an intern or undergraduate project student.

Projects that should be within reach of a good undergraduate

  • Implement overlap and exhaustiveness checking for pattern matching. GHC's current overlap and exhaustiveness checker is old and inadequate. Furthermore, it takes no account of GADTs and type families.
  • Improve parallel profiling tools. Satnam Singh and Simon Marlow have made a start on some tools for visualising the behaviour of parallel programs, but there is much more to do here, and any improvements will be eagerly adopted by users.
  • Implement some low-level C-- optimisations. During 2009 we expect to have the new C-- code generation route in place, and that will open up new opportunities for doing classic compiler-course optimisations on the imperative C-- code. There is more than routine stuff here, because we can use our dataflow framework to do the heavy lifting. Here are some particular ideas for optimisations we'd like to implement.
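
To illustrate why the exhaustiveness checker needs to take GADTs into account, here is a minimal sketch. The `Expr` type and the two evaluators are hypothetical names chosen for this example; they are not taken from GHC. In `evalBool`, the constructors `IntE` and `Add` can never build a value of type `Expr Bool`, so the two clauses shown cover every well-typed case. A checker that ignores the type index would wrongly consider those constructors missing.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- A small typed expression language (hypothetical, for illustration only).
-- The type index records the type of value each expression evaluates to.
data Expr a where
  IntE  :: Int  -> Expr Int
  BoolE :: Bool -> Expr Bool
  Add   :: Expr Int -> Expr Int -> Expr Int
  If    :: Expr Bool -> Expr a -> Expr a -> Expr a

-- Only BoolE and If can construct an Expr Bool, so these two clauses
-- are exhaustive once the type index is taken into account.
evalBool :: Expr Bool -> Bool
evalBool (BoolE b)  = b
evalBool (If c t e) = if evalBool c then evalBool t else evalBool e

-- Likewise, BoolE can never appear at type Expr Int.
evalInt :: Expr Int -> Int
evalInt (IntE n)   = n
evalInt (Add x y)  = evalInt x + evalInt y
evalInt (If c t e) = if evalBool c then evalInt t else evalInt e

main :: IO ()
main = do
  print (evalInt (If (BoolE True) (Add (IntE 1) (IntE 2)) (IntE 0)))  -- 3
  print (evalBool (If (BoolE False) (BoolE True) (BoolE False)))      -- False
```

A GADT-aware checker would accept both functions as total, and would report any extra clause such as `evalBool (IntE _)` as ill-typed rather than merely redundant.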

More ambitious or less-well-defined projects (PhD students)

Programming environment and tools

  • Make GHC work with GCSpy, a generic heap visualiser tool.

Turning GHC into a platform

Projects aimed at making GHC into a user-extensible plug-in platform, and less of a monolithic compiler.

  • Allow much finer and more modular control over the way in which rewrite rules and inlining directives are ordered. See this email thread.

  • Support dynamically-linked Core-to-Core plug-ins, so that people can add passes simply by writing a Core-to-Core function, and dynamically linking it to GHC. This would need to be supported by an extensible mechanism like attributes in mainstream OO languages, so that programmers can add declarative information to the source program that guides the transformation pass. Likewise the pass might want to construct information that is accessible later. This mechanism could obviously be used for optimisations, but also for program verifiers, and perhaps also for domain-specific code generation (the pass generates a GPU file, say, replacing the Core code with a foreign call to the GPU program). See Plugins for some early thoughts on this.
  • Improve the GHC API, whereby you can import GHC as a library. We make improvements now and then, but it would benefit from some sustained attention. A particular project would be to port the Haskell refactorer HaRE to use the GHC API.
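
The idea of adding a pass "simply by writing a Core-to-Core function" can be sketched with a toy model. Everything below is an assumption made for illustration: `Expr` is a tiny stand-in for Core (GHC's real Core is far richer), and `Pass`, `everywhere`, `constFold`, `simplify` and `runPipeline` are hypothetical names, not part of the GHC API. The point is only that a pass is a function from programs to programs, and that a pipeline is an ordered list to which a plug-in could contribute.

```haskell
module Main where

-- Toy stand-in for Core (hypothetical; not GHC's actual Core data type).
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- A pass is just a function from programs to programs.
type Pass = Expr -> Expr

-- Apply a local rewrite at every node, children first (bottom-up).
everywhere :: (Expr -> Expr) -> Pass
everywhere f (Add a b) = f (Add (everywhere f a) (everywhere f b))
everywhere f (Mul a b) = f (Mul (everywhere f a) (everywhere f b))
everywhere f e         = f e

-- Pass 1: fold arithmetic on literals.
constFold :: Pass
constFold = everywhere step
  where
    step (Add (Lit a) (Lit b)) = Lit (a + b)
    step (Mul (Lit a) (Lit b)) = Lit (a * b)
    step e                     = e

-- Pass 2: remove additions of 0 and multiplications by 1.
simplify :: Pass
simplify = everywhere step
  where
    step (Add e (Lit 0)) = e
    step (Add (Lit 0) e) = e
    step (Mul e (Lit 1)) = e
    step (Mul (Lit 1) e) = e
    step e               = e

-- Run the passes left to right; a plug-in would extend this list.
runPipeline :: [Pass] -> Pass
runPipeline passes e = foldl (\acc p -> p acc) e passes

main :: IO ()
main = print (runPipeline [constFold, simplify]
                          (Mul (Var "x") (Add (Lit 1) (Lit 0))))
-- prints: Var "x"
```

In this model the order of the list matters: `simplify` only fires on `x * 1` after `constFold` has reduced `1 + 0` to `1`, which is a small instance of the ordering question raised for rewrite rules and inlining directives.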


  • Allow unboxed tuples as function arguments. Currently unboxed tuples are second class; fixing this would be a nice simplification.
  • Extend kinds beyond * and k1->k2. With GADTs etc we clearly want to have kinds like Nat, so that advanced hackery at the type level can be done in a typed language; currently it's all effectively untyped. A neat approach would be to re-use any data type declaration as a kind declaration.
  • Extensible constraint domains. Andrew Kennedy shows how to incorporate dimensional analysis into an ML-like type system. Maybe we could do an extensible version of this, so that it wasn't restricted to dimensions. Integer arithmetic is another obvious domain.
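
The suggestion of re-using a data type declaration as a kind declaration can be sketched as follows. This example assumes a later GHC (8.0 or newer) with the `DataKinds` extension, which promotes data types to kinds; it is not a description of the compiler as it stood when this page was written. `Nat`, `Vec`, `vhead` and `vtoList` are names chosen for the example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

import Data.Kind (Type)

-- An ordinary data type. With DataKinds it also serves as a kind,
-- whose type-level inhabitants are 'Z and 'S n.
data Nat = Z | S Nat

-- A list indexed by its length. The index has kind Nat, so an
-- ill-kinded type such as Vec Bool Int is rejected by the compiler.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only non-empty vectors are accepted, so no VNil case is needed.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

-- Forget the length index.
vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs

main :: IO ()
main = do
  let v = VCons 1 (VCons 2 VNil) :: Vec ('S ('S 'Z)) Int
  print (vhead v)    -- 1
  print (vtoList v)  -- [1,2]
```

Without the kind annotation `n :: Nat`, the index would range over all types of kind `*`, which is the "effectively untyped" situation the bullet describes.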

Parallel stuff

  • Experiment with multiprocessor Haskell and/or STM by building and measuring applications, and investigate improvements.
  • Continue work on parallel GC: particularly independent minor-generation collections.

Build system

  • Build a Windows-native version of GHC (using MS tools instead of gcc).