Custom Query (91 matches)


Topic: Bindings (10 matches)

Ticket Summary Status Owner Priority Created Modified
#1598 Improve/rewrite HDBC backends new good 5 years 9 months


Although there are a number of issues with the HDBC backends, the main one is that nearly everything goes through strings, even when drivers offer support for proper wire formats. This makes large inserts or selects much slower than necessary, and makes them consume much more space than necessary as well. Along with this, there's no implementation of batched inserts. And finally, there's the matter of BLOB support.

Additionally, the backends could be simplified. When HDBC was first written, there were no separate bindings packages for most databases, so the backends include the bindings themselves. But now there are separate packages that just provide simple bindings to e.g. sqlite and postgres. It would be very good to switch over to those libraries where possible, and, where such packages don't exist, to split out e.g. the direct ODBC bindings into a separate package as well. This would help maintainability and future development.

HDBC remains by far the most used Haskell database package, and now that John Goerzen has announced plans to switch to BSD licensing, there's no obstacle to its use across the board except for its current limitations. It would be far better for the Haskell community to bring HDBC up to snuff as a single intermediate tool for accessing multiple database backends than to continue down the path of a profusion of competing single-db bindings that higher-level libraries are forced to access individually.

Major points from the above email reproduced below:

  1. I have no Windows development platform. I can't test the releases on Windows. Many Windows users don't have the skill to diagnose problems. These problems do eventually get fixed when a Windows user with that skill comes along -- and I appreciate their efforts very much! -- but it takes longer than it ought to.
  2. The ODBC documentation is monumentally terrible, and the API is perhaps only majestically terrible, and it is not 100% clear to me all the time that I have it right. A seasoned ODBC person would be ideal here.
  3. Issues exist with transferring binary data and BLOBs to/from at least PostgreSQL databases and perhaps others. There appear to be bugs in the backend for this, but BLOB support in the API may need to be developed as well.
  4. Although the API supports optimizations for inserting many rows at once and precompiled queries, most backends do not yet take advantage of these optimizations.
  5. I have received dueling patches for whether foreign imports should be marked "safe" or "unsafe" on various backends. There seems to be disagreement in the community about this one.
  6. Many interactions with database backends take place using a String when a more native type could be used for efficiency. This project may be rather complex given the varying types of column data in a database -- what it expects for bound parameters and what it returns. The API support is there for it though.
#1592 C++ -> Haskell FFI Generator using SWIG new good 5 years 4 years

I'd like to head up the implementation of a basic SWIG module that will properly generate appropriate C wrappers and hsc files that implement C++ classes, inheritance, and method calls appropriately. This would include generating type classes that emulate upcasting and public methods, proper handling of typedefs to correspond in Haskell, generating accessors for public class members, and creating equivalent constant variables in the Haskell code, and finally converting enums into data types.
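As a rough illustration of "type classes that emulate upcasting", the sketch below shows one shape the generated Haskell could take. The C++ classes Shape and Circle (with Circle deriving from Shape) and all Haskell names are hypothetical, and the pointer cast assumes plain single inheritance; this is not what any existing SWIG module emits.

```haskell
import Foreign.Ptr (Ptr, castPtr, nullPtr)

-- Opaque handles for two hypothetical C++ classes; Circle derives from Shape.
newtype Shape  = Shape  (Ptr Shape)  deriving (Eq, Show)
newtype Circle = Circle (Ptr Circle) deriving (Eq, Show)

-- One type class per C++ base class; each subclass gets an instance.
class IsShape a where
  toShape :: a -> Shape

instance IsShape Shape where
  toShape = id

instance IsShape Circle where
  -- Assumption: single, non-virtual inheritance, so the pointer value
  -- can be reused unchanged for the base class.
  toShape (Circle p) = Shape (castPtr p)

-- A wrapper for a Shape method can then accept any subclass handle.
sameObject :: (IsShape a, IsShape b) => a -> b -> Bool
sameObject x y = toShape x == toShape y
```

With this layout, a binding for a public method of Shape only needs an IsShape constraint, and the Haskell type checker rejects handles of unrelated classes.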


#1591 GObject Introspection based static binding generator for Gtk2hs new OK 5 years 5 years

This was my project idea from last year, but I ended up not doing GSoC. It would be useful to have a static binding generator for Gtk2hs using data from gobject-introspection to do most of the work, making it easier to update and maintain the bindings to Gtk+ and company, and easier to add bindings for new GObject based libraries.

#1564 C Bindings to Haskell Values new aluink OK 7 years 9 months

Write a tool to convert Haskell values, structures and functions, to C bindings.


  • Convert Functions
  • Convert Data Structures
  • Typing
  • Documentation

Convert Functions

We need to find a way to convert Haskell functions, or pointers to them, into C function pointers. This would allow many things to follow. Haskell class (HC) instances could have C class (CC) definitions, so calling the HC functions would be as simple as referring to the function pointers stored in the CC. How to deal with general module functions is something to think about still. Something to also think about is how to convert the function pointers back to Haskell functions. This will be needed when passing function pointers to Haskell function calls. Do we want to allow the "lifting" of regular C functions into pointers to Haskell functions so they can be passed to Haskell functions? This would almost seem required with such a system.
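For a single, fixed function type, the standard FFI already covers both directions discussed here through "wrapper" and "dynamic" imports. The sketch below is only meant to show that building block; the names mkCallback, callCallback and roundTrip are made up for the example, and a conversion tool would still have to generate such declarations for every type it exports.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Exception (evaluate)
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- "wrapper" import: turn a Haskell function into a C function pointer.
foreign import ccall "wrapper"
  mkCallback :: (CInt -> CInt) -> IO (FunPtr (CInt -> CInt))

-- "dynamic" import: turn a C function pointer back into a Haskell function.
foreign import ccall "dynamic"
  callCallback :: FunPtr (CInt -> CInt) -> CInt -> CInt

-- Send a Haskell function through a C function pointer and back,
-- apply it, and release the pointer afterwards.
roundTrip :: (CInt -> CInt) -> CInt -> IO CInt
roundTrip f x = do
  fp <- mkCallback f
  y  <- evaluate (callCallback fp x)  -- force the call before freeing
  freeHaskellFunPtr fp
  return y
```

The same "dynamic" import also accepts a pointer to an ordinary C function, which is one way to approach the "lifting" question raised above.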

Convert Data Structures

Structures created with "data" will most likely be GObjects. Those done with "newtype" will most likely have to be duplicated, or maybe we can get away with #defines. If anything, we should probably be able to get away with a #define for "type" definitions. Some standard structures from the Prelude will have to be converted and included in the conversion tool.


Typing

Haskell has a, to put it lightly, strong typing system. We need to think about how to bring that into the C context. How much of it do we want to preserve?


Documentation

Any good tool is useless without documentation. Part of the deliverables of this project will be documentation on usage, how it works, and some man pages.

#1555 Embedded Haskell: calling Haskell from other languages assigned madtroll OK 8 years 4 years

The Haskell FFI, and in particular, GHC, supports "foreign export" of Haskell code to C, and embedding Haskell modules inside C apps. However, this functionality is rarely used, and requires a fair bit of manual effort.

This project would seek to simplify, polish and document the process of embedding and calling Haskell code from C, and languages that use the C ffi (Python, Erlang, C++ ...), providing a canned solution for running Haskell fragments from projects written in other languages.

A good solution here has great potential to help Haskell adoption (as we've seen in Lua), by mitigating the risk of using Haskell in a project.

Related organisations

Depending on the language, we may be able to move this under another language umbrella (Python, Perl, Ruby, ... ?)

Interested mentors

  • Don Stewart

Interested students

  • Ben Kalman
  • Wojciech Cichon
  • Damien Desfontaines
#1547 FFI bridge to Python new OK 8 years 8 years

Python has an impressive number of libraries we might want to utilise from Haskell (e.g. pygments is used in the new hpaste). Also, being able to access Haskell from Python lowers the risk when integrating Haskell into new projects.

This project would seek to develop a Python<->Haskell bridge, allowing the use of libraries on either side from each language.

Python's C-based FFI makes this not too daunting, and the pay off could be quite large. Related projects: MissingPy and hpaste's light python binding.

We may also be able to get this funded under the (many) Python slots.

Projects to contact

This could be funded under the Python project umbrella. Applications should also be submitted to them.

Interested mentors

Don Stewart

Interested Students

Michal Janeczek <janeczek@…>

#1116 Haskell Qt binding generator new none OK 9 years 8 months

The task is to write a program which generates a Haskell binding to the Qt library or parts of it. The program shall do the generation fully automatically, using the Qt header files or similar data as its input.

Interested Students

  • (2007) Soenke Hahn <shahn@…>
#1103 COM interop library and IDL compiler for Haskell new none 9 years 9 years

Quite often someone from the mailing lists is asking how to bind Haskell to COM. HaskellDirect is a tool that allows direct translation from IDL to Haskell. The problem is that currently it is seriously ill. The generated code is often full of both runtime and compile errors. The generated code is using a COM interop library which was developed before the current FFI addendum and doesn't conform to it very well. I have developed an initial replacement for the library that I intend to use in Visual Haskell. The aim of this project would be to implement an IDL to Haskell translator for it.

#72 GStreamer bindings new none 10 years 10 years

GStreamer is a multimedia framework. The goal would be to create Haskell bindings, to allow easy creation of multimedia applications in Haskell.

Interested Mentors

  • ?

Interested Students

  • Simon Sandlund (psi) <simon.sandlund@…>
  • Johannes Woolard (notanotheridiot) <johannes.woolard@…>
  • Henning Günther (der_eq) <h.guenther@…>
#25 GSL Binding assigned johan_silver 10 years 8 years

Extend the GSLHaskell library to cover all the GSL functions. Implement (possibly using additional numerical libraries) important Octave functions not available in the GSL.

Interested Mentors

  • Alberto Ruiz <aruiz@…>

Interested Students

  • Creighton Hogg <wchogg@…>
  • Xiaogang Zhang <zxg_32@…>
  • Alexey Kokovin <alexey.kokovin@…>

Topic: Bioinformatics (2 matches)

Ticket Summary Status Owner Priority Created Modified
#1121 bio library development new none 9 years 9 years

The "bio" package is a collection of useful functionality aimed at bioinformatics. Its development has been largely driven by the immediate needs of the applications that use it, and its current contents reflect this. Ideally, this would develop into a general, broadly scoped bioinformatics library (akin to bioperl and biopython).

The library can be extended in many directions, and this will to a large part be dictated by student interests and background. Some possibilities are:

  • sequence alignment: some rudimentary alignment exists already, but more advanced methods and multiple alignment, phylogeny, etc would be useful. Haskell should also lend itself well to parallel alignment. (In addition, laziness could possibly be used to make T-coffee-type multiple alignments more efficient?)
  • machine learning: this is an important aspect, but as it is applicable to many different domains, I've added a separate ticket. Interested students should probably apply for a pure/general machine learning library (ticket 1127) instead of restricting it to bioinformatics.
  • file formats: currently there's support for a handful of file formats, but there exist many other, usually simple text-based, file formats for which a parser would be nice to have. File format support is likely to be a part of other tasks, but also useful in its own right.
  • suffix arrays: gives time and space efficient searching. Efficient construction is likely to require some low-level hacking, but would improve many algorithms that currently use associative structures. Also generally useful for many text-related problems.

Application-driven library development may be a useful route, so library development as part of solving concrete (biological or otherwise) problems is welcome.

If this sounds interesting, please e-mail me at <ketil at malde dot org> to discuss the details.

#28 Bioinformatics tools new none 10 years 9 years
  1. Further develop RBR, a tool for masking repeats. This can include a) optimizing it (using FastPackedString and/or a new internal data structure); b) extending its functionality.

  2. Develop a tool for annotation/classification of sequences. This would involve computation on and visualization of graphs (experience with the latter would be really great).

Prior bioinformatics knowledge is not a requirement. Please contact me for details.

Interested Mentors

  • Ketil Malde (kzm) <ketil@…>

Interested Students

  • Mark Reitblatt <reitblatt@…>
  • N Rajagopal (IRC: tuxplorer) <rajagopal.n@…>
  • Nicolas Wu <nicolas.wu@…>

Topic: Cabal (5 matches)

Ticket Summary Status Owner Priority Created Modified
#1659 Treat each .cabal component separately during dependency solving new good 9 months 9 months

From an email from Johan Tibell:

Currently the cabal dependency solver treats all the components (i.e. library, test suites, and benchmarks) as one unit for the purpose of dependency resolution. This creates false dependency cycles. For example, test-framework depends on containers and the containers test suite depends on test-framework, which means it should be possible to build the containers test suite by first building the containers library, then test-framework, and finally the containers test suite. This doesn't work today, as the solver treats the whole containers .cabal file as one unit and thus reports a dependency cycle. The dependency solver should treat each component as a separate mini-package for the purpose of dependency solving.
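The false cycle can be reproduced on a toy graph. The sketch below models the containers/test-framework example with hypothetical node names of the form "package:component"; it is not how the cabal solver represents packages, only an illustration of why per-component nodes remove the cycle.

```haskell
import Data.List (nub)

type Node  = String
type Graph = [(Node, [Node])]

-- All nodes reachable from n by following one or more edges.
reachable :: Graph -> Node -> [Node]
reachable g n = go [] (successors n)
  where
    successors x = maybe [] id (lookup x g)
    go seen [] = seen
    go seen (x : xs)
      | x `elem` seen = go seen xs
      | otherwise     = go (x : seen) (successors x ++ xs)

-- A graph is cyclic if some node can reach itself.
hasCycle :: Graph -> Bool
hasCycle g = any (\(n, _) -> n `elem` reachable g n) g

-- Component-level view: each component is its own node.
componentGraph :: Graph
componentGraph =
  [ ("containers:lib",      [])
  , ("test-framework:lib",  ["containers:lib"])
  , ("containers:test",     ["containers:lib", "test-framework:lib"])
  ]

packageOf :: Node -> Node
packageOf = takeWhile (/= ':')

-- Package-level view: merge all components of a package into one node
-- and drop edges that stay inside the package.
packageGraph :: Graph -> Graph
packageGraph g =
  [ (p, nub [ packageOf d | (n, ds) <- g, packageOf n == p
                          , d <- ds, packageOf d /= p ])
  | p <- nub (map (packageOf . fst) g) ]
```

Here hasCycle (packageGraph componentGraph) holds while hasCycle componentGraph does not, which is the situation the ticket describes.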

#1655 Have GHC track the different library "ways" it has installed new OK 9 months 9 months

Today GHC doesn't know if it has installed e.g. both the vanilla and profiling versions of a library. This leads to annoying error messages and an unpleasant experience when users want to profile something. If GHC tracked which versions of a library it had installed, cabal could easily and automatically install the missing profiling versions.

#1653 PVP compliance checker new OK 9 months 9 months

A tool that, given two package tarballs (or similar), tells you which version number component needs to be bumped and why. It should be integrated into the cabal workflow, perhaps into cabal itself.
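The core decision such a checker makes can be sketched on a very simplified model of a package API: a list of exported names with their types as strings. Everything here (the Api type, requiredBump, the message strings) is hypothetical, and the rules only cover the removed/changed/added cases of the PVP; a real tool would have to extract the API from the tarballs and also consider instances, modules and datatype definitions.

```haskell
-- Exported name paired with its type, both as plain strings.
type Api = [(String, String)]

-- Major: the A.B part must increase. Minor: the C part must increase.
data Bump = Major | Minor | Patch deriving (Eq, Show)

-- Which version component must be bumped, and the reasons why.
requiredBump :: Api -> Api -> (Bump, [String])
requiredBump old new
  | not (null breaking) = (Major, breaking)
  | not (null added)    = (Minor, added)
  | otherwise           = (Patch, [])
  where
    removed  = [ "removed: " ++ n | (n, _) <- old, lookup n new == Nothing ]
    changed  = [ "changed type: " ++ n
               | (n, t) <- old, Just t' <- [lookup n new], t' /= t ]
    added    = [ "added: " ++ n | (n, _) <- new, lookup n old == Nothing ]
    breaking = removed ++ changed
```

Returning the reasons alongside the verdict matches the "and why" part of the ticket.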

(Probably too small to be a project on its own, suggested by Johan Tibell.)

#1652 Generalize cabal to work with collections of packages instead of having a single package focus new good 9 months 9 months

Today most cabal commands assume that there's a "current" package that's the focus of most operations. As code bases grow users will end up with multiple-package projects and thus a need to build/test subsets of those packages together. We should generalize most commands (e.g. build, test, bench, and repl) to take a list of targets. You can already today specify local targets (i.e. only run these N test suites in my .cabal file). We'd extend the syntax to allow for

cabal test [[PACKAGE DIR]:[SECTION]]...

For example:

cabal test my-pkg1-dir:test1 my-pkg1-dir:test2 my-pkg2-dir
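A minimal sketch of how such target arguments could be split into an optional package directory and an optional section follows. The Target type and parseTarget are hypothetical names, not part of cabal, and the rule that a bare word names a package directory is an assumption read off the example above.

```haskell
-- A target names an optional package directory and an optional section.
data Target = Target
  { targetDir     :: Maybe FilePath
  , targetSection :: Maybe String
  } deriving (Eq, Show)

-- Split on the first ':'; an empty side means "not given".
parseTarget :: String -> Target
parseTarget s = case break (== ':') s of
  (dir, [])          -> Target (nonEmpty dir) Nothing
  (dir, _ : section) -> Target (nonEmpty dir) (nonEmpty section)
  where
    nonEmpty x = if null x then Nothing else Just x
```

Splitting on the first colon is the simplest choice but would misread Windows paths such as C:\pkg, so a real implementation would need a more careful rule.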

Implementation-wise this means that the working dir (i.e. dist/ today) needs to be able to live outside of some "current" package's dir. It would live in the project "root" (e.g. some directory that contains all the packages that are checked out in source form).

We should also make it more convenient to create these collections of packages. Today you can get half way there by creating a sandbox and manually add-source all your packages. That doesn't scale well to 100-1000s of packages. We should have more clever defaults (e.g. scan subdirs for .cabal files and consider them part of the package set.)

(Project idea first proposed by Johan Tibell)

#1602 Supply dependencies for non-simple cabal build types (eg, Setup.hs) new good 5 years 9 months

Non-standard builds often need to implement specific build steps in Setup.hs, specifying a build-type: Custom in the project cabal file. The user hook system works reasonably well for modifying or replacing the specific sub steps of a build, but *implementing* anything more than the simplest logic in Setup.hs is very difficult.

A great deal of this difficulty stems from the lack of library support for code in Setup.hs. Adding a cabal section that specifies a build-depends: for Custom (and possibly other) build types would allow developers to reuse build code between projects, to share build system modifications on hackage more easily, and to prototype new additions to cabal.

Setup.hs *can* allow arbitrarily complex build system manipulations; however, it is not practical to do so because the infrastructure surrounding Setup.hs doesn't promote code reuse. The addition of dependencies that cabal-install would install prior to building Setup.hs and issuing the build would enable developers to produce custom builds that perform complex operations that utilize the high-quality libraries available on hackage. Furthermore, this would provide the means to prototype (and distribute) new cabal / cabal-install features before integrating experimental code into the stable tools.

I think something akin to the Library section would work for this, e.g.:

        Setup-is: Setup.hs
        Build-Depends:  ...
        Build-tools:    ...
        (I expect that most of the fields applicable to 
         Library would also apply here.)

Topic: Concurrency (3 matches)

Ticket Summary Status Owner Priority Created Modified
#1544 Parallel programming benchmarking and benchmark suite new OK 8 years 8 years

GHC offers many features for multicore, parallel programming, including a parallel runtime, a new parallel garbage collector, a suite of shared memory concurrency abstractions, and a sophisticated parallel strategies library.

What's missing from this set is experience building parallel applications using the parallel runtime, and high level parallelism primitives.

In this project a parallelism benchmark suite, possibly ported from an existing suite, would be implemented, and used to gather experience and bug reports about the parallel programming infrastructure.

Improvements to, say, Control.Parallel.Strategies could result, as would a robust way of comparing parallel program performance between versions of GHC.

Interested mentors

  • Don Stewart
  • Manuel Chakravarty
  • Roman Leshchinskiy

Interested Students

  • Donnie Jones <donnie@…>
#1537 Add NVIDIA CUDA backend for Data Parallel Haskell new bad 8 years 9 months


This ticket proposes to add an NVIDIA CUDA backend for the Data Parallel Haskell extension of GHC.


To quote Wikipedia on CUDA:

"CUDA ("Compute Unified Device Architecture"), is a GPGPU technology that allows a programmer to use the C programming language to code algorithms for execution on the GPU... CUDA gives developers unfettered access to the native instruction set and memory of the massively parallel computational elements in CUDA GPUs. Using CUDA, Nvidia GeForce-based GPUs effectively become powerful, programmable open architectures like today’s CPUs (Central Processing Units). By opening up the architecture, CUDA provides developers both with the low-level, deterministic, and for repeatable access to hardware that is necessary API to develop essential high-level programming tools such as compilers, debuggers, math libraries, and application platforms."

To me, the exciting thing about CUDA is, if not the technology itself, the high availability of CUDA-enabled "graphics" cards. It is estimated that by the end of 2007 there will be over 40,000,000 CUDA-capable GPUs!

Also see the NVIDIA CUDA site.

Data Parallel Haskell

To quote the Haskell Wiki on DPH:

"Data Parallel Haskell is the codename for an extension to the Glasgow Haskell Compiler and its libraries to support nested data parallelism with a focus to utilise multi-core CPUs. Nested data parallelism extends the programming model of flat data parallelism, as known from parallel Fortran dialects, to irregular parallel computations (such as divide-and-conquer algorithms) and irregular data structures (such as sparse matrices and tree structures)..."

The project

It turns out people are actually already working on this. See this thread on haskell-cafe.

I actually think this project is too big for a Google Summer of Code project; it's more suitable for a Master's project, I guess. However, the project can be broken into several sub-projects that each address a different research question. Immediate questions that come to mind are, for example:

  1. How to compile DPH code to CUDA C code?
  2. How to integrate normal Haskell code with CUDA backed DPH code?
  3. <add your question here>

Interested Mentors

  • ?

Interested Students

  • ?
#80 BSPHlib - A parallel programming library based on BSP model new none 10 years 9 years

Implementation of a different approach to parallel programming in Haskell, based on the BSP model and using the MPI library for message passing, instead of PVM. The great advantage of this approach is that the BSP model has an easy efficiency prediction.
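The "easy efficiency prediction" refers to the standard BSP cost model, in which a superstep costs w + h*g + l: the largest local computation w, the largest number of words h sent or received by any processor, the machine's communication gap g, and its barrier latency l. A small sketch of that formula follows; the record and function names are chosen for the example and are not taken from any BSPHlib code.

```haskell
-- Machine parameters of the BSP model.
data Machine = Machine
  { gap     :: Double  -- g: cost per word communicated
  , latency :: Double  -- l: cost of one barrier synchronisation
  }

-- Per-processor measurements for one superstep.
data Superstep = Superstep
  { localWork :: [Double]  -- computation done by each processor
  , hRelation :: [Double]  -- words sent or received by each processor
  }

-- Cost of one superstep: w + h*g + l, taking the maximum over processors.
superstepCost :: Machine -> Superstep -> Double
superstepCost m s =
  maximum (0 : localWork s) + gap m * maximum (0 : hRelation s) + latency m

-- Predicted cost of a whole program is the sum over its supersteps.
programCost :: Machine -> [Superstep] -> Double
programCost m = sum . map (superstepCost m)
```

For example, with g = 2 and l = 10, a superstep with local work [3, 5] and h-relation [1, 4] costs 5 + 2*4 + 10 = 23.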

BSP Reference:

MPI Reference:

Interested Mentors

  • Raul Lopes <raulh@…>

Interested Students

  • Frederico Franzosi <ffranzosi@…>
  • Michele Catasta <mcatasta@…>

Topic: Databases (2 matches)

Ticket Summary Status Owner Priority Created Modified
#35 Add support for optimization features in HaskellDB new none 10 years 10 years

The project is to add support for indexes, prepared statements and other optimization features to HaskellDB.

Interested Mentors

  • Björn Bringert (bringert) <bringert@…>

Interested Students

  • Ivan Tarasov (navi) <Ivan.Tarasov@…>
#33 Port HaskellDB to HList new none 10 years 10 years

HaskellDB currently uses its own record system, which (I believe) is less powerful than HList. The project is to port HaskellDB to use HList instead, making any necessary changes to the interface to make this possible and to fit HList better.

Interested Mentors

  • Björn Bringert (bringert) <bringert@…>

Interested Students

  • Ivan Tarasov (navi) <Ivan.Tarasov@…>
  • Jun Mukai (jmuk) <mukai@…>

Topic: GHC (11 matches)

Ticket Summary Status Owner Priority Created Modified
#1662 Improve documentation and examples in GHC's base new bad 9 months 9 months

Most of the base library functions have at best a one-line description of what they do. These are the most-used functions in Haskell, and the lack of examples and parameter documentation makes it difficult for beginners to learn.

There's tons of low-hanging fruit here. For example, in Data.String, the following Prelude functions have no examples of their usage to show what they do:

  • lines
  • words
  • unlines
  • unwords

It would only take about an hour to add extensive documentation showing all sorts of use cases. The examples can then be tested with the doctest package from
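As an indication of what such documentation could look like, here is a sketch of doctest-style examples for the four functions listed above, attached to a small helper so that the block also runs on its own. The helper name examplesHold is made up for the example; in base the comments would sit on the functions themselves.

```haskell
-- | Examples of the kind proposed for 'lines', 'words', 'unlines'
-- and 'unwords'.
--
-- >>> lines "one\ntwo\n"
-- ["one","two"]
--
-- >>> lines "one\n\ntwo"
-- ["one","","two"]
--
-- >>> words "  hello   world "
-- ["hello","world"]
--
-- >>> unlines ["one","two"]
-- "one\ntwo\n"
--
-- >>> unwords ["hello","world"]
-- "hello world"
examplesHold :: Bool
examplesHold = and
  [ lines "one\ntwo\n"        == ["one", "two"]
  , lines "one\n\ntwo"        == ["one", "", "two"]
  , words "  hello   world "  == ["hello", "world"]
  , unlines ["one", "two"]    == "one\ntwo\n"
  , unwords ["hello", "world"] == "hello world"
  ]
```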

Here are some example revisions adding these sorts of documentation:

This is a good project because there's no way it can fail. Each module in base is a TODO item, and once one of them is completed (independently of the rest), it's done for good. If at the end of the Summer the candidate is only 50% finished, we don't walk away empty-handed -- half of the library is documented. Good examples can be crowd-sourced if needed.

It's also easy to get up to speed with Phabricator for independent changes like this.

As mentioned earlier, the examples can be checked with the doctest program to ensure that their output is correct. At the moment, doctest will not run "out of the box" within the GHC tree (due to some entangled imports). A bonus project along these lines would be to make doctest runnable within the GHC tree, and to automate the test suite for the examples in base.

#1661 Improve performance of native code new not yet rated 9 months 9 months

There are at least two open GHC tickets (#8279 and #8287) to improve the performance of generated code. Both of these already have a good amount of discussion and (old) patches to get you started.

This would be a good topic to follow through to the point where it can be merged into GHC, as every Haskell program compiled to native code would benefit.

#1658 Better tooling for profiling and performance monitoring new good 9 months 9 months

(Suggestion taken from an email by Simon Peyton Jones):

Something like: while a Haskell program is running, start another process, which connects to the running Haskell RTS, switches on the event-monitoring infrastructure, processes the stream of events, and displays useful stuff about it (probably in a web browser). Karolis Velicka (an undergrad) did an 8-week internship in which he made the ghc-events library, which parses the event stream coming from the event-monitoring infrastructure, incremental. That means it can parse events as they arrive, rather than having to wait until the run is completed. So that is a useful piece of the puzzle. Threadscope could in principle be a client of this more incremental API. But Threadscope is hard to build (based on GTK). Maybe something displaying in a web browser would be more easily portable? Peter Wortman is working on generating DWARF annotations in the symbol table. So I don’t have a precise vision of what the project might be, but something about better infrastructure for giving insight into what is going on at runtime.

#1656 Make parallel builds in GHC actually give speed-ups new OK 9 months 9 months

Something is wrong in the parallel build implementation in GHC, as it doesn't even give good speed-ups in embarrassingly parallel cases on few cores (e.g. 2). Someone needs to profile GHC and work on speeding up parallel builds.

#1622 General FFI improvements new julek not yet rated 4 years 9 months

What I would like to do is to improve the integration of C/C++ with Haskell, particularly in calling Haskell from C/C++.

Currently ghc is able to generate stubs to export functions whose arguments are simple types such as CInts into C/C++. The stub generated is always in an extern "C" clause, because ghc does not as yet implement the C++ calling convention as defined in "The Haskell 98 Foreign Function Interface 1.0".

So a first step would be to implement this calling convention to bring it up to speed with the above referenced report. This shouldn't be too hard and mostly involves implementing C++ name mangling conventions.

Next, I would like to extend the stub generation so as to be able to deal with more complex types.

The type systems in C++ and Haskell have many analogous syntaxes that can be easily exploited to provide strong compatibility and interoperability between the two languages.

For example a simple type such as:

data Foo = A | B

Could be implemented as an enum in C/C++:

enum Foo {A, B};
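On the Haskell side, the enum mapping amounts to converting constructors to and from CInt. A minimal sketch, assuming the generated code relies on derived Enum and Bounded instances (the helper names fooToC and fooFromC are hypothetical):

```haskell
import Foreign.C.Types (CInt)

data Foo = A | B deriving (Eq, Show, Enum, Bounded)

-- Derived Enum numbers constructors from 0, matching the C enum above.
fooToC :: Foo -> CInt
fooToC = fromIntegral . fromEnum

-- Values coming back from C are range-checked before conversion.
fooFromC :: CInt -> Maybe Foo
fooFromC n
  | n >= lo && n <= hi = Just (toEnum (fromIntegral n))
  | otherwise          = Nothing
  where
    lo = fooToC minBound
    hi = fooToC maxBound
```

Returning Maybe on the way back guards against C code passing a value outside the enum's range.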

More advanced types that take arguments such as:

data Tree = Node Tree Tree | Leaf

Could be converted to a struct in C/C++:

struct Tree {
    struct Tree* left;
    struct Tree* right;
};

Types that have functions that act on them such as:

data IntContainer = IntContainer Int

getInt :: IntContainer -> Int
getInt (IntContainer a) = a

could have these functions automatically converted to C/C++:

struct IntContainer {
    int a;
};

extern int getInt_hs(IntContainer a);

This also opens up the possibility of exploiting C/C++ name mangling conventions, to allow the _hs postfix I'm suggesting here to be eliminated.

Haskell classes:

class Arithmetic a where
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    (/) :: a -> a -> a

could be implemented using a C++ class with virtual member functions:

class Arithmetic {
    virtual Arithmetic add(Arithmetic a, Arithmetic b);
    virtual Arithmetic mult(Arithmetic a, Arithmetic b);
    virtual Arithmetic neg(Arithmetic a, Arithmetic b);
    virtual Arithmetic div(Arithmetic a, Arithmetic b);
};


All types of single/multiple instancing (i.e. either directly or through requirements of instances) would be implemented using single/multiple inheritance.

Obviously, this example is rather contrived due to the conversion of the function names. The fact that the rules that govern function naming in Haskell are much more permissive than those of C/C++ might cause compatibility issues.

This can be worked around by implementing a similar syntax to that currently used for function imports by the FFI. E.g.:

foreign export ccall "bind" >>= :: CInt -> CInt

Similar to:

foreign import ccall "f" func :: CInt -> CInt

The latter is the current syntax for imports.

The name given for the export would be checked for legality in the namespace of the target language.

Alternatively this could be done in an automated manner using some naming conventions as well as operator polymorphism, but this would probably sacrifice ease of use.

Finally polymorphic Haskell functions/types can be implemented in C++ using templates.

I would like to extend ghc to implement enhanced C/C++ stub generation using the methods described above as well as to generate Haskell stubs which describe the Haskell CType equivalents of the Haskell types defined, functions for conversion between the two and function stubs to convert the types, run the Haskell function and convert back as required.

On top of this I'd like to write C/C++ libraries for dealing with most of the standard Haskell types such as Maybe, Either, etc...

Finally, I'd like to work on ironing out any bugs that remain in the RTS when it is used in "multilingual" situations, as well as improving its performance in this situation.

I think that extending ghc to the level required by "The Haskell 98 Foreign Function Interface 1.0" specification and above would reap significant benefit to the Haskell community.

The improved integration into C/C++ would open the door for this to happen for several other languages and would make Haskell more widespread.

Many Haskell beginners are daunted by the falsely perceived complexity of working with Haskell IO and monads, but love using the massive advantages that the paradigm gives in a non monadic context. Due to this, simplifying the interoperability between Haskell and C/C++ would enable many of these users to stick around for longer and perhaps encourage them to eventually look deeper into the language. This would make the size of the community grow and make the use of Haskell more widespread, potentially reaping benefits for the community at large.

I believe this could be implemented within the time frame given for GSOC.

#1618 Add NUMA-supporting features to GHC new good 4 years 8 months

This ticket is an adaptation of Sajith Sasidharan's mailing list post.

Broadly, the idea is to improve support for NUMA systems. Specifically:

  • Thread pinning: real physical processor affinity with forkOn. We need to be able to pin to specific CPUs if we want to. (Currently, the number passed to forkOn is interpreted modulo the value returned by getNumCapabilities.)
  • Process pinning: when launching processes, we might want to specify a list of CPUs rather than the number of CPUs, so that we can pin a process to a particular NUMA node. Say, a -N [0,1,3] flag rather than -N 3 flag.

  • Memory locality: From a very recent discussion on parallel-haskell, we learned that there is a clear path to improving the storage manager's NUMA support. The hypothesis is that allocating node-local nurseries per Capability will improve performance over the bump-pointer approach in allocate. We might use the NUMA-aware allocation primitives from the Portable Hardware Locality (hwloc) library for this.
  • Logging and tracing: add NUMA-specific event logging and profiling information to support performance analysis and debugging of user-level NUMA-aware programs.
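The current forkOn behaviour mentioned in the thread-pinning item can be observed with a few lines using Control.Concurrent. The helper name pinnedCapability is made up for this sketch; it shows the existing capability-modulo semantics, not the proposed CPU pinning.

```haskell
import Control.Concurrent
  ( forkOn, getNumCapabilities, threadCapability
  , newEmptyMVar, takeMVar, putMVar )

-- Fork a thread with 'forkOn n' and report which capability it was
-- placed on and whether it is pinned there.
pinnedCapability :: Int -> IO (Int, Bool)
pinnedCapability n = do
  gate <- newEmptyMVar
  tid  <- forkOn n (takeMVar gate)  -- the thread blocks, so it stays alive
  info <- threadCapability tid
  putMVar gate ()                   -- let the thread finish
  return info
```

Calling pinnedCapability with a number at or above getNumCapabilities is expected to report that number modulo the capability count, which is a Haskell capability rather than a physical CPU or NUMA node.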

Interested Mentors

Needed! I (Adam Foltzer <acfoltzer@…>) know the outlines of the problems fairly well, but have no experience hacking on RTS code. I would be willing to take on a supporting role, but such experience seems necessary to mentor this project.

Interested Students (Include enough identifying info to find/reach you!)

  • Sajith Sasidharan <sasasidh@…>
#1585 Combine Threadscope with Heap Profiling Tools new good 6 years 6 years

ThreadScope lets us monitor thread execution. The Haskell Heap Profiler lets us monitor the Haskell heap live. HPC lets us monitor which code is executing and when. These should all be in an integrated tool for monitoring executing Haskell processes.

#1584 ThreadScope with custom probes. new good 6 years 6 years

ThreadScope lets us monitor thread execution. The Haskell Heap Profiler lets us monitor the Haskell heap live. HPC lets us monitor which code is executing and when. These should all be in an integrated tool for monitoring executing Haskell processes.

#1582 LLVM optimisation passes / tables next to code new good 6 years 8 months

Project 1: LLVM optimisation passes

1) Clearly identify issues with LLVM produced assembly code in the context of GHC

  • This can be done by examining how it compares to the native code generator on nofib benchmarks
  • You might be able to get some mileage from simply eyeballing the assembly and looking for "obvious" stupidity, like the ESP issue David spotted
  • The result of this part should be a simple set of Haskell test cases, the assembly they produced, what the assembly *should* be (roughly) and perhaps some notes on what might fix it

2) The second part would be to identify the lowest hanging fruit from those things identified in 1) and make changes to the LLVM output / write LLVM optimisations (apparently this is a joy to do, the LLVM framework is very well designed) to fix the issues

Separating the project into two parts like this means that we could get something out of the project even if the student is unable to make significant progress with LLVM itself in the GSoC timeframe. Having a clear description of the problems involved + simple benchmark programs would be a huge help to someone attempting part 2) in the future, or they could serve as the basis for feature requests to the LLVM developers.

Project 2: Tables next to code

My feeling is that this is the more challenging of the two projects, as it is likely to touch more of LLVM / GHC. However, it is likely to yield a solid 6% speedup. It seems there are two implementation options:

1) Postprocessor à la the Evil Mangler (a nice self-contained project, but not the best long-term solution)

2) Modify LLVM to support this feature


Either project could include looking at the impact of using LLVM for link-time optimisation, and a particularly able student might be able to attempt both halves.

See also the thread at and David's thesis

#1114 Sandboxed Haskell new none OK 9 years 7 years

hs-plugins can dynamically compile and load Haskell code, but does not prevent plugins from using unsafePerformIO or unsafeCoerce#. I would like to be able to use hs-plugins to execute untrusted code. As far as I can see, two pieces of infrastructure are missing:

  • A way to ensure that a dynamically compiled program does not use any unsafe primitives.
  • A way to limit the resources (clock cycles and RAM) used by an untrusted computation.
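
As a rough illustration of the second point only, wall-clock time can already be bounded with timeout from System.Timeout. This is a sketch of an adjacent technique, not the proposed infrastructure: it does not limit RAM or clock cycles, it does nothing about unsafe primitives, and a tight loop that never allocates may not be interruptible. The names runWithin, evalWithin and demo are made up for this example.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- | Run an IO action, giving up after the given number of microseconds.
-- Returns Nothing if the limit was hit.
runWithin :: Int -> IO a -> IO (Maybe a)
runWithin = timeout

-- | Evaluate a pure value to weak head normal form under a time limit.
-- Only the outermost constructor is forced; deeper laziness escapes the limit.
evalWithin :: Int -> a -> IO (Maybe a)
evalWithin micros x = runWithin micros (evaluate x)

-- | Demo: a two-second sleep is cut off after 50 milliseconds.
demo :: IO (Maybe ())
demo = runWithin 50000 (threadDelay 2000000)
```

A real sandbox would also need a memory limit and a guarantee that the plugin cannot mask or catch the timeout exception, which is where compiler support comes in.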

It seems to me that the best way to achieve the first goal is to make GHC keep track during compilation of which functions are safe (do not call unsafe primitives, or, I suppose, are declared to be safe by a pragma in a trusted library). However, I know only very little about GHC internals.

One project I want to use this for would be a web server that lets users create Haskell-based web applications without having to set up their own Unix account etc. If this project is accepted, I'll build a prototype of this that can be used to test "sandboxed haskell" (no matter whether the project ends up being assigned to me or somebody else).

Interested Mentors

  • ?

Interested Students

  • Benja Fallenstein (benja.fallenstein@…) -- I'm not familiar with the internals of GHC at this point, but I'm willing to learn. :-) A knowledgeable mentor would be good if I end up doing this project.
  • Brandon Wilson (xelxebar) <bmw.stx@…>
#70 integrate searchpath and ghc new none 10 years 10 years

Searchpath currently operates as a wrapper around ghc and ghci. This means that a :r in ghci does not attempt a full reload from the internet. It would be nice if :r reloaded needed modules from the internet. For more info on searchpath, see

Topic: Games (1 match)

Ticket Summary Status Owner Priority Created Modified
#1672 Improve Nomyx new bad 9 months 9 months

Nomyx [1] is a unique game where you can change the rules. It's the first full implementation of a Nomic game [2] on a computer. It is based on a Haskell DSL that lets the players submit new rules while playing, thus completely changing the behavior of the game over time.

The game could be seen as a platform where beginners in Haskell could learn the language while playing an actual multiplayer game. Experts on the other hand could participate in great matches that show their Haskell skills.

However, the game is still in development, and in particular the design space of the DSL needs to be explored to make it easier to learn and use. Ideally, even non-Haskellers should be able to create some basic rules. This is the objective of this proposed GSoC.

[1] [2]

Topic: Graphics (4 matches)

Ticket Summary Status Owner Priority Created Modified
#1597 Platform neutral GUI leading to ... hackage 3 new OK 5 years 5 years

It is hard on Hackage to get an overview of nearly 3000 libraries: how popular a library is, examples of how to use it, what has changed, etc. Some of this is planned to be solved in Hackage 2, which is very promising. I would like to extend this by displaying libraries like skyscrapers in a city as flow networks. I have developed several libraries to prepare for this, see . I have found a very general way to construct 3D shapes based on symmetry classes in my diploma thesis (this also tackles this reddit proposal: My plan is to change Sourcegraph ( to parse all packages and display them with WebGL or a 3D engine.

#1550 wxHaskell improvements new OK 8 years 9 months

wxWidgets is a funded project this year. Coordinate with them to work on wxHaskell improvements. This needs support from the wxHaskell team.

Things which look like they might be good summer of code projects:

  • Type-safe XRC support (and Document)
  • Fancy vector graphics support on top of wxGraphicsContext

Things we want for wxhaskell 0.12

  • better handling of extensions
  • Merge WXSOEGraphics code.

Daan's ideas from

  • Add new widget abstractions to the WX library
  • Portable resources (see the page)
  • Create a good tree control / list control abstraction (perhaps the higher level libraries like autoforms?)

Other ideas

  • Better Cabalization (this is not just a wxHaskell project; we should collaborate with the Cabal developers)
  • Add COM (ActiveX) support for the Windows platform

Todo: what else needs doing with wxHaskell?

Interested mentors

  • Kido Takahiro
#77 3D GUI system and widget library new none 10 years 9 years

A framework for writing 3D, skinnable GUIs in Haskell (similar to some of the projects at The system will be intended for use in some games, CAD/CAM applications, 3D art tools, and any other program that needs a heavy-duty GUI in a primarily 3D environment. It could also potentially be used (if only for inspiration) in an OSS response to the current trends in 3D desktop composition engines.

The goals for the Summer of Code 2006 will be:

  • Full description of design goals
  • Overall design (relating GUIs, messages, documents, and the IO monad)
  • Window composition engine
    • Relative (pixel-agnostic) sizing
    • Absolute (pixel-sensitive) sizing
  • Common widgets, and a default skin for each of those widgets
    • Text controls
      • Label
      • Edit box
      • Multiline edit box
    • Buttons
      • Standard button
      • Radio button
      • Check box
    • Sliders
      • Standard slider (vertical/horizontal)
      • Progress bar (horizontal)
    • Compound widgets
      • List boxes
        • Simple list box
        • Columnar list box
      • Scroll bar
      • Scrolling window

Interested Mentors

Interested Students

  • Ryan Trinkle <ryant5000@…>
#55 Improvements to the INblobs tool developed at U. Minho new none 10 years 10 years

We propose the following improvements to the INblobs tool developed at U. Minho:

  • To fix graphical portability problems of INblobs. This implies a few tricks with wxHaskell and the wx toolkit.
  • Adding new features to the tool, namely:
    • Archetypes
    • Better layout after application of reduction
    • Automatic alignment of nodes
    • A macro feature
    • A graphical editor for symbols (agent shapes)
    • Better verification of Interaction Net systems

Interested Mentors

  • Jorge Sousa Pinto <jsp@…>

Interested Students

  • ?

Topic: Haddock (1 match)

Ticket Summary Status Owner Priority Created Modified
#1663 Hyperlinked sourcecode for documentation new not yet rated 9 months 9 months

Extend Haddock or HsColour to create hyperlinked, coloured source files for browsing, similar to how Agda can render Agda source[1].

Clickable source code is useful as documentation and as an exploration tool to learn a library but also to learn library design. The ambition is to make all source on Hackage browsable in this manner.

Time permitting this proposal could also include other enhancements to Haddock, such as links to instance declarations in Haddocks.


Topic: JHC (2 matches)

Ticket Summary Status Owner Priority Created Modified
#1108 .NET CLR back end for jhc new none 9 years 9 years

jhc currently has front-end support for .NET: it accepts hugs98.NET-compatible foreign declarations and allows selection of .NET as a target. There is just no .NET back end.

One can be written to convert either core or grin to .NET. jhc core is similar enough to GHC core that there might be relevant literature, but grin -> imperative code would be a much shorter hop.

Interested mentors

  • John Meacham

Interested students

  • Aaron Tomb <atomb@…>
#43 JHC Hacking new none 10 years 9 years

Implement a new feature in JHC, depending on interests and skills. e.g. MPTC+FD, fast arrays, STM, Template Haskell, open datatypes, nice records, or something that you are interested in.

Interested Mentors

  • Einar Karttunen (musasabi) <ekarttun@…>

Interested Students

  • Sven Moritz Hallberg (pesco) <pesco@…>
  • Mathieu Boespflug (mboes) <0xbadcode@…>
  • ?

Topic: Networking (1 match)

Ticket Summary Status Owner Priority Created Modified
#1117 SNMP MIB compiler using Parsec new none 9 years 9 years

Write an SMIv1/SMIv2 SNMP MIB compiler using the Parsec parser combinator library.


Topic: Systems (3 matches)

Ticket Summary Status Owner Priority Created Modified
#1548 xmonad: compositing support new OK 8 years 7 years

xmonad is a tiling window manager for X11, and a popular Haskell open source project with many users.

This project would seek to integrate "compositing" support into xmonad, creating the first compositing tiling window manager.

Compositing is the use of 3D hardware acceleration to provide window effects, such as in the Apple "expose" functionality, and in Compiz, a Unix window manager supporting compositing effects.

By reusing the compositing libraries provided by "compiz", binding to them from Haskell, and integrating compositing hooks into xmonad, we could hope to write effects in Haskell, for the window manager.

This would make xmonad unique: the only tiling window manager with support for compositing. Additionally, a Haskell EDSL for describing effects would be of general utility. The result would be a novel UI interface, and would investigate how the user interface for a tiling wm can be enhanced via compositing.

The initial goal would be to bind to the basic library, providing a simple effect (such as shadowing), and then extend the supported effects as necessary.

Related material

The xmonad feature ticket for this:

Organisations that might be interested

  • Portland State University

Discussion will take place on the xmonad@ lists, in order to prepare good submissions to these groups.

Interested mentors

  • Don Stewart
#78 Graphical type analysis new none 10 years 10 years

Type errors can be frustrating to beginner and intermediate programmers in Haskell. Indeed, with its powerful type system and baffling compiler output, finding type errors in programs compiled with GHC can be extremely tedious and time-consuming. As a teaching assistant, I have seen first-hand the exasperation of students trying to find their errors. I propose to let the programmer see, graphically in Yi, which types have been assigned to which parts of the code. A user would simply have to point to the piece of code he wants to scrutinise and all the type information would be instantly displayed. Furthermore, the exact location of the typing incongruities would be displayed in red. This would involve:

  • Implementing the Haskell type, kind and module systems
  • Creating an interface with which the programmer can analyse his code using gtk2hs
  • Incorporating it in Yi
  • Time permitting, extending it to the GHC type system

Interested Mentors

  • Don Stewart <dons@…>

Interested Students

  • Jacques Le Normand <jlenor1@…>
#76 Darcs project management web application new none 10 years 10 years

A web-based project management application for Darcs that would honor its distributed and patch-based nature.

There exist projects to make Trac work with Darcs, but the suggester of the project (and the first interested student) feels that a system that supports every aspect of Darcs (unrecord and unpull), and is designed to match Darcs's nature, would be beneficial.

Implementation language would not necessarily be Haskell. Ruby was suggested.

Interested Mentors

  • ?

Interested Students

  • Eivind Uggedal (rfsu) <uggedal@…>
  • Juan José Olivera Rodríguez <jotajota@…>

Topic: Tools (7 matches)

Ticket Summary Status Owner Priority Created Modified
#1675 A substitution stepper new freinn not yet rated 9 months 9 months

This idea has been in my head because people learning Haskell (like me) usually have some trouble understanding substitution step by step. It is the third wish in


foldr (+) 0 [1, 2, 3, 4]

foldr (+) 0 (1 : [2, 3, 4])

1 + foldr (+) 0 [2, 3, 4]

1 + foldr (+) 0 (2 : [3, 4])

1 + (2 + foldr (+) 0 [3, 4])

1 + (2 + foldr (+) 0 (3 : [4]))

1 + (2 + (3 + foldr (+) 0 [4]))

1 + (2 + (3 + foldr (+) 0 (4 : [])))

1 + (2 + (3 + (4 + foldr (+) 0 [])))

1 + (2 + (3 + (4 + 0)))

1 + (2 + (3 + 4))

1 + (2 + 7)

1 + 9


Comparing this with foldl immediately shows the viewer how they differ in structure:

foldl (+) 0 [1, 2, 3, 4]

foldl (+) 0 (1 : [2, 3, 4])

foldl (+) ((+) 0 1) [2, 3, 4]

foldl (+) ((+) 0 1) (2 : [3, 4])

foldl (+) ((+) ((+) 0 1) 2) [3, 4]

foldl (+) ((+) ((+) 0 1) 2) (3 : [4])

foldl (+) ((+) ((+) ((+) 0 1) 2) 3) [4]

foldl (+) ((+) ((+) ((+) 0 1) 2) 3) (4 : [])

foldl (+) ((+) ((+) ((+) ((+) 0 1) 2) 3) 4) []

(+) ((+) ((+) ((+) 0 1) 2) 3) 4

1 + 2 + 3 + 4

3 + 3 + 4

6 + 4


Each step in this is a valid Haskell program, and it’s just simple substitution.

This would be fantastic for writing new algorithms, for understanding existing functions and algorithms, writing proofs, and learning Haskell.
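
To give a feel for what such a tool would print, here is a deliberately tiny sketch. It is not the proposed stepper: it only handles integer literals and (+), and it builds the final shapes of the two folds directly instead of unfolding foldr and foldl themselves. All names (Expr, render, step, steps, foldrShape, foldlShape) are invented for this illustration.

```haskell
-- | A tiny symbolic expression language: literals and additions.
data Expr = Lit Int | Add Expr Expr

-- | Render with explicit parentheses so the nesting stays visible.
render :: Expr -> String
render (Lit n)   = show n
render (Add a b) = "(" ++ render a ++ " + " ++ render b ++ ")"

-- | Perform one reduction step: rewrite the leftmost addition whose
-- operands are both literals. Returns Nothing for a fully reduced value.
step :: Expr -> Maybe Expr
step (Lit _)               = Nothing
step (Add (Lit a) (Lit b)) = Just (Lit (a + b))
step (Add a b)             =
  case step a of
    Just a' -> Just (Add a' b)
    Nothing -> Add a <$> step b

-- | The expression followed by every intermediate form down to the value.
steps :: Expr -> [Expr]
steps e = e : maybe [] steps (step e)

-- | The nesting produced by foldr (+) 0 and foldl (+) 0 respectively.
foldrShape, foldlShape :: [Int] -> Expr
foldrShape = foldr (\x acc -> Add (Lit x) acc) (Lit 0)
foldlShape = foldl (\acc x -> Add acc (Lit x)) (Lit 0)
```

Running mapM_ (putStrLn . render) (steps (foldrShape [1, 2, 3, 4])) prints the right-nested chain collapsing from the inside, while the foldlShape version collapses from the left. A real stepper would also have to substitute function definitions, which is the hard part of the project.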

#1654 Make the GHC performance monitoring build bot production ready new good 9 months 8 months

Joachim Breitner created a build bot ( that runs some of the nofib suite every so often. It gathers the results and creates a dashboard showing how performance on the benchmarks changes over time. It needs to be productionized so it runs on a machine maintained by the Haskell infrastructure team. It also needs to reliably detect regressions and email ghc-dev@ when those happen. The latter involves setting up emailing infrastructure and possibly tweaking nofib until the benchmarks are reliable (i.e. have long enough runtimes to avoid noise).

#1610 Cabal support for the UHC JavaScript backend new good 4 years 9 months

We recently improved the UHC JavaScript backend[1] and showed that we can use it to write a complete front end for a web application[2]. A summary and demo of our results is available online[3].

Currently, however, it is still quite inconvenient to compile larger UHC JS projects, since Cabal support for UHC's different backends is limited. The programmer currently manually specifies include paths to the source of the used modules. Improving this situation is the goal for this GSoC project: make it possible to type cabal configure && cabal build and find a complete JS application in the project's dist folder.

Successful completion of this project would go a long way towards making the UHC JS backend more usable and as a result, make Haskell a practical language to use for client-side programming.

In solving this problem, one will have to think about how to deal with external Haskell libraries (UHC compiles JS from its internal core representation, so storing pre-compiled object files won't work in this case) and perhaps external JS libraries as well. One will also need to modify Cabal so that it becomes possible to select a specific UHC backend in your cabal files. Ideally, only one cabal file for an entire web application is needed; for both the front-end and backend application.

An additional goal (in case of time to spare) would be the creation of a UHC JS-specific Haskell Platform-like distribution, so that programmers interested in using this technology can get started right away. Of course, such a distribution would have to be installed alongside the regular Haskell Platform.

As far as mentoring is concerned, I (Jurriën) might be able to help out, but since the above project would revolve more around Cabal and not so much around the UHC internals, there might be more suitable mentors out there.


#1604 Embedding Haskell in C++: The FFI upside-down new OK 5 years 9 months

The idea is to follow in the footsteps of tickets #1555 and #1564 and make accessing Haskell functions from C++ as convenient as possible, by creating a tool that easily exposes Haskell functions to C++.


The main rationale behind this is that pure code should be called from impure code. This is a paradigm already present in Haskell in the form of the IO monad. Thus Haskell should be invoked from an external (impure) language, not the other way around.

The FFI is mostly used in the opposite direction, because it is usually employed to create wrappers for external libraries.


A big part of exposing Haskell functionality to another language is marshalling data structures between the two languages. We could use the following system to derive a C++ representation for Haskell datatypes:

To keep things simple initially, one could start with mutually recursive datatypes with type parameters (no nested datatypes or functions with constructors). For these there is a straightforward (automatic) translation from Haskell types into C++ types:

  • Datatypes become virtual classes
  • Constructors become subclasses of their datatype
  • Constructor parameters become class-variables
  • Type parameters become templates

Standard Haskell types such as integers/lists/maybe/map could be treated separately to create native C++ representations.

For functions we want exposed to C++, a wrapper function is created which calls the original Haskell function (via C). This is also where the marshalling of arguments/results takes place. Initially only first-order functions will be supported. A system to support higher-order functions (which implies functions have to become first-class in C++) may be possible, but this would be future work.


A possible use case for this system would be a code editor. The (impure) GUI can be written in C++, while the text editing functions and, for example, parsing for code highlighting can be provided by pure Haskell functions.

#1599 Improve several areas of EclipseFP new OK 5 years 9 months

Now, EclipseFP supports working with both Haskell and Cabal files, compiling and running the code. However, it would be great to make it a complete environment as for Java or C++. Some ideas would be:

  • Use the framework in Eclipse so that Haskell programmers can run QuickCheck, SmallCheck and HUnit tests just as Java programmers can run JUnit tests;
  • Allow programmers to run their executables under a profiling environment and then show the results. This plug-in may be based on the ones for GProf or OProfile in the Linux Tools Eclipse project;
  • Try to provide more code completion or IntelliSense-like things;
  • Allow the user to quickly access information about functions (for example, if you Shift+click on a function name, the Haddock doc may appear)

Interested students

  • Alejandro Serrano (serras)
  • Saurabh Kumar <saurabh.catch@…>
#1111 Bring Hat tracing back to life new none OK 9 years 8 years

The Hat tracer for Haskell is a very powerful tool for debugging and comprehending programs. Sadly, it suffers from bit-rot. It was written largely before the Cabal library packaging system was developed. It is difficult to get any non-trivial program to work with Hat if the program uses any pre-packaged libraries outside the Haskell 98 standard.

The aims of this project would be

(a) to fix Hat so that it adequately traces hierarchical library packages

(b) to integrate support for tracing into Cabal, so a user can simply ask for the 'tracing' version of a package in addition to the 'normal' and 'profiling' versions.

(c) to develop a "wrapping" scheme whereby the code inside libraries does not in fact need to be traced at all, but instead Hat would treat the library functions as abstract entities.

Interested Mentors

  • Malcolm Wallace

Interested Students

#1104 GuiHaskell, to supersede WinHugs new none 9 years 9 years

I started a GuiHaskell project some time ago (, the goal of this project is to write a replacement for WinHugs but with several important enhancements:

  • Written in Haskell, not C
  • Supports Hugs/GHCi/GHC/Yhc/nhc etc. as evaluators
  • Runs on all platforms, Windows/Linux/Mac at least

The project was blocked on a missing Gtk2Hs feature, which is now present. The initial prototype is capable of executing code as a proof of concept, but the vast majority of the code has not yet been written.

The project would aim to achieve at the very least:

  • Add support for all compilers
  • Hook up appropriate interaction with external editors, integration with the tools, toolbar buttons, etc
  • Supersede WinHugs

And it would be cool if the student looked towards adding the following features (at least a few would be expected, all would be great!):

  • One click profiling, recompilation using appropriate flags, spawning viewing tools
  • Cabal integration, making Cabal package creation/compilation/installation a button press
  • Hat debugging
  • Hoogle support
  • Haddock/HsColour pretty code generation
  • Add extra tools here

The aim is NOT to write a Haskell text editing widget that you can use to edit your code, but to do everything else that might be handy in an IDE.

Neil Mitchell offers to mentor (and wrote the current version of WinHugs, and the GuiHaskell prototype)

Interested Students

  • Asumu Takikawa (shimei) <shimei+soc@…>
  • Sascha Boehme <sascha.boehme@…>
  • Andreas Voellmy <andreas.voellmy@…>

Topic: Web Development (5 matches)

Ticket Summary Status Owner Priority Created Modified
#1671 PureScript Improvements new good 9 months 9 months

PureScript is a relatively large project written in Haskell, which is increasing in popularity and starting to see commercial use. The PureScript community has plenty of possible projects for interested students, and we have a very active developer community, which would be able to provide lots of guidance.

To avoid splitting each possible project into a separate Trac item, I have created a more detailed set of project descriptions on the PureScript wiki, here:

Initial feedback on #haskell-gsoc IRC seems to suggest that some of the later items in the list might be too much work for a single summer, but it should be possible to break most of them down into manageable pieces, and get at least some good results by the end of the summer.

#1621 Snap: Implement type-safe URL support for the Snap Web Framework new mightybyte OK 4 years 9 months

Snap aims to be a fast, reliable, easy-to-use, high-quality and high-level web development framework for Haskell. More information on Snap can be found at

Type-safe URLs encode all accessible paths of a web application via an algebraic datatype defined specifically for the web application at hand. All URLs within the application, then, are generated from this datatype, which ensures statically that no dead links are present anywhere in the application.

Furthermore, when the user-visible URLs need to change for various (possibly cosmetic) reasons, the change can be implemented centrally and propagated to all the places of use within the application.
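
The idea in the two paragraphs above can be sketched without any framework. The example below is not the web-routes API and not a proposed Snap design; Route, toPath, renderRoute and fromPath are hypothetical names for an imaginary application, and URL escaping is ignored.

```haskell
import Data.List (intercalate)

-- | Every page of the (imaginary) application is a constructor.
data Route
  = Home
  | UserProfile Int
  | Article String
  deriving (Eq, Show)

-- | The single place where user-visible path segments are decided.
toPath :: Route -> [String]
toPath Home              = []
toPath (UserProfile uid) = ["user", show uid]
toPath (Article slug)    = ["article", slug]

-- | Links are only ever produced from a Route, never from a raw string.
renderRoute :: Route -> String
renderRoute r = "/" ++ intercalate "/" (toPath r)

-- | Inverse of 'toPath', used when dispatching an incoming request.
fromPath :: [String] -> Maybe Route
fromPath [] = Just Home
fromPath ["user", n]
  | [(uid, "")] <- reads n = Just (UserProfile uid)
fromPath ["article", slug] = Just (Article slug)
fromPath _ = Nothing
```

Renaming "user" to "member" is then a one-line change in toPath and fromPath, and a link to a page that does not exist fails to compile because there is no constructor for it.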

The web-routes package on Hackage seems to provide a framework-agnostic way to handle type-safe URLs. However, it might be worthwhile to discuss if this is the right choice for Snap.

The scope of this project would include the following:

  • Evaluation of web-routes and development of an idiomatic solution for underlying type-safe URL support for Snap, both at "snap-core" and "snap" package levels. The API design should specifically consider how all the pieces will come together given the highly flexible snaplet infrastructure that Snap provides to achieve modularity in web applications.
  • Development of Haskell combinators/helpers, so that URLs for various pages can be generated programmatically anywhere in the application.
  • Integration with Heist template system so that URLs for various pages, including all necessary parameters, can easily be generated in templates anywhere in the application.
  • Development of test cases to be added to the appropriate snap test-suites.
  • Implementation of the type-safe URL system for the Snap website, as an initial use case study and verification of the design.
  • The project may involve Template Haskell to remove obvious boilerplate.

Interested Mentors

Doug Beardsley

Greg Collins

Ozgun Ataman

#1120 XML Schema implementation new none OK 9 years 4 years

XML Schema ( is a format to define the structure of XML documents (like DTD, but more expressive). Since it's recommended by the W3C, many XML standards use it to specify their format. Some XML technologies, like WSDL ( use it to define new data structures.

This makes XML Schema a door-opener for the implementation of other XML technologies like SOAP. The project could include the following:

  • Implementing an XML Schema parser (using for example HaXML)
  • Building a tool for XML Schema -> Haskell code transformation (like the DtdToHaskell tool included in HaXML)
  • Creating a verifier that checks if a document conforms to an XML Schema

Interested Mentors

Malcolm Wallace

Interested Students

Vlad Dogaru <ddvlad*REMOVETHIS*@…>

Saurabh Kumar <saurabh.catch@…>

Marius Loewe <mlo@…>

David McGillicuddy <dmcgill9071@…>

#53 Tie HAppS with SQL databases new none 10 years 10 years

HAppS is a web framework offering transactional semantics. Tying it to SQL transactions is an interesting exercise requiring some knowledge of Haskell database bindings (implementing two-phase commit for different databases) and extending HAppS MACID to handle the two-phase commit.

Interested Mentors

  • Einar Karttunen (musasabi) <ekarttun@…>

Interested Students

  • Robert Zinkov <rob@…>
#19 Continuations-based DSL on top of HAppS new none 10 years 10 years

Do you have a vision of how to do better than WASH? Integrate continuation-based interaction with the client, or use something like Functional Forms for the interaction. How best to interact with XML, etc.? Other HAppS-related projects are also possible.

Interested Mentors

  • Einar Karttunen (musasabi) <ekarttun@…>

Interested Students

  • ?

Topic: misc (34 matches)

Ticket Summary Status Owner Priority Created Modified
#1674 Implement SplitMix for System.Random new good 9 months 8 months

The recent splitMix work of Steele, Lea, and Flood ( describes a splittable algorithm for random generation that "is an object-oriented version of the purely functional API used in the Haskell library for over a decade, but splitMix is faster and produces pseudorandom sequences of higher quality; it is also far superior in quality and speed to java.util.Random, and has been included in Java JDK8 as the class java.util.SplittableRandom."

Simon Peyton Jones has noted: "Moreover, splitMix has a published paper to describe it, which is massively better (in a tricky area) than an un-documented pile of code. I knew that Guy was working on this but I didn't know it was now published. Excellent! I would love to see this happen."

The goal of this project would be, guided by the paper and the working code available in Java JDK8's java.util.SplittableRandom, to bring the algorithm back to an efficient Haskell implementation, suitable for use as an instance of RandomGen [1]. We could then seek to replace the current StdGen with this better implementation.
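
As a starting point, the core of the algorithm fits in a few lines. The sketch below is written from memory of the JDK8 class; the mixing constants, the shift amounts and the 24-bit-transition threshold are assumptions that must be checked against the paper and the Java source before being relied on. The names SMGen, mkSMGen, nextWord64 and splitSMGen are invented here, and no RandomGen instance is given, so the block depends only on base.

```haskell
import Data.Bits (popCount, shiftR, xor, (.|.))
import Data.Word (Word64)

-- | Generator state: the current seed and an odd increment ("gamma").
data SMGen = SMGen !Word64 !Word64

-- | Default increment (assumed to match the JDK's golden-ratio constant).
goldenGamma :: Word64
goldenGamma = 0x9e3779b97f4a7c15

-- | Output mixing function (constants assumed, see the note above).
mix64 :: Word64 -> Word64
mix64 z0 =
  let z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
      z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
  in  z2 `xor` (z2 `shiftR` 31)

-- | Derive a new odd gamma, rejecting values with too few bit transitions.
mixGamma :: Word64 -> Word64
mixGamma z0 =
  let z1 = (z0 `xor` (z0 `shiftR` 33)) * 0xff51afd7ed558ccd
      z2 = (z1 `xor` (z1 `shiftR` 33)) * 0xc4ceb9fe1a85ec53
      z3 = (z2 `xor` (z2 `shiftR` 33)) .|. 1
  in  if popCount (z3 `xor` (z3 `shiftR` 1)) < 24
        then z3 `xor` 0xaaaaaaaaaaaaaaaa
        else z3

mkSMGen :: Word64 -> SMGen
mkSMGen seed = SMGen seed goldenGamma

-- | Advance the seed by gamma and mix it to produce one 64-bit output.
nextWord64 :: SMGen -> (Word64, SMGen)
nextWord64 (SMGen seed gamma) =
  let seed' = seed + gamma
  in  (mix64 seed', SMGen seed' gamma)

-- | Split into two generators: the original advanced by two steps, and a
-- fresh one whose seed and gamma are derived from those two steps.
splitSMGen :: SMGen -> (SMGen, SMGen)
splitSMGen (SMGen seed gamma) =
  let seed1 = seed + gamma
      seed2 = seed1 + gamma
  in  (SMGen seed2 gamma, SMGen (mix64 seed1) (mixGamma seed2))
```

Everything is pure arithmetic on Word64, which wraps on overflow as the algorithm requires. Making this an instance of RandomGen would mostly be a matter of converting the Word64 output to the Int range the class expects.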


#1673 Out-of-process Template Haskell new good 9 months 9 months

As described in this mailing list thread:

Template Haskell is not available to cross compilers at the moment. This is because it is only available in a "stage 2" compiler (i.e. self-hosted). Meanwhile, cross-compilation is performed by a "stage 1" compiler -- running on a host to produce executables for a target.

The thread describes a solution developed for ghcjs that allows a stage 1 compiler on a host to _drive_ a stage 2 compiler on a target only for the expansion of Template Haskell splices.

"The ultimate goal would be a multi-target compiler, that could produce a stage2 compiler that runs on the host but can produce binaries for multiple targets of which the host its running on is one."

There remains a fair amount of work to integrate this solution into the main GHC code base, but it would be very worthwhile.

This is a relatively advanced project, but for the right student it could be a great exercise.

Luite, who developed the solution for ghcjs, has offered to mentor.

#1670 Haddock Improvements new good 9 months 9 months

A catch-all ticket for Haddock work.

One idea is to allow Haddock to generate not only the full correct signature for functions, but also "example" type-specialized versions, perhaps hidden in a mouseover -- as discussed at

Another general area to look at (not yet a full proposal) is moving haddock from "documented modules" towards "a system for building documentation". Some ideas would be better doctest integration, improving inclusion of images and such, perhaps a plugin to generate dependency graphs and visualizations. Also, rather than cutting haddock over to markdown, allowing "markdown-only" documentation, not necessarily coupled to the source of particular modules, to be bundled and also provided in the generated html. Similarly, inclusion of a changelog file could be useful.

Generally, more ideas could be derived from looking at the state-of-the-art in other languages' package documentation systems and looking for things to borrow.

#1669 (not so) Silly shape-dependent representation tricks new not yet rated 9 months 9 months

Unary natural numbers (also known as Peano naturals) have the stigma of slowness associated with them. This proposal suggests to rehabilitate Nat (and also its ornamented cousin, the linked list) as an efficient data structure on modern CPUs. I'll add more ideas later (possibly as comments), but for now let's consider a bare bones case…

data Nat where
  Z :: Nat
  S :: !Nat -> Nat

is the type of strict unary naturals. Their values can only be of the form

S (S Z)
S^n Z

I propose to introduce an analysis that detects this shape of constructors and stores the (unsigned) machine integer n instead of a linked, heap-allocated chain. (Some care must be taken to deal with overflows.) The general principle is to count the known S constructors. GHC already performs constructor tagging, and my suggestion might be regarded as a generalization of it to more than 3 (2 on 32-bit architectures) tag bits. The real benefit comes from algorithms on Nat, such as

double :: Nat -> Nat
double Z = Z
double (S m) = S (S (double m))

Each S above could be a batch pattern match of S^n, thus arriving at the efficient double S^n = S^(2*n) if we set Z = S^0. This is (basically) a shift-left machine instruction by one bit.
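
The correspondence can be checked at the library level, without any compiler support. In the sketch below, toWord plays the role of the proposed representation (it just counts S constructors) and doubleWord is the one-bit shift; both names, and fromWord, are made up for this illustration. The plain data declaration is meant to be equivalent to the GADT-syntax one above.

```haskell
import Data.Bits (shiftL)
import Data.Word (Word)

-- | Strict unary naturals, as in the proposal.
data Nat = Z | S !Nat

double :: Nat -> Nat
double Z     = Z
double (S m) = S (S (double m))

-- | The machine-integer view: the number of S constructors.
toWord :: Nat -> Word
toWord = go 0
  where
    go acc Z     = acc
    go acc (S m) = let acc' = acc + 1 in acc' `seq` go acc' m

-- | Build S^n Z from n (linear time and space, for testing only).
fromWord :: Word -> Nat
fromWord 0 = Z
fromWord n = S (fromWord (n - 1))

-- | What 'double' would compile to under the proposed representation.
-- Overflow is ignored here: Word simply wraps around.
doubleWord :: Word -> Word
doubleWord n = n `shiftL` 1
```

For every small n, toWord (double (fromWord n)) equals doubleWord n. The proposal is about getting the right-hand side from the compiler while the programmer keeps writing the left-hand side.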

Handling non-strict naturals needs adding extra knowledge about non-constructors, but the basic mechanics (and benefits) would be similar. For strict/lazy-head linked lists the corresponding machine-optimized layout would be the segmented vector with e.g. up to 64 memory-consecutive heads. The exact details need to be worked out and there is some wiggle-room for cleverness.

Summary: I'd like to give back the honor of efficiency to the famously elegant concept of Peano naturals by generalizing pointer tagging to many bits.

#1668 Module Import Improvements new good 9 months 9 months

Cleaned up from reddit:


import Data.Map (Map) qualified as M

This would be the same as

import Data.Map (Map)
import qualified Data.Map as M

B) Nested Module Organization. The proposal is detailed here: [email protected]/2012-01/msg00171.html

C) Qualified Module Export. The proposal is detailed here:

The first is very simple. That combined with either of the others (the second being my longtime favorite) would make a good GSOC imho.

#1667 Improvements to the criterion Library new not yet rated 9 months 9 months

From Reddit:

I've been using (the excellent) criterion a ton and have a small wishlist of things easily doable in a summer. Mainly around making the reports better, in order from most to least important to me:

  • Replace bar graphs in overview with filled version of the KDE contour plots shown in details section. Or at least error bars, but those I think would be strictly less useful. Maybe one could toggle between different representations.
  • allow dynamic showing/hiding/filtering in summary, so that if you get a really slow benchmark you can hide it and have the scale re-adjust to the others
  • In the KDE plot, use the gray vertical lines to indicate scale by e.g. marking a line every mean / 20 or something. So a KDE plot with lots of vertical gray lines close together indicates we're "zoomed out" and the timings have quite a spread. As is, the gray lines don't serve much function (you can mouseover parts of the plot to get a sample)
  • Often I'm testing several different benchmarks against several different alternate implementations. It would be nice to be able to toggle between grouping by benchmark (so that we can compare implementations), and grouping by implementation.
  • fixing some layout issues (e.g. keys overlapping graphs)
  • figure out a way to lighten the dependencies

#1666 Better Hackage UI / Discoverability new good 9 months 7 months

From reddit comments:

One element of Hackage I'd very much like to see work on is discoverability -- better search, tagging, a "reverse deps" counter that doesn't eat all our memory, etc. Part of that would be maybe letting people add search columns for things like # of uploads, existence of test suites, whether build reports are green and deps are compatible with various platforms, etc. I know star ratings and comments are considered unreliable and untrustworthy for a number of reasons, but if there were a decent approach to them I'd be keen. Perhaps if we included voting on comments? And maybe not star ratings but just "I recommend this" as something somebody could choose to thumbs up (like twitter favs or facebook likes), then that could help?

And also:

I think +1s are a great metric - they are basically a people-who-downloaded-this-and-liked-it-enough-to-mention-it counter. Often, when I'm trying to find a library for something, this metric is good enough for me to get going and avoid going down the wrong path. On the subject of stealing ideas though, I consider the MetaCPAN project a shining example of good metadata. See the sidebar for Catalyst - sparklines, inline issue counters, build machine reports, ratings, +1 aggregates, and even a direct link to IRC. There's a lot we could learn from, right there!

#1665 HBlas Projects / FFI Numerics Bindings new good 9 months 9 months

Comments from carter on reddit, who would also be happy to mentor as far as I know:

  • rounding out the blas API while also adding some docs so that people don't need to read the docs in the original fortran codes. Per se there's a few other blas bindings on hackage in various states of quality, but hblas tries very very hard to a) be zero config install on all the most common platforms b) not have any configure or dist time code gen
  • adding a decent coverage of basic lapack operations with a decent UX. This alone could eat up a good chunk of a summer. It's not hard, but it's not easy either.
  • another project would be to write a haskell binding for blis, which is a more modern approach to blas. This would be a somewhat saner project than working on the high level lapack binding, but a bit more work than helping finish up the blas in hblas.

#1664 IHaskell Projects new good 9 months 9 months

From the reddit thread:

Not sure if this is the sort of thing GSOC is meant for, but if anyone is interested in improving interactive coding in Haskell in IHaskell (useful for graphics, interactive data analysis, etc), I would be happy to help. I've mentored people working on small projects related to IHaskell / contributions to IHaskell, which went well. If anyone is interested, let me know! Potential ideas include:

  • Working on IHaskell proper. This can include some pretty cool things, like perhaps trying to use GHC's type holes to implement context and type driven completion.
  • Working on integrating libraries into the IHaskell ecosystem. Some are fairly easy to integrate, but some are more challenging (and potentially require upstream changes); I personally would like to see diagrams animation support, gloss as a WebGL code generator, and threepenny-gui as a GUI framework integrated into IHaskell.
  • Using IHaskell as a library; this can include implementing a more powerful hint replacement or writing a kernel for another language such as Idris.

And also:

I've mentioned this before, but SageMathCloud, which offers sage and ipython notebooks for free, also offers terminals, and ghc in those terminals. William Stein, whose site it is, has indicated that if someone were to help him set up IHaskell in SageMathCloud, he would be happy to do so. I like the goals of the SMC project and the IHaskell project both, and together I think it would be huge for IHaskell, since it would remove all the setup cost and hosting cost for sharing, just as it does for IPython notebooks. So I think that would be a very doable SOC project, with a significant immediate payoff.

#1660 Rewrite the Pandoc Markdown Parser new good 9 months 9 months

Pandoc is one of the most widely used open-source Haskell projects. The project is primarily used by academics and authors to streamline their writing process. A typical author will write their paper in markdown before converting their document to LaTeX but of course there are many other possibilities.

Unfortunately, this process is not always plain sailing. Despite looking very simple, there are a number of ambiguous corner cases in the original Markdown description. This leads to surprises for users when they try to convert their documents. (1) Many of these cases have been reported as issues to the Pandoc issue tracker.

Last year, John MacFarlane along with an industry consortium set out to standardise Markdown. The result of their work was CommonMark. The project expands Gruber's initial ideas into a several thousand word specification along with reference implementations and a comprehensive test suite. Hopefully as the spec finalises, more implementations will conform to it.

As a result of this ambiguity, the markdown parser has grown difficult to maintain and inefficient. The goal of this proposal would be to rewrite the markdown parser so that it becomes more maintainable, efficient, extensible and consistent.

In addition to the "standard" elements, Pandoc supports many useful extensions. These have been added in an ad-hoc manner so the code has accumulated technical debt over time. A proposal should also look to provide a mechanism to make it easy to add extensions in the future. This part would require careful thought and analysis of the existing code base as some extensions (such as footnotes) have caused a significant change in the parsing strategy. Given a suitably extensible base parser it would perhaps even be desirable to split the base CommonMark parser into a separate package (à la Cheapskate). This would enable other projects to firstly use the parser without depending on Pandoc and secondly, independently define their own extensions.


The following tasks could make up part of a proposal.

  1. Rewrite the Pandoc Markdown parser to conform to the CommonMark specification.
  2. Integrate the CommonMark tests into the Pandoc test suite.
  3. Design a system which would allow support of Pandoc's existing extensions.
  4. Explore efficient methods to parse Markdown. The CommonMark javascript implementation is currently 3-4x faster than Pandoc's parser and the C implementation approximately 30-40x faster.

Getting Involved

There are lots of open tickets on the issue tracker. Those marked minor should have isolated fixes which would be suitable for beginners.

Markdown Specific Issues

  • #1909 - apostrophe being misinterpreted as open quote
  • #661 - --smart must also convert << to « and >> to »

(1) For an overview of different implementations, see BabelMark.

#1657 Switch cabal to use an off-the-shelf SAT-solver/theorem prover for solving version constraints new good 9 months 9 months

The current solver has already started to hit some performance issues (which I think affects Michael Snoyman for example). It also gives terrible error messages. Using a tried and tested solver could get rid of these solver issues, as long as we can make the off-the-shelf solver give good enough error messages. There are some open issues (e.g. the solver needs to have a liberal license, be possible to ship with cabal-install, etc) that need to be thought about before we make this a student project however. I'd rank this project as difficult but feasible, for the right student.
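To make the shape of the problem concrete, here is a toy sketch of version-constraint solving. Everything in it is an assumption made for illustration: the package index, the integer versions and the Dep type are invented, and plain enumeration in the list monad stands in for the off-the-shelf solver, which would receive the same constraints in an encoded form.

```haskell
import Data.Maybe (listToMaybe)

type Package    = String
type Version    = Int                  -- simplified: one integer per version
type Assignment = [(Package, Version)]

-- "If owner is chosen at ownerVer, then dep must satisfy ok."
data Dep = Dep
  { owner    :: Package
  , ownerVer :: Version
  , dep      :: Package
  , ok       :: Version -> Bool
  }

-- Hypothetical package index: the versions available for each package.
available :: [(Package, [Version])]
available = [("app", [1]), ("text", [1, 2]), ("parsec", [2, 3])]

-- Hypothetical constraints between those packages.
deps :: [Dep]
deps =
  [ Dep "app"    1 "parsec" (>= 3)
  , Dep "app"    1 "text"   (>= 1)
  , Dep "parsec" 3 "text"   (== 1)
  ]

-- Every way of picking exactly one version per package.
candidates :: [Assignment]
candidates = mapM (\(p, vs) -> [(p, v) | v <- vs]) available

-- A constraint only applies when its owner is chosen at the stated version.
satisfies :: Assignment -> Dep -> Bool
satisfies asg d =
  case (lookup (owner d) asg, lookup (dep d) asg) of
    (Just ov, Just dv) | ov == ownerVer d -> ok d dv
    _                                     -> True

-- First assignment that violates no constraint, if any exists.
solve :: Maybe Assignment
solve = listToMaybe [a | a <- candidates, all (satisfies a) deps]

main :: IO ()
main = print solve  -- prints Just [("app",1),("text",1),("parsec",3)]
```

Enumeration is exponential in the number of packages, which is exactly why a real solver is attractive; the error-message concern raised above corresponds to explaining why solve returned Nothing.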

#1619 Tweak memory-reuse analysis tools for GHC compatibility new good 4 years 9 months

Some program instrumentation and analysis tools are language agnostic. Pin and Valgrind use binary rewriting to instrument an x86 binary on the fly and thus in theory could be used just as well for a Haskell binary as for one compiled from C. Indeed, if you download Pin, you can use the included open source tools to immediately begin analyzing properties of Haskell workloads -- for example the total instruction mix during execution.

The problem is that aggregate data for an entire execution is rather coarse. It's not correlated temporally with phases of program execution, nor are specific measured phenomena related to anything in the Haskell source.

This could be improved. A simple example would be to measure memory-reuse distance (an architecture-independent characterization of locality) but to distinguish garbage collection from normal memory access. It would be quite nice to see a histogram of reuse-distances in which GC accesses appear as a separate layer (different color) from normal accesses.

How to go about this? Fortunately, the existing MICA pintool can build today (v0.4) and measure memory reuse distances.

In fact, it already produces per-phase measurements where phases are delimited by dynamic instruction counts (i.e. every 100M instructions). All that remains is to tweak that definition of phase to transition when GC switches on or off.

How to do that? Well, Pin has existing methods for targeted instrumentation of specific C functions:

By targeting appropriate functions in the GHC RTS, this analysis tool could probably work without requiring any GHC modification at all.

A further out goal would be to correlate events observed by the binary rewriting tool and those recorded by GHC's traceEvent.

Finally, as it turns out this would NOT be the first crossing of paths between GHC and binary rewriting. Julian Seward worked on GHC before developing valgrind:

Interested Mentors

Ryan Newton


Interested Students (Include enough identifying info to find/reach you!)

#1616 Implement Cabal<->OS package management systems relations new OK 4 years 9 months

Currently cabal manages the installation of hackage projects relatively well provided (among other things) that you have all the necessary external tools available and installed.

However, when certain tools are not available, such as alex or happy or some external library that cabal checks for with pkg-config, we lose the automated aspect of compiling and installing packages and end up with confused and frustrated users. That is particularly bad for new users or people just wishing to use a certain haskell program available in hackage for which their OS does not have a pre-packaged version (such as an installer in windows or .deb in Debian/Ubuntu).

For instance, when issuing the command cabal install gtk in a fresh environment (say, ubuntu), it is going to complain about the lack of gtk2hs-buildtools. Then when issuing the command cabal install gtk2hs-buildtools it complains about the lack of alex. Installing alex is also not enough, as you also need happy. Cabal does not indicate that at first, though. Installing alex and happy still does not solve your problems as gtk depends on cairo, which checks for cairo-pdf through pkg-config and fails.

Point being that even though it is quite easy to install all those packages (and some might not even have to be installed explicitly, as they can be provided by your haskell platform package), this automated process becomes manual, interactive and boring.

I propose extending cabal in three ways:

  • Adding checks at the start of the compilation (of the first package) so as to solve all (or as many as possible) of those dependencies at once:
    • If it's an external executable, such as alex and happy, and we don't see it installed and on the path, complain or add it to the dependencies list to be installed (if we know how to, through point no.2 below);
    • If it's something that is checked by pkg-config down the chain, it'd be nice to know in advance, so we can deal with all the libraries it might need at once and stop babysitting the install; cabal should therefore issue warnings before continuing;
  • Extending cabal so that it can accept several environment aware plugins (and develop at least one):
    • I like the thought of issuing apt-get install cabal-install && cabal update && cabal install cabal-ubuntu-integration or something like that and have cabal be aware that it is a big boy now and can suggest the install of external libraries to satisfy pkg-config dependencies (the user would still have to confirm and provide credentials, however due to the point above that would be at the beginning and we still get a nearly hands-free install);
    • Obviously we are not limited to Ubuntu, used in this example; Other apt and rpm based systems and mac package management systems such as fink (though I have very limited experience on this side of things) could be supported similarly; Even windows could be, though that would probably require more code to download/install cygwin/msys/painkillers to build some of the external libraries or provide reduced functionality.
  • Extending cabal to provide dependencies information that can be converted for external packages registered in a system wide install:
    • For this I took inspiration from the g-cpan tool for gentoo, which allows you to create on-the-fly ebuild files for Perl modules available at CPAN[1], which are then installed as regular packages on the system. I propose a generic way of doing this so that a simple haskell project can encapsulate the required conversion between .cabal and whatever recipe/ebuild/etc file your particular environment might need.
    • Basically each package would be able to state: I depend on X, Y and Z libraries that are on hackage, E and F executables (which I might or might not know where they are) and on L, M and N libraries through pkg-config; mapping those names to packages from each package manager would still be a problem for the individual plugins, though.

I think this would be a nice step in making Haskell even more accessible and slightly less frustrating for newcomers and for people trying interesting things in projects that depend on external libraries.

I do understand that cabal is not a package manager[2], however I see no reason why it should not be able to check for dependencies and issue warnings and suggestions in this way. And since those are warnings and not errors, the user can always ignore them and proceed.


Edit: Talked to Jürrien about this ticket and he mentioned the text on link [2]. I'm editing to try to make things clearer.

#1614 Replicating backend for AcidState. new good 4 years 9 months

AcidState is a high performance library for adding ACID guarantees to (nearly) any kind of Haskell data structure. In layman's terms, it's a library that will make your data persist even in the event of, say, a power outage.

AcidState relies on an on-disk transaction log from which to recreate the Haskell data structure if your application crashes unexpectedly. This makes the hard drive a single point of failure (even leading to loss of data), and the low reliability of hard drives makes this hard to justify.

I propose that this should be solved by replicating the state across a cluster of machines. Each machine would keep a copy of the transaction log on disk and the single point of failure would be gone.

When thinking about this project, one has to keep the CAP theorem in mind. In particular, I propose consistency and availability to be guaranteed. Partition tolerance (compared to consistency and availability) has fewer uses and is much harder to implement.

AcidState is already structured for multiple backends so few, if any, internal changes are required before this project can begin.

Interested mentors:

  • Jeremy Shaw (an officially accepted mentor)

Interested students:

  • David Himmelstrup.

#1613 Haiku support new OK 4 years 9 months

Port GHC to Haiku OS.

#1611 Solve cabal dependency hell new OK 4 years 9 months
  • Installing a new package should never break an existing package install.
  • If there is a correct way to configure dependencies, Cabal should resolve it.
  • Local installs should have the same status as hackage installs (merge cabal-src-install)

This project will likely have a greater community impact than all others combined.

#1608 Implement 2-3 concurrent data structures from the literature new rrnewton good 4 years 8 months

The GHC Haskell compiler recently gained the capability to generate atomic compare-and-swap (CAS) assembly instructions. This opens up a new world of lock-free data-structure implementation possibilities.

Furthermore, it's an important time for concurrent data structures. Not only is the need great, but the design of concurrent data structures has been a very active area in recent years, as summarized well by Nir Shavit in this article:

Because Haskell objects containing pointers can't efficiently be stored outside the Haskell heap, it is necessary to reimplement these data structures for Haskell, rather than use the FFI to access external implementations. There are already a couple of data structures implemented in the following library (queues and deques) :

But, this leaves many others, such as:

  • Concurrent Bags
  • Concurrent Priority Queues

A good point of reference would be the libcds collection of concurrent data structures for C++ (or those that come with Java or .NET):

One of the things that makes implementing these data structures fun is that they have very short algorithmic descriptions but a high density of thought-provoking complexity.

A good GSoC project would be to implement 2-3 data structures from the literature, and benchmark them against libcds implementations.
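For a feel of the kind of code involved, here is a minimal sketch of the simplest lock-free structure, a Treiber-style stack. It is an illustration under assumptions, not one of the structures the ticket asks for: the names Stack, push and pop are made up, and atomicModifyIORef' from base stands in for an explicit compare-and-swap retry loop.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- The whole stack is one immutable list behind one mutable pointer.
newtype Stack a = Stack (IORef [a])

newStack :: IO (Stack a)
newStack = Stack <$> newIORef []

-- Atomically swing the pointer to a list with x in front.
push :: Stack a -> a -> IO ()
push (Stack ref) x = atomicModifyIORef' ref (\xs -> (x : xs, ()))

-- Atomically swing the pointer past the head, returning it if present.
pop :: Stack a -> IO (Maybe a)
pop (Stack ref) = atomicModifyIORef' ref $ \xs -> case xs of
  []       -> ([], Nothing)
  (y : ys) -> (ys, Just y)

main :: IO ()
main = do
  s    <- newStack
  done <- newEmptyMVar
  -- Four threads push 1000 elements each, without taking any lock.
  forM_ [1 .. 4 :: Int] $ \t -> forkIO $ do
    forM_ [1 .. 1000] $ \i -> push s (t * 1000 + i)
    putMVar done ()
  replicateM_ 4 (takeMVar done)
  -- Drain the stack and count what arrived.
  let drain n = pop s >>= maybe (pure n) (const (drain (n + 1)))
  total <- drain (0 :: Int)
  print total  -- prints 4000
```

Bags and priority queues from the literature need considerably more machinery than this single pointer, which is where the interesting work of the project would lie.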

Interested Mentors

Ryan Newton

Interested Students (Include enough identifying info to find/reach you!)

Sergiu Ivanov


Mathias Bartl

Aleksandar Kodzhabashev


This ticket has been REFACTORED. It is now specifically focused on deques, bags, or priority queues. For anyone interested in concurrent hash-tables / hashmaps take a look at ticket #1617.

Recommended Papers with State-of-the-art Algorithms

#1607 Haskell for the "real-time web" new OK 4 years 9 months

Note "real-time web" basically means not needing to refresh web pages, nothing to do with real-time systems.

Haskell has the best async IO implementation available. And yet, node.js is getting all the spotlight! This is in part a library issue. Node.js has several interfaces that use websockets and fall back to other technologies that are available.

I propose we create such a library for Haskell. We have already identified a language-independent design called sockjs. And there is already work under way on wai-sockjs, with good implementations of wai-eventsource and wai-websockets already available.

Such a library is a fundamental building block for interactive (real-time) web sites. There is also a use case for embedding a capable Haskell messaging server into a website run on a slower dynamic language that has poor concurrency, like Ruby (some people embed node.js based messaging servers now). This gives Haskell a back door to getting adopted in other areas.

This should be easily achievable in a GSoC. In fact I think the biggest problem may be how to expand the scope of this to fit the summer. The Yesod web framework is now a very good server oriented web framework. A good wai-sockjs implementation could be nicely integrated with Yesod and other frameworks, but it could also be the building block for a great client-side abstraction in which a web page is automatically kept up to date in "real-time".

Michael Snoyman and I are willing to mentor. We can probably also get those that have been involved with websockets related development to date to mentor or at least give assistance.

#1605 A universal data store interface. new OK 5 years 9 months

A lack of a high-level data store library is a huge weakness in haskell.

Data storage is absolutely critical for web development or any program that needs to persist data or whose data processing exceeds memory. Haskell web development has languished in part because one had to choose between lower-level SQL or Happstack's experimental data store that uses only the process memory. All other widely used programming languages have multiple ORMs for proven databases. Haskell needs some better database abstraction libraries.

The persistent library is a universal data store interface that already has PostgreSQL, Sqlite, MySQL, MongoDB, and experimental CouchDB backends. Most users of the Yesod web framework are using it, and it is also being used outside of web development. With some more improvements, persistent could become the go-to data store interface for haskell programmers.

We could create interfaces to more databases, but the majority of Haskell programs just need *a* database, and would be happy with a really good interface to any database. There is also a need to interface with existing SQL databases. So I would like to focus on making (SQL & MongoDB) storage layers really good. MongoDB should be easier to create a great interface for.

We have moved Persistent away from the direction of a universal query interface, towards just a universal data store serialization interface. There are many critics of query interfaces for good reasons: we will never be able to solve all use cases.

I believe future work on Persistent should continue this recent direction of allowing for raw queries. One can now finally write raw SQL queries and get them automatically properly serialized. The next step is to make them extraordinarily type-safe. That is, we know at compile time that the queries are valid. They reference columns correctly and they are valid database queries. There is already an experimental implementation of this for SQL called persistent-hssqlppp that checks the validity of SQL statements at compile time.

Persistent's biggest limitation right now is the lack of a good technique for returning a projection of the data - we always give back a full record. This issue should be explored in the GSoC, but does not have to be solved.

Persistent already has a very good Quasi-Quoted DSL for creating a schema, but another task at hand is to write a convenient Template Haskell interface for declaring a schema. This should not be difficult because we already have all the tools in place.

There are also some possibilities for integrating with existing Haskell backends. One interesting option is integration with HaskellDB or DSH - HaskellDB does not automatically serialize to a Haskell record like Persistent does.

Michael Snoyman and Greg Weber are willing to mentor, and there is a large community of users willing to help or give feedback.

#1601 Library for the Arduino platform new OK 5 years 5 years

The Arduino platform is an open electronics platform for easily building interactive objects. Arduino offers simple ways of getting creative with electronics and building your own devices without having to dive deep into microcontroller programming with C or asm. I think this is a great opportunity for the Haskell community to extend beyond the software world and to get people interested in Haskell who would otherwise not consider engaging with fp.

This project would aim at providing an extensible library for programming the Arduino platform, enabling a functional (and hopefully more intuitive) way of programming the Arduino hardware. The library would be designed around providing an API for interacting with Arduino. I think this would be a better way than to write a cross-compiler or to define a language subset.

If you have any thoughts or ideas please feel free to leave a comment.

#1600 Mathematical environment in Haskell new OK 5 years 9 months

Haskell is usually described as a language appropriate for people with a mathematically-oriented mind. However, I've found that there is no comprehensive library (or set of libraries) for doing maths in Haskell.

My idea would be to follow the path of the Sage project. That project took a lot of libraries and programs already available and added a layer for communication with them in Python. The result is that now you can have a full-fledged mathematical environment while using your normal Python knowledge for GUI, I/O and so on...

The project can have several parts:

  • Try to come up with a hierarchy of classes like the ones in Sage for the basic building blocks (this may start with the Numeric Prelude)
  • Try to create bridges in Haskell as done in Sage
  • Find a way to "import" code in Sage (some algorithms are written only there) to the Haskell counterpart

For free, GHCi could be used as the next Matlab :D

Interested students

  • Alejandro Serrano (serras)

Interested mentors

  • Jacques Carette <carette@…>

#1596 A statistics library and environment new OK 5 years 5 years

Although some statistics functionality exists in Haskell (e.g. in Bryan O'Sullivan's 'statistics' library), I think Haskell (or GHCi) could make sense as a more complete environment for statistical analysis.

I think a standard data type representing a data frame would be a cornerstone, and analysis and visualization could be built on top of this to become a complete analysis environment, as well as a standard library for use in regular applications.

This isn't necessarily a very difficult project, but for it to be acceptable it would probably need to be detailed and discussed a lot more, and one or more suitable mentors would need to step forward.

Edit: It would also be great to take advantage of Haskell's parallel support (R only has this as proprietary add-ons, I think), and type safety (for instance using the 'dimensional' library to keep track of units).

See also

#1583 Language.C enhancements new OK 6 years 6 years

Three possibilities:

The first is to integrate preprocessing into the library. Currently, the library calls out to GCC to preprocess source files before parsing them. This has some unfortunate consequences, however, because comments and macro information are lost. A number of program analyses could benefit from metadata encoded in comments, because C doesn't have any sort of formal annotation mechanism, but in the current state we have to resort to ugly hacks (at best) to get at the contents of comments. Also, effective diagnostic messages need to be closely tied to original source code. In the presence of pre-processed macros, column number information is unreliable, so it can be difficult to describe to a user exactly what portion of a program a particular analysis refers to. An integrated preprocessor could retain comments and remember information about macros, eliminating both of these problems.

The second possible project is to create a nicer interface for traversals over Language.C ASTs. Currently, the symbol table is built to include only information about global declarations and those other declarations currently in scope. Therefore, when performing multiple traversals over an AST, each traversal must re-analyze all global declarations and the entire AST of the function of interest. A better solution might be to build a traversal that creates a single symbol table describing all declarations in a translation unit (including function- and block-scoped variables), for easy reference during further traversals. It may also be valuable to have this traversal produce a slightly-simplified AST in the process. I'm not thinking of anything as radical as the simplifications performed by something like CIL, however. It might simply be enough to transform variable references into a form suitable for easy lookup in a complete symbol table like I've just described. Other simple transformations such as making all implicit casts explicit, or normalizing compound initializers, could also be good.

A third possibility, which would probably depend on the integrated preprocessor, would be to create an exact pretty-printer. That is, a pretty-printing function such that pretty . parse is the identity. Currently, parse . pretty should be the identity, but it's not true the other way around. An exact pretty-printer would be very useful in creating rich presentations of C source code --- think LXR on steroids.

#1579 Implement overlap and exhaustiveness checking for pattern matching new good 7 years 4 years

GHC's current checker for overlapping and non-exhaustive patterns is in need of an overhaul. There are several bugs and missing features. For example, GADTs are not taken into account.
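A small example of the GADT case, as a sketch: the types Expr and evalInt are invented for illustration. The single clause below is total, but a checker that ignores the type refinement carried by GADT constructors would report BoolE as a missing case.

```haskell
{-# LANGUAGE GADTs #-}
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Each constructor fixes the type index of the expression it builds.
data Expr a where
  IntE  :: Int  -> Expr Int
  BoolE :: Bool -> Expr Bool

-- Only IntE can produce an Expr Int, so no BoolE clause is needed here.
evalInt :: Expr Int -> Int
evalInt (IntE n) = n

main :: IO ()
main = print (evalInt (IntE 42))  -- prints 42
```

An improved checker would accept evalInt silently and would additionally flag a BoolE clause, were one added, as inaccessible.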

The project would involve the analysis of the current implementation, specification of bugs and desired features, design and implementation of an improved checker.

The project is mentioned on:

Interested Mentors

Interested Students

  • Eugen Jiresch ( e0204097(_atsign_) )
  • Saurabh Kumar ( saurabh.catch@… )
  • Ben Ross (benjross@…)

#1556 Further Parsec Improvements new OK 8 years 9 months

Last year's Summer of Code project led to Parsec3, a monad transformer version of Parsec with significantly more flexibility in the input it accepts. There is much work left that can be done though:

  • Regaining lost speed for common cases such as ParsecT Identity with [] as input stream
  • Reworking the error handling and source position models to handle the increased variety in input streams, correct problems with error messages and enable greater integration with any underlying monad where appropriate
  • Exploring further possibilities for optimisation
  • Building examples, documentation and support libraries for parsing tasks outside Parsec's traditional domain such as binary parsing

Interested Mentors

  • Philippa Cowderoy (Philippa) <flippa@…>
  • Derek Elkins (ddarius) <derek.a.elkins@…>
  • Dmitry Astapov (ADEpt) <dastapov@…>

Interested Students

  • Matej Kollar <kollar.208115@…>
  • Andre Murbach Maidl <amaidl@…>

#1541 Type-level programming library new bad 8 years 9 months

GHC's new support for type families provides a fairly convenient and rather expressive notation for type-level programming. Just like their value-level counterparts, many type-level programs rely on a range of operations (e.g., type-level lists and lookup functions) that are common to different applications and should be part of a standard library, instead of being re-implemented with every new type-level program. For a concrete example of a sophisticated use of type-level programming, see Andrew Appleyard's C# binding and its encoding of C#'s sub-typing and overload resolution algorithm:

The primary goal of this project is to develop a support library for type-level programming by looking at existing type-level programs and what sort of infrastructure they need. A secondary goal is to acid test the rather new type family functionality in GHC and explore its current limitations.
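As an illustration of the kind of operation such a library might offer, here is a sketch of a type-level lookup over an association list. It assumes a modern GHC with `DataKinds` and closed type families, both of which postdate this proposal. The names `Lookup` and `Table` are hypothetical.

```haskell
{-# LANGUAGE DataKinds, PolyKinds, TypeFamilies, TypeOperators #-}

import Data.Type.Equality ((:~:) (Refl))

-- Look up a key in a type-level association list.
-- Equations are tried top to bottom, as in a value-level function.
type family Lookup (k :: key) (xs :: [(key, val)]) :: Maybe val where
  Lookup k '[]             = 'Nothing
  Lookup k ('(k, v) ': xs) = 'Just v
  Lookup k ('(j, v) ': xs) = Lookup k xs

-- A sample table mapping type-level strings to types.
type Table = '[ '("a", Int), '("b", Bool) ]

-- These definitions only type-check if the family reduces as claimed.
found :: Lookup "b" Table :~: 'Just Bool
found = Refl

missing :: Lookup "c" Table :~: 'Nothing
missing = Refl
```

The two `Refl` proofs act as compile-time tests: if `Lookup` computed anything else, the module would be rejected by the type checker.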

Interested Mentors

  • Manuel Chakravarty <chak@…>
  • Edward Kmett <ekmett@…>
  • Jacques Carette <carette@…>

Interested Students

  • Thomas Schilling <nominolo@…>
  • John Morrice <spoon@…>

#1540 Debugger for Attribute Grammar using Haskell new Yogesh Mali 8 years 8 years

I intend to design a debugger for an attribute grammar specification language. The plan is to finish this project over the summer, but I am allowing myself up to two additional months. My current work focuses on Silver, an attribute grammar specification language that has been designed using Haskell. Silver is used to design languages and to extend them with new domain-specific features.

#1127 Machine learning library new none OK 9 years 9 months

Note that this was proposed some years ago, and any prospective student should first identify a suitable mentor, and discuss the details with her.

Machine learning includes many methods (e.g., neural networks, support vector machines, hidden Markov models, genetic algorithms, self-organizing maps) that are applicable to classification and/or pattern recognition problems in many fields.

A selection of these methods should be implemented as a Haskell library; the choice would depend on the qualifications and interests of the student. Optionally, a more limited library with an application using it could be implemented.

My main interest is in bioinformatics, but as machine learning methods are useful in a vast number of fields, I'm happy to hear from prospective mentors and other people who are interested, as well as from prospective applicants.

Previously Interested Students

  • Anil Vaitla <avaitla16@…> (Matrix Decompositions, SVM, and HMM)
  • Andreas Launila <> (SVM and HMM)
  • Jiri Hysek (dvekravy) <xhysek02@…> (NN and GA)
  • Charles Blundell <blundellc@…> (SVM, HMM, ID3/C4.5; already done NN, Q-learn, SOM, ~GA)
  • P McArthur <> (Hidden Markov Model)
  • Dave Tapley <dukedave@…> (Studying: <>)
  • Ivan Dilchovski <root.darkstar@…> (currently doing graduation paper on NN)
  • Dinesh G <g.dinesh.cse@…> (SVM, PCA, SVD, random projection, clustering techniques). Currently doing my master's at the Indian Institute of Science. Area of interest: machine learning
  • Moises Osorio <…> (Genetic algorithms)

Currently Interested Students

  • Martin Boyanov <mboyanov@…> (HMM, k-means clustering, Naive Bayes, neural networks)

#1126 MIME library new none 9 years 9 years

The goal is to create the one, true MIME library for Haskell. Currently, there are a lot of partial MIME libraries, but nothing really complete. For more information, see and

#1118 Debian package autobuilder for hackage packages new none 9 years 9 years

Creating an infrastructure for automatically generating Debian source packages for Hackage packages, building them, and publishing the resulting binaries in a publicly-accessible APT archive.

Interested Mentors

  • ?

Interested Students

  • Bryan Donlan <bdonlan@…>
  • Cesar Flores <cesarjesus@…>

#1113 New I/O library (async I/O+unicode filenames+filesystem manipulations+...) new Bulat 9 years 9 years

The existing GHC I/O library is hard to extend, closely coupled with the GHC RTS, and can't be ported to other Haskell compilers. It would be great to detach a new I/O library, based on the code of the current GHC I/O, streams, SSC, network-alt, fps, filepath and other libs. Detailed explanation at

Interested Mentors

  • Bulat (Bulat.Ziganshin@…)

Interested Students

  • ?

#69 Implement a model checker new none 10 years 10 years

jhypernets is a simulator of a formal model of mobile computations called "Petri hypernets", with a core implemented in Haskell. The model supports model checking (automatic checking of satisfiability of logical formulas). At the moment there is a very simple model checker implemented in Haskell, which is too slow to use on non-trivial models. The idea is to implement a fast model checker in Haskell using Symbolic Model Checking. This requires implementing or creating bindings to a BDD (Binary Decision Diagrams) library, implementing the model checker itself (which is a non-standard one) and possibly extensive profiling of the code.

The project could serve as a proof of the hypothesis "Haskell is a great tool for implementing systems specified formally". The hypothesis is partially proved at the moment (the Haskell component is a very successful part of the project), but a model checker is very sensitive to implementation speed, and we do not know whether we can have the high-level nature of Haskell and the speed of imperative languages at the same time (in the domain of model checkers).

Interested Mentors

  • ?

Interested Students

  • Artur Siekielski <asiekiel@…>

#51 XMPP (aka Jabber) bindings for Haskell new none 10 years 9 years

Support for XMPP will enable us to write IM clients, message and file transfer tools, notifiers, and IM bots, and otherwise join the XMPP frenzy that seems to engulf the whole world :)

Interested Mentors

  • Dmitry Astapov <dastapov@…>

Interested Students

  • Caio Marcelo (cmarcelo) <cmarcelo@…>
  • Henning Günther (der_eq) <h.guenther@…>

#50 Visual Haskell new none bad 10 years 9 months

Visual Haskell is the Microsoft Visual Studio plugin for Haskell development, released in September 2005.

There's lots to do in Visual Haskell. There are plenty of enhancements that could be added; some ideas are here.

Interested Mentors

  • Simon Marlow <simonmar@…>

Interested Students

  • Rafael Vargas <rafavargas@…>