Opened 7 years ago

Last modified 4 years ago

#1622 new proposed-project

General FFI improvements

Reported by: julek
Owned by: julek
Priority: not yet rated
Keywords: FFI
Cc:
Difficulty: unknown
Mentor: not-accepted
Topic: GHC

Description

What I would like to do is to improve the integration of C/C++ with Haskell, particularly in calling Haskell from C/C++.

Currently ghc can generate stubs that export functions whose arguments are simple types, such as CInt, to C/C++. The generated stub is always wrapped in an extern "C" block because ghc does not yet implement the C++ calling convention defined in "The Haskell 98 Foreign Function Interface 1.0" (http://www.cse.unsw.edu.au/~chak/haskell/ffi/ffi.pdf).

So a first step would be to implement this calling convention to bring it up to speed with the above referenced report. This shouldn't be too hard and mostly involves implementing C++ name mangling conventions.

Next, I would like to extend the stub generation so as to be able to deal with more complex types.

The type systems in C++ and Haskell have many analogous syntaxes that can be easily exploited to provide strong compatibility and interoperability between the two languages.

For example a simple type such as:

data Foo = A | B

Could be implemented as an enum in C/C++:

enum Foo {A, B};

More advanced types that take arguments such as:

data Tree = Node Tree Tree | Leaf

Could be converted to a struct in C/C++:

struct Tree {
    struct Tree* left;
    struct Tree* right;
};
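The struct above has no field that records which constructor a value was built with. A minimal sketch of one possible convention follows, assuming (this is not part of the proposal) that Leaf is encoded as a null pointer. The helper names `makeNode`, `countNodes` and `freeTree` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C++ view of: data Tree = Node Tree Tree | Leaf
// Assumption: Leaf is encoded as a null pointer, Node as a heap-allocated struct.
struct Tree {
    Tree* left;
    Tree* right;
};

// Build a Node from two subtrees (either may be nullptr, i.e. Leaf).
inline Tree* makeNode(Tree* left, Tree* right) {
    return new Tree{left, right};
}

// Count the Node constructors in a tree; a Leaf contributes zero.
inline std::size_t countNodes(const Tree* t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// Release a tree that was built with makeNode.
inline void freeTree(Tree* t) {
    if (t == nullptr) return;
    freeTree(t->left);
    freeTree(t->right);
    delete t;
}
```

A type with several constructors that carry fields would need an explicit tag instead of the null-pointer trick; that case is not covered by this sketch.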

Types that have functions that act on them such as:

data IntContainer = IntContainer Int

getInt :: IntContainer -> Int
getInt (IntContainer a) = a

could have these functions automatically converted to C/C++:

struct IntContainer {
    int a;
};

extern int getInt_hs(IntContainer a);

This also opens up the possibility of exploiting C/C++ name mangling conventions to allow the _hs postfix I'm suggesting here to be eliminated.
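To illustrate why mangling could make the postfix unnecessary: symbols with C linkage are not mangled and cannot be overloaded, while symbols with C++ linkage encode their argument types, so one source-level name can serve several types. The sketch below uses plain C++ stand-in definitions rather than real Haskell exports; the `hs` namespace and the `IntPair` type are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical C++ views of two Haskell types that each export a getInt.
struct IntContainer { int a; };
struct IntPair { int first; int second; };

// With C linkage these two functions would need distinct names
// (getInt_hs, getIntPair_hs, ...). With C++ linkage the argument types
// are part of the mangled symbol, so the name can be shared.
namespace hs {
inline int getInt(IntContainer c) { return c.a; }
inline int getInt(IntPair p) { return p.first; }
}
```

The compiler picks the overload from the argument type, so `hs::getInt(IntContainer{7})` and `hs::getInt(IntPair{3, 4})` resolve to different symbols.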

Haskell classes:

class Arithmetic a where
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    (/) :: a -> a -> a

could be implemented using C++ classes with virtual member functions:

class Arithmetic {
public:
    virtual Arithmetic add(Arithmetic a, Arithmetic b);
    virtual Arithmetic mult(Arithmetic a, Arithmetic b);
    virtual Arithmetic sub(Arithmetic a, Arithmetic b);
    virtual Arithmetic div(Arithmetic a, Arithmetic b);
};
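One practical wrinkle with the class above is that passing and returning a polymorphic class by value slices derived objects. A minimal sketch of an alternative encoding follows, assuming the type class becomes a templated "dictionary" of operations over the instance type. This is one possible design, not the mapping the proposal specifies, and `IntArithmetic` and `sumOfSquares` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical encoding of:
//   class Arithmetic a where (+), (*), (-), (/) :: a -> a -> a
// Assumption: the class is a dictionary of operations over a type A,
// so values of A are passed by value and never sliced.
template <typename A>
struct Arithmetic {
    virtual ~Arithmetic() {}
    virtual A add(A x, A y) const = 0;
    virtual A mult(A x, A y) const = 0;
    virtual A sub(A x, A y) const = 0;
    virtual A div(A x, A y) const = 0;
};

// Counterpart of: instance Arithmetic Int (division truncates, as for C++ int).
struct IntArithmetic : Arithmetic<int> {
    int add(int x, int y) const override { return x + y; }
    int mult(int x, int y) const override { return x * y; }
    int sub(int x, int y) const override { return x - y; }
    int div(int x, int y) const override { return x / y; }
};

// A function with an Arithmetic constraint takes the dictionary explicitly.
template <typename A>
A sumOfSquares(const Arithmetic<A>& dict, A x, A y) {
    return dict.add(dict.mult(x, x), dict.mult(y, y));
}
```

Under this encoding a superclass constraint could still map to inheritance between dictionary types, which keeps the single/multiple inheritance idea from the proposal.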

All types of single/multiple instancing (i.e. either directly or through requirements of instances) would be implemented using single/multiple inheritance.

Obviously, this example is rather contrived due to the conversion of the function names. The fact that the rules that govern function naming in Haskell are much more permissive than those of C/C++ might cause compatibility issues.

This can be worked around by implementing a syntax similar to the one the FFI currently uses for function imports. E.g.:

foreign export ccall "bind" >>= :: CInt -> CInt

Similar to:

foreign import ccall "f" func :: CInt -> CInt

The latter is the current syntax for imports.

The name given for the export would be checked for legality in the namespace of the target language.
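As a rough illustration of such a check, the sketch below assumes that "legal" means a plain C identifier of the form `[A-Za-z_][A-Za-z0-9_]*`. The function name `isLegalCIdentifier` is hypothetical, and reserved words and implementation-reserved prefixes are deliberately not handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical check a stub generator might run on an export name.
// Assumption: a legal name is a plain C identifier; keywords such as
// "int" or "class" are NOT rejected by this sketch.
inline bool isLegalCIdentifier(const std::string& name) {
    if (name.empty()) return false;
    unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    for (std::string::size_type i = 1; i < name.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(name[i]);
        if (!(std::isalnum(c) || c == '_')) return false;
    }
    return true;
}
```

With this check the export name "bind" would be accepted, while the Haskell operator name ">>=" would be rejected and would therefore need an explicit export name.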

Alternatively this could be done in an automated manner using some naming conventions as well as operator polymorphism, but this would probably sacrifice ease of use.

Finally polymorphic Haskell functions/types can be implemented in C++ using templates.
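A minimal sketch of the correspondence, using the Haskell function `swap :: (a, b) -> (b, a)` as the example: each type variable becomes a template parameter. The name `swapPair` is hypothetical, and how such a template would dispatch to compiled Haskell code is left open here.

```cpp
#include <cassert>
#include <utility>

// Hypothetical C++ counterpart of the polymorphic Haskell function
//   swap :: (a, b) -> (b, a)
// The type variables a and b become the template parameters A and B.
template <typename A, typename B>
std::pair<B, A> swapPair(const std::pair<A, B>& p) {
    return std::pair<B, A>(p.second, p.first);
}
```

Note that a C++ template is instantiated separately for each pair of types by the C++ compiler, so this only shows the surface correspondence between the two type systems.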

I would like to extend ghc to implement enhanced C/C++ stub generation using the methods described above. I would also like it to generate Haskell stubs that describe the CType equivalents of the Haskell types defined, provide functions for converting between the two, and provide function stubs that convert the arguments, run the Haskell function and convert the result back as required.

On top of this I'd like to write C/C++ libraries for dealing with most of the standard Haskell types such as Maybe, Either, etc...
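As an idea of what such a library type might look like, here is a minimal sketch of a C++ counterpart to `data Maybe a = Nothing | Just a`. The class layout and member names are assumptions made for illustration, and the payload type is assumed to be default-constructible to keep the example short.

```cpp
#include <cassert>

// Hypothetical C++ counterpart of: data Maybe a = Nothing | Just a
// Assumption: T is default-constructible and copyable.
template <typename T>
class Maybe {
public:
    // Counterpart of the Nothing constructor.
    static Maybe nothing() { return Maybe(); }

    // Counterpart of the Just constructor.
    static Maybe just(const T& value) {
        Maybe m;
        m.hasValue_ = true;
        m.value_ = value;
        return m;
    }

    bool isJust() const { return hasValue_; }

    // Counterpart of fromMaybe: the payload, or fallback for Nothing.
    T fromMaybe(const T& fallback) const {
        return hasValue_ ? value_ : fallback;
    }

private:
    Maybe() : hasValue_(false), value_() {}
    bool hasValue_;
    T value_;
};
```

An Either counterpart could follow the same pattern with a tag and two payload slots.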

Finally, I'd like to work on ironing out any bugs that remain in the RTS when it is used in "multilingual" situations, as well as improving its performance in these situations.

I think that extending ghc to the level required by "The Haskell 98 Foreign Function Interface 1.0" specification and above would reap significant benefit to the Haskell community.

The improved integration into C/C++ would open the door for this to happen for several other languages and would make Haskell more widespread.

Many Haskell beginners are daunted by the falsely perceived complexity of working with Haskell IO and monads, but love the massive advantages that the paradigm gives in a non-monadic context. Simplifying the interoperability between Haskell and C/C++ would therefore enable many of these users to stick around for longer and perhaps encourage them to eventually look deeper into the language. This would grow the community and make the use of Haskell more widespread, potentially reaping benefits for the community at large.

I believe this could be implemented within the time frame given for GSOC.

Change History (1)

comment:1 Changed 4 years ago by Carter Schonwald

Difficulty: changed from 1 person Summer to unknown
Priority: changed from good to not yet rated

tweaking the difficulty and priority because both are a bit trickier afaik :)

I seriously doubt this project is tractable in a single summer, and some of the details are quite subtle!

1) to the best of my knowledge, the details of the C++ ABI wrt templates (and thence C++ generics) aren't even portable between C++ compilers like clang and gcc! Thus it seems deeply unlikely for a summer project to add portable C++ (with templates!) ABI support to ghc for every major supported platform in a way that would work with both GCC and clang.

2) this proposal doesn't address how marshalling from haskell heap data to C++ data structures would work in a memory safe way (and there'd have to be marshalling of some sort, the memory semantics for the two languages are quite different!)

3) yuras did a prelim set of work to add portable C struct support to GHC and its ffi, and even that (awesome!) work had enough subtleties with its (by comparison simple) marshalling that, at least for the near term, it's been shelved. The scope of what you want to do is WAY more ambitious than the work by Yuras.
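To make point 2 a little more concrete, one small ingredient of memory-safe marshalling is tying the lifetime of a reference into the Haskell heap to a C++ object. The sketch below is only an illustration under stated assumptions: `HsHandle` and `countingRelease` are hypothetical names, and the release callback stands in for `hs_free_stable_ptr` from GHC's `HsFFI.h`, which is not linked in here.

```cpp
#include <cassert>

// Hypothetical RAII owner of an opaque handle to a Haskell heap object.
// Assumption: the handle is released through a C function pointer; with
// GHC that role would be played by hs_free_stable_ptr (not used here).
class HsHandle {
public:
    typedef void (*ReleaseFn)(void*);

    HsHandle(void* ptr, ReleaseFn release) : ptr_(ptr), release_(release) {}

    // Release the handle exactly once, when the owner goes out of scope.
    ~HsHandle() { if (ptr_ != nullptr) release_(ptr_); }

    // Moving transfers ownership; copying is disabled to avoid double release.
    HsHandle(HsHandle&& other) : ptr_(other.ptr_), release_(other.release_) {
        other.ptr_ = nullptr;
    }
    HsHandle(const HsHandle&) = delete;
    HsHandle& operator=(const HsHandle&) = delete;

    void* get() const { return ptr_; }

private:
    void* ptr_;
    ReleaseFn release_;
};

// Stand-in release function that only counts calls, for demonstration.
int releasedCount = 0;
void countingRelease(void*) { ++releasedCount; }
```

This covers only ownership of an opaque reference; converting Haskell heap data into C++ data structures is the harder part and is not addressed by the sketch.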

basically, unless the student proposing this project can demonstrate some preliminary prototyping work that validates they actually have a clear understanding of the underlying challenges and has done materially relevant engineering before, this is not a viable GSOC project.

let me propose an equally cool project that would be totally doable in the course of the summer by a really smart student: adding tools for heap introspection/visualization/manipulation of running programs to ghcjs. (because of how ghcjs works, it'd be much easier / saner to do that vs with ghc's native code, but it'd still wind up being a useful mechanism to debug native haskell code!). Though of course anyone working on that should talk with luite and learn how ghcjs works.
