Version 19 (modified by chak@…, 5 years ago)

Foreign Function Interface

Ticket: #35

Brief Explanation

The Foreign Function Interface (FFI) adds support for invoking code and accessing data structures implemented in other programming languages and vice versa. The current proposal encompasses a general mechanism for inter-language operations as well as specific support for interoperating with the C programming language.
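As a minimal sketch of the C-specific support, the following binds the standard C function strlen and calls it from Haskell. The names c_strlen and cLength are made up for this illustration, and the code assumes an implementation such as GHC with the C library available at link time.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Bind the C library function strlen. "unsafe" is acceptable here because
-- strlen never calls back into Haskell.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Marshal a Haskell String into a temporary NUL-terminated C string and
-- measure it on the C side. (cLength is an illustrative name.)
cLength :: String -> IO Int
cLength s = withCString s $ \cstr -> fromIntegral <$> c_strlen cstr
```

For an ASCII string such as "hello", cLength returns 5; for other strings the result is the byte length in the current locale encoding.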

References

Pros

  • Widely accepted and used addendum.
  • Provides an essential facility.

Cons

  • Inappropriate use can subvert all semantic guarantees provided by Haskell and can cause memory corruption and program crashes.

Open Questions

The following topics still require a bit of discussion or a decision between multiple alternatives.

Libraries

A large part of the FFI addendum consists of the libraries living under Foreign in the library hierarchy. Given the general discussion surrounding the inclusion of libraries into Haskell', we need a concrete approach for the Foreign libraries. A tentative proposal was the following:

  • The FFI addendum was written before the proposal for HierarchicalModules, so it uses a flat name space. In Haskell', we will use the hierarchical names from the standard library.
  • Haskell' keeps a subset of the libraries defined by Haskell 98 (but with their hierarchical names) and adds the FFI libraries.

Moreover, the current libraries have grown somewhat since the original FFI Addendum. Do we want to adopt the following additions?

  • The module Foreign.Marshal.Pool (which may be used by the OpenGL binding)
  • In Foreign.C.Error: throwErrnoIfRetryMayBlock, throwErrnoIfRetryMayBlock_, throwErrnoIfMinus1RetryMayBlock, throwErrnoIfMinus1RetryMayBlock_, throwErrnoIfNullRetryMayBlock, throwErrnoPath, throwErrnoPathIf, throwErrnoPathIf_, throwErrnoPathIfNull, throwErrnoPathIfMinus1, & throwErrnoPathIfMinus1_
  • In Foreign.ForeignPtr: finalizeForeignPtr
  • In Foreign.Marshal.Array: withArrayLen, withArrayLen0
  • Foreign.Marshal.Error actually omits some routines of the FFI Addendum (namely those to construct I/O errors). I think we need to keep them in the FFI specification.
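To give a feel for one of the candidate additions, here is a small sketch of withArrayLen as it is exported by Foreign.Marshal.Array in current GHC libraries. The helper name viaCArray is invented for this example.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Array (peekArray, withArrayLen)

-- Copy a list into a temporary C array, then read it back using the
-- element count that withArrayLen passes to the callback.
-- (viaCArray is an illustrative name, not part of any library.)
viaCArray :: [CInt] -> IO (Int, [CInt])
viaCArray xs = withArrayLen xs $ \n ptr -> do
  ys <- peekArray n ptr
  return (n, ys)
```

Compared with withArray, the callback receives the length as well, which saves a separate traversal of the list when the C side needs an element count.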

TODO:

  • Decide whether we adopt the above tentative proposal.
  • Decide whether we want to adopt the additional module and functions.

Reinitialisation after hs_exit()

The FFI addendum currently requires that hs_exit() can be followed by another hs_init(). GHC doesn't support that and I am not convinced that we should require it. I propose to remove that requirement.

TODO: Decide whether we remove the requirement for re-initialisation.

Include files

It can be tedious to add include files to a large number of foreign imports, which is why GHC users often use -#include options instead. I am wondering whether we should add a form

foreign import "include FNAME"

We would still allow include files in other foreign import statements as before, but would gain a portable version of -#include options.
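For comparison, this is a sketch of the existing per-import form, where the header name is repeated in the entity string of every import. The Haskell-side names (c_sin, c_cos, c_sqrt, pythagoras) are chosen only for this illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- Each import names the header again in its entity string; with many
-- imports from one header this repetition is what becomes tedious.
foreign import ccall unsafe "math.h sin"  c_sin  :: CDouble -> CDouble
foreign import ccall unsafe "math.h cos"  c_cos  :: CDouble -> CDouble
foreign import ccall unsafe "math.h sqrt" c_sqrt :: CDouble -> CDouble

-- sin^2 x + cos^2 x, computed through the imported C functions.
-- (pythagoras is an illustrative name.)
pythagoras :: CDouble -> CDouble
pythagoras x = c_sin x * c_sin x + c_cos x * c_cos x
```

Under the proposed form, the header would be named once and the three entity strings could shrink to the bare C function names.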

TODO: Decide whether we want this addition.

ccall vs. stdcall

Cross-platform libraries (e.g. HOpenGL) often want to import the same foreign functions using the ccall convention on Unix and the stdcall convention on Windows. The usual method is to use CPP hackery. Would there be any harm in implementing stdcall as ccall on Unix? In other words, do we need stdcall on Unix at all?

TODO: Decide whether we allow implementations to implement stdcall as ccall if that is appropriate for the platform.

Finalizers

Finalizers can only be C functions (not Haskell functions), to avoid the problems discussed in Boehm's paper. The original FFI addendum makes the following guarantees about finalizers:

There is no guarantee on how soon the finalizer is executed after the last reference to the associated foreign pointer was dropped; this depends on the details of the Haskell storage manager. The only guarantee is that the finalizer runs before the program terminates. Whether a finalizer may call back into the Haskell system is system dependent. Portable code may not rely on such callbacks.

There was some discussion, in the context of GHC's implementation, about these guarantees. IIRC, the conclusion was that these guarantees are reasonable for C finalizers (but it would be hard for Haskell finalizers to guarantee that they run before the program terminates). Is that right, or do we need to change the specification here?

TODO: Decide whether the current guarantees concerning finalizers are practical.
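A minimal sketch of a C finalizer in use, assuming the Foreign.ForeignPtr and Foreign.Marshal.Alloc modules as shipped with GHC. Here finalizerFree is a pointer to C's free; newCell, readCell and roundTrip are names invented for this example. The last function also uses finalizeForeignPtr, one of the candidate additions listed under "Libraries".

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr
  (ForeignPtr, finalizeForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- Allocate a C int with malloc and attach C's free as its finalizer.
-- The finalizer is a C function pointer, not a Haskell closure.
newCell :: CInt -> IO (ForeignPtr CInt)
newCell v = do
  p <- malloc
  poke p v
  newForeignPtr finalizerFree p

-- Access the memory; withForeignPtr keeps the object alive meanwhile.
readCell :: ForeignPtr CInt -> IO CInt
readCell fp = withForeignPtr fp peek

-- Read the value, then run the finalizer right away instead of waiting
-- for the storage manager to notice that the pointer is unreachable.
roundTrip :: CInt -> IO CInt
roundTrip v = do
  fp <- newCell v
  r <- readCell fp
  finalizeForeignPtr fp
  return r
```

Without the explicit finalizeForeignPtr call, the only timing guarantee from the quoted text would be that free runs at some point before the program terminates.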

Alignment

Except for mallocBytes & allocaBytes, the original FFI addendum does not specify any constraints on alignment for allocated memory (by mallocForeignPtrBytes and others). This is clearly an oversight. The question is how we want to express the fact that all routines that allocate memory specified in bytes must conform to a set of alignment constraints (this was proposed by Thomas DuBuisson and John Meacham). We may add the following statement concerning alignment at the beginning of Section 5 (which describes the library modules):

All storage allocated by functions that allocate based on a size in bytes must be sufficiently aligned for any of the basic foreign types (see Section 3.2) that fits into the newly allocated storage. All storage allocated by functions that allocate based on a specific type must be sufficiently aligned for that type. Array allocation routines need to obey the same alignment constraints for each array element.

TODO: Decide whether we want to add this text. If not, we need to find an alternative way of expressing the required alignment constraints.
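The proposed constraint can be checked mechanically. The sketch below tests whether a block obtained from mallocBytes is aligned for two basic foreign types that fit into it; alignedFor and checkMalloc are illustrative names, and the choice of CDouble and CInt is arbitrary.

```haskell
import Foreign.C.Types (CDouble, CInt)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr, ptrToWordPtr)
import Foreign.Storable (Storable, alignment, sizeOf)

-- True if the address is a multiple of the alignment required by the
-- type of the witness value (the witness itself is never evaluated).
alignedFor :: Storable a => a -> Ptr b -> Bool
alignedFor witness p =
  ptrToWordPtr p `mod` fromIntegral (alignment witness) == 0

-- Allocate just enough bytes for a CDouble and check that the block is
-- aligned for every basic foreign type tested here that fits into it.
checkMalloc :: IO Bool
checkMalloc = do
  p <- mallocBytes (sizeOf (undefined :: CDouble)) :: IO (Ptr ())
  let ok = alignedFor (undefined :: CDouble) p
        && alignedFor (undefined :: CInt) p
  ok `seq` free p  -- compute the result before releasing the block
  return ok
```

The same predicate could be applied to pointers from mallocForeignPtrBytes or the array allocators to see whether an implementation already satisfies the proposed wording.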


Integration into the report

The FFI addendum is already written and formatted in the style of the report, so it should be straightforward to integrate. In addition, the following changes relative to the original addendum will be applied.

Hierarchical names for the libraries

Changes to the library portion as discussed above under "Open Questions".

Transparent marshalling of newtypes

The FFI addendum states in Section 3.2 that "the argument types at_i produced by fatype must be marshallable foreign types; that is, each at_i is either (1) a basic foreign type or (2) a type synonym or renamed datatype of a marshallable foreign type." We will improve on the second part of this statement as follows:

  • As the transparent marshalling of newtypes (aka renamed datatypes) is a fairly significant feature, we will dedicate a separate (sub)subsection to it.
  • In contrast to the FFI addendum (and its implementation in GHC), we require that a newtype in a foreign signature is not abstract. The newtype can be transparently marshalled only if its constructor is visible. (After all, marshalling only makes sense if we know the type of the value in foreign land.) This implies that we will export the newtypes in the modules Foreign.C.Types and
  • Clarify the connection between marshallable foreign types and the various flavours of foreign signatures discussed in Section 4.1.3. (E.g., in the case of a foreign import "dynamic", the whole signature (grammar nonterminal ftype) doesn't need to be marshallable, only portions of it.)
  • We allow GHC's newtype wrapping of the IO monad.
  • We add one or two examples.
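One possible example of transparent newtype marshalling is sketched below. The newtype Radians and the binding c_fabs are invented for this illustration; the point is only that the constructors of both Radians and CDouble are in scope where the foreign import is declared.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- A renamed datatype over a basic foreign type. Its constructor is
-- visible in this module, so the representation type is known.
newtype Radians = Radians CDouble
  deriving (Eq, Show)

-- The newtype appears directly in the foreign signature; no explicit
-- wrapping or unwrapping is written at the call site.
foreign import ccall unsafe "math.h fabs"
  c_fabs :: Radians -> Radians
```

With this binding, c_fabs (Radians (-1.5)) evaluates to Radians 1.5. If Radians were imported abstractly from another module, the rule proposed above would reject the import.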

Changes to the FFI libraries

Additions:

  • We should also have castCharToCUChar, castCUCharToChar, castCharToCSChar, and castCSCharToChar (i.e., not only the conversions for CChar, whose signedness is platform-dependent).
  • Types from ISO C99 (with conversion routines):
    WordPtr uintptr_t
    WordMax uintmax_t
    IntPtr  intptr_t
    IntMax  intmax_t
    
    ptrToWordPtr :: Ptr a -> WordPtr
    wordPtrToPtr :: WordPtr -> Ptr a
    
    ptrToIntPtr :: Ptr a -> IntPtr
    intPtrToPtr :: IntPtr -> Ptr a
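The following sketch exercises these additions as they appear in the Foreign.C.String and Foreign.Ptr modules of current GHC libraries. The helper names charRoundTrips and ptrRoundTrips are made up for this example.

```haskell
import Foreign.C.String
  (castCSCharToChar, castCUCharToChar, castCharToCSChar, castCharToCUChar)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, intPtrToPtr, ptrToIntPtr, ptrToWordPtr, wordPtrToPtr)

-- Converting a Latin-1 character to unsigned/signed C chars and back
-- should give the original character.
charRoundTrips :: Char -> Bool
charRoundTrips c =
  castCUCharToChar (castCharToCUChar c) == c
    && castCSCharToChar (castCharToCSChar c) == c

-- A pointer converted to its integral representation and back should
-- compare equal to the original pointer.
ptrRoundTrips :: IO Bool
ptrRoundTrips = alloca $ \p -> do
  let q = p :: Ptr Int
  return (wordPtrToPtr (ptrToWordPtr q) == q
            && intPtrToPtr (ptrToIntPtr q) == q)
```

The character casts only preserve code points that fit into a single C char, so the round trip is not expected to hold for characters above '\255'.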