Version 9 (modified by john@…, 8 years ago)

--

Allocating space in the bss and data segments from Haskell

Allocating data in the bss or data segment currently requires an external C file, even though no code at all is emitted and the object file simply contains a linker directive to reserve some space. This is a deficiency in the current FFI spec.
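For illustration, the external C file in question can be as small as the sketch below. The symbol names `scratch` and `primes` are hypothetical; the point is that the file holds only data definitions and no functions.

```c
#include <stdint.h>

/* No initializer: a zero-filled object, normally placed in the bss segment. */
uint32_t scratch[256];

/* Explicit initializer: normally placed in the data segment. */
uint16_t primes[4] = { 2, 3, 5, 7 };
```

The Haskell side would then bring each address into scope with an ordinary address import, for example `foreign import ccall "&scratch" scratch :: Ptr Word32`.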

Proposal (experimental in jhc)

Allow declarations of the form

foreign space [const] <n> :: Ptr <type>

where <n> is the number of elements to allocate (default 1) and <type> is a basic type or a renaming thereof.

The space allocated will be n * sizeof(type) bytes, with sizeof as specified by the Storable class. User-defined types (other than a simple newtype or type renaming of a built-in type) may not be used.

If the type is 'forall a . Ptr a', the element size is assumed to be one byte.
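As a rough sketch of the size rule, the C definitions below reserve what a few such declarations would. This assumes that Word32 corresponds to `uint32_t` and Double to `double`, and that Storable's sizeOf agrees with C's sizeof for these types. The symbol names are made up for the example.

```c
#include <stdint.h>

/* foreign space 256 :: Ptr Word32   -- 256 * sizeof(uint32_t) bytes */
uint32_t words[256];

/* foreign space :: Ptr Double       -- <n> omitted, so one element */
double one_double[1];

/* foreign space 64 :: Ptr a         -- polymorphic pointer, one byte per element */
unsigned char raw[64];
```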

If 'const' is specified, it is an assertion that the contents of that memory will never change. The Haskell compiler may rely on this, and the data may be placed in the read-only data segment, which is shared among processes.
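The closest C counterpart is a `const`-qualified definition, sketched below with hypothetical names. As with the assertion described above, C treats a write to such an object as undefined behaviour, which is what allows the toolchain to put it in a read-only segment.

```c
#include <stddef.h>
#include <stdint.h>

/* Rough equivalent of: foreign space const 8 :: Ptr Word32
   A const object may be placed in a read-only segment (commonly .rodata). */
const uint32_t zeros[8] = { 0 };

/* Reading is always fine; this helper exists only to exercise the data. */
uint32_t sum_zeros(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < sizeof zeros / sizeof zeros[0]; i++)
        total += zeros[i];
    return total;
}
```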

initialized data

Initialized data is trickier. A possible syntax is

foreign space [bigendian|littleendian] [const] <n> :: Ptr <type> = constant

where constant may be one of

  • a value: 3
  • an initialized list: [ 0, 1, 2, ...]
  • a "string", output as UTF-8, UTF-16 or UCS-4 encoded Unicode code points, depending on the type of pointer it is assigned to.
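The three kinds of constant have direct C11 counterparts, shown here as a sketch. The names are hypothetical, and the choice of which Haskell pointer type selects which encoding is left open, as in the text above.

```c
#include <stdint.h>

/* ... = 3              a single value */
int32_t answer[1] = { 3 };

/* ... = [0, 1, 2, 3]   an initializer list */
uint16_t ramp[4] = { 0, 1, 2, 3 };

/* ... = "hi"           a string, encoded to match the element type */
const unsigned char  hi_utf8[]  = "hi";   /* 8-bit code units  */
const uint_least16_t hi_utf16[] = u"hi";  /* 16-bit code units */
const uint_least32_t hi_ucs4[]  = U"hi";  /* 32-bit code units */
```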

If the data is initialized from a string, <n> always counts characters, regardless of encoding, and the string is null-terminated (unless an explicit <n> chops off the terminator).
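C's own string initializers follow a similar rule, which may make the intended behaviour easier to picture. In this sketch (hypothetical names, ASCII text so that characters and 8-bit code units coincide), an explicit length equal to the character count leaves no room for the terminator. This is valid C, though not valid C++, and some compilers can warn about it.

```c
/* <n> omitted: 5 characters plus the terminator, 6 elements in total. */
const char greeting[] = "hello";

/* Explicit <n> of 5: exactly the characters, with the terminator dropped. */
const char tag[5] = "hello";
```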

Big endian or little endian may be specified explicitly, in which case the data is written out with the specified endianness; otherwise it is output in the default byte order of the given system.
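C has no declarative way to fix the byte order of static data, so an equivalent has to spell out the bytes. The sketch below does that with two hypothetical macros, `BE32` and `LE32`, and gives a sense of what a compiler would emit for each annotation.

```c
#include <stdint.h>

/* Expand a 32-bit constant into four bytes in a fixed order. */
#define BE32(x) (uint8_t)((x) >> 24), (uint8_t)((x) >> 16), \
                (uint8_t)((x) >> 8),  (uint8_t)(x)
#define LE32(x) (uint8_t)(x),         (uint8_t)((x) >> 8), \
                (uint8_t)((x) >> 16), (uint8_t)((x) >> 24)

/* The same value laid out both ways, independent of the host byte order. */
const uint8_t magic_be[4] = { BE32(0x12345678u) };
const uint8_t magic_le[4] = { LE32(0x12345678u) };
```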

implementation

Implementation is trivial once the new constructs can be parsed. The syntax is purposefully similar to existing Haskell constructs, so the lexer and parser need no changes beyond a new rule. These declarations translate immediately into equivalent C, C--, or assembly linker directives.
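To give a sense of how small the translation step is, here is a sketch of a back end routine that turns one parsed, uninitialized declaration into a C definition. The `space_decl` record and `emit_c_decl` function are hypothetical and are not taken from jhc; in particular the symbol name field is an assumption, since the grammar above does not show where a name would go.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for one parsed 'foreign space' declaration. */
struct space_decl {
    const char *name;    /* symbol name (assumed field) */
    const char *c_type;  /* C spelling of <type>, e.g. "uint32_t" */
    size_t      count;   /* <n>, already defaulted to 1 if omitted */
    int         is_const;
};

/* Write the equivalent C definition into out; returns snprintf's result. */
int emit_c_decl(char *out, size_t cap, const struct space_decl *d)
{
    return snprintf(out, cap, "%s%s %s[%zu];",
                    d->is_const ? "const " : "",
                    d->c_type, d->name, d->count);
}
```

For a declaration of 256 elements of `uint32_t` named `scratch`, this produces the text `uint32_t scratch[256];`.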

caveats

It is annoying that <n> must be a constant and <type> must be a built-in, but there is not really any other recourse without defining a preprocessor in Haskell or a staged system like Template Haskell. However, CPP or a preprocessor like hsc2hs will mitigate these problems, and the situation is no worse (and somewhat better) than having to link against an external C library.

A possible extension would be to allow implementations to derive instances of Storable and allow types with such derived instances to be used in foreign space declarations too.

Another possibility is to define 'manifestly constant' data as declarations of the form

name :: built-in-type
name = <constant>
  • or sizeof a builtin
  • or 'foo <op> bar' where foo and bar are manifestly constant and op is a basic operation.

Then allow such manifestly constant values for <n>, and allow types whose sizeof is manifestly constant to be used in foreign space declarations.
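C's integer constant expressions are a close analogue of this idea and show the three forms side by side. The names in this sketch are hypothetical.

```c
#include <stdint.h>

enum {
    HEADER_BYTES = 16,                            /* name = <constant>  */
    WORD_BYTES   = sizeof(uint32_t),              /* sizeof a builtin   */
    FRAME_BYTES  = HEADER_BYTES + 4 * WORD_BYTES  /* foo <op> bar       */
};

/* The array length is computed entirely at compile time. */
uint8_t frame[FRAME_BYTES];
```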

However, this is probably a lot of work for a problem that has better workarounds, unless other uses for manifestly constant data are found.