Newtypes with hidden constructors cannot be passed as FFI arguments

changed weight to 5

I can see merit in Mkhail's point of view.

Example

module Colours( Colour, red, draw ) where

newtype Colour = C Int

red :: Colour
red = 7

foreign import "draw" :: Colour -> IO ()

The question is whether this should be legal

module Foo where
import Colours
foriegn import "draw2" :: Colour -> IO ()

That is, a client of Colours can call a draw2 function that unpacks the Colour. Or should the abstract data type Colour be restricted to the functions (foreign or otherwise) that Colours exports?

I don't have a strong opinion. Did we make the change in response to a bug report? Ie was someone arguing the opposite?

Simon

The Haskell 2010 report says:

the constructor N is visible where T is used
[...]
Consequently, in order for a type defined by newtype to be used in a
foreign declaration outside of the module that defines it, the type
must not be exported abstractly.

This was fixed in response to #3008 (closed).

Incidentally, the report also says:

The module Foreign.C.Types that defines the Haskell equivalents
for C types follows this convention

but I don't understand what that means, as those types are exported abstractly. Perhaps they shouldn't be?

I think GHC is correct now. This was discussed a long time ago on the FFI list (no link handy, sorry), and I think the principle is that in order to use a newtype in an FFI declaration you have to know its representation, because you're linking against some known C code that depends on the representation. So if we allowed that to happen with an abstract newtype, it would constitute leakage of the abstraction.

We ought to change Foreign.C.Types to export the types non-abstractly.

Replying to [ticket:5529#comment:55323 simonmar]:

I think GHC is correct now. This was discussed a long time ago on the FFI list (no link handy, sorry), and I think the principle is that in order to use a newtype in an FFI declaration you have to know its representation, because you're linking against some known C code that depends on the representation.

That's not exactly correct. The C code depends on the particular way the argument is represented (e.g. passing Int64 instead of Int32 is a bad idea), but what GHC checks is a general "representability" (Int64/Int32/Float/..., it doesn't matter as long as it fits the calling convention) of the type. Constructor exporting is just an overkill for that task. What we need is a way to mark (with corresponding C types) appropriate newtypes as being representable (e.g. automatically by the compiler in module interfaces, or by requesting derivation of a special type class) without forcing users to spill the guts and when invent conventions for not hurting themselves ("yes, we export the internals, but never use them in your code, they are here only to pass FFI checks").

Replying to [ticket:5529#comment:55352 mikhail.vorozhtsov]:

That's not exactly correct. The C code depends on the particular way the argument is represented (e.g. passing Int64 instead of Int32 is a bad idea), but what GHC checks is a general "representability" (Int64/Int32/Float/..., it doesn't matter as long as it fits the calling convention) of the type. Constructor exporting is just an overkill for that task. What we need is a way to mark (with corresponding C types) appropriate newtypes as being representable (e.g. automatically by the compiler in module interfaces, or by requesting derivation of a special type class) without forcing users to spill the guts and when invent conventions for not hurting themselves ("yes, we export the internals, but never use them in your code, they are here only to pass FFI checks").

I don't understand. Using a newtype in a foreign call exposes its representation, because it only works if you match the representation type with the type expected by the C function. So marking newtypes as "representable" doesn't help - you really have to know the representation.

Yes, that's why I said that types should be marked "with corresponding C types" (maybe "tagged" would be a better word). When a programmer "matches" newtypes from other Haskell libraries with the types of arguments of the C function he wants to import, he does it by name. I don't need to know how exactly, say, CTime is represented to "match" it with time_t, compiler does. My point is that while exporting constructors does expose representation to GHC's FFI checking code, it also exposes it to programmers that use the module. And I don't think that the former should require/force the latter. I want to have control over the export list, compiler can use other means of making representation accessible to it's own code.

The point is, if you FFI import a C function that takes a time_t, then you can't pass it a MyType unless you know that MyType is represented by CTime (e.g. newtype MyType = MyType CTime). If MyType is abstract then you don't know that.

Once again, I, as a programmer, am completely fine if the module documentation says that MyType mirrors time_t, exported constructor doesn't help me a bit (one way to ensure that would be by writing something like let unused = MyType (undefined :: CTime), but what about newtype MyType = MyType Int64 (assuming time_t ~ int64_t), how would you check all possibilities?). GHC doesn't ensure that FFI import's types are right (with respect to the C types), programmer does. Well, exporting guts may be seen as a sort of documentation, but I'd rather not encourage/require that practice.

To sum things up.

Exported constructors do not help programmers to get the FFI import's types right, as GHC provides no way to (automatically) assert that type X is a wrapper (using arbitrary number of layers) around "primitive" type which is equivalent to the desired C type y_t.
GHC's FFI implementation does need to know the representations of the types used in an FFI import. But this information does not necessarily need to be passed around in the form of exported constructors. Store it somewhere else in the module interface and leave the export list to programmers.

Mikhail: this is not about whether GHC itself needs to know the representation internally, indeed typically GHC does have the full representation available even if the constructor is exported abstractly (at least when -O is on). The question here is whether GHC should require that the constructor is exported non-abstractly in order to be used in an FFI declaration. So I think point (2) above is not the issue.

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

FYI, here's the previous discussion from the haskell-prime list:

http://www.haskell.org/pipermail/haskell-prime/2009-February/002726.html

it addresses most of the points we've covered here.

Replying to [ticket:5529#comment:55434 simonmar]:

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

Yes. And, more importantly, the programmer needs a way to ensure that CTime == time_t automatically at compile time, because we can't really expect people to recheck that equality by hand every time a new version of the library is released/used. This hypothetical machinery must be able to handle multiple levels of wrapping, like CTime -> X1 -> X2 -> Int64. My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

Replying to [ticket:5529#comment:55435 mikhail.vorozhtsov]:

Replying to [ticket:5529#comment:55434 simonmar]:

I think your point (1) is saying that even if we know that CTime == Int64, what the programmer actually needs to know is that CTime == time_t, correct?

Yes. And, more importantly, the programmer needs a way to ensure that CTime == time_t automatically at compile time, because we can't really expect people to recheck that equality by hand every time a new version of the library is released/used. This hypothetical machinery must be able to handle multiple levels of wrapping, like CTime -> X1 -> X2 -> Int64.

This goes way beyond the original ticket, and I must admit I'm having trouble following your arguments.

The fact remains that allowing an abstract newtype to be used in an FFI declaration breaks the abstraction, so we can't allow that. If you want to propose an alternative mechanism that would fix this problem, please go head - but I suggest this ticket is not the right place (a wiki page would be better).

My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

But you want to give clients the ability to call arbitrary foreign functions that return the type! Allowing them to construct instances in Haskell is no worse than that.

This goes way beyond the original ticket, and I must admit I'm having trouble following your arguments.

My point is that, given the current level of integration between GHC and C compilers, exporting constructors does not make programmers more confident in their FFI imports signatures.

My other (original) concern is that exporting a constructor does not only reveal the representation of the type, but also, obviously, gives third party non-FFI code a way to construct instances of the type. For CTime it may be ok, but think about flag types, which are allowed to have only limited number of bits set (or, even worse, only limited number of combinations of set bits).

But you want to give clients the ability to call arbitrary foreign functions that return the type! Allowing them to construct instances in Haskell is no worse than that.

I disagree. It's like saying that, since we provide a way to construct instances via unsafeCoerce, we should just export everything. To me FFI is like unsafeCoerce, you can break things with it, but it requires a certain amount of conscious effort to do so. Then the programmer sees unsafeCoerce or "foreign import", he naturally checks things twice. On the other hand, constructing "incoherent" instance using the newtype constructor is just way too easy.

mentioned in commit fb83cd02

mentioned in commit f2ef82ed

closed

The constructors are now exported for the C* types.

GHC is now following the language definition. If you'd like to propose changing the definition, please discuss it on the Haskell Prime mailing list.

Trac metadata

Trac field	Value
Resolution	Unresolved → ResolvedFixed

mentioned in issue #7204

mentioned in issue #9197

mentioned in issue #11983 (closed)

mentioned in commit a28611b1

added Pnormal label

Trac field	Value
Version	7.3
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture

Newtypes with hidden constructors cannot be passed as FFI arguments

Child items 0

Activity