Opened 5 years ago

Last modified 2 months ago

#8281 new bug

The impossible happened: primRepToFFIType

Reported by: tibbe Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.6.2
Keywords: Cc: simonmar, angerman
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time crash Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D3619
Wiki Page:

Description

I ran into this error while trying to use GHCi on the hashable package:

$ cabal repl
Preprocessing library hashable-1.2.0.10...
GHCi, version 7.6.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package bytestring-0.10.0.2 ... linking ... done.
Loading package text-0.11.3.1 ... linking ... done.
Loading object (static) dist/build/cbits/fnv.o ... done
Loading object (static) dist/build/cbits/getRandomBytes.o ... done
final link ... done
[1 of 4] Compiling Data.Hashable.RandomSource ( Data/Hashable/RandomSource.hs, interpreted ) [flags changed]
[2 of 4] Compiling Data.Hashable.Class ( Data/Hashable/Class.hs, interpreted ) [flags changed]
ghc: panic! (the 'impossible' happened)
  (GHC version 7.6.2 for x86_64-apple-darwin):
	primRepToFFIType

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

Change History (39)

comment:1 Changed 5 years ago by monoidal

Confirmed in HEAD. To reproduce, save this file as Class.hs and run ghci Class (not merely ghc).

{-# LANGUAGE ForeignFunctionInterface, MagicHash, UnliftedFFITypes #-}

module Data.Hashable.Class () where

import GHC.Prim (ThreadId#)
import Foreign.C.Types (CInt(..))

foreign import ccall unsafe "rts_getThreadId" getThreadId :: ThreadId# -> CInt

comment:2 Changed 5 years ago by simonpj

Cc: simonmar added

I see what is happening here, but need help from Simon M or anyone else. Here's what I learned:

  • ThreadId# is represented by a heap pointer; it's "rep" is PtrRep. But it is a primitive type; i.e. not implemented in Haskell.
  • The totally-undocumented flag -XUnliftedFFITypes seems to allow all of GHC's primitive types to be marshalled across the FFI.
  • These primitive types include Array#, MutableArray#, ByteArray# and MutableByteArray#, all of which also have PtrRep.
  • Marshalling these pointer types seems utterly wrong for a safe call, where GC might occur during the call, thereby moving the array. (Or maybe arrays never move? If so, this is un-documented.)
  • The actual crash comes in LibFFI.hsc, in primRepToFFIType which is used (via prepForeignCall) only by the bytecode generator, to prepare the foreign call arguments. primRepToFFIType barfs if it gets a PtrRep.
  • How does it work for Array# and friends? Because have a special shim in ByteCodeGen.generateCCall, which adds an offset to get past the array header, and then says AddrRep! Hence primRepToFFIType sees an AddrRep.
  • There is no such special treatment for ThreadId#, hence the crash.
  • The base-library module GHC.Conc.Sync has precisely the rts_getThreadId import as in the tiny test case above, but works(just) because it is compiled.

My conclusion:

  • These PtrRep things should not be allowed in safe foreign calls.
  • In primRepToFFIType we should allow PtrRep

However I'm not certain about the last of these because I don't understand how foreign calls work in the bytecode interpreter.

Simon M: do you agree?

Simon

comment:3 Changed 5 years ago by simonmar

We used to use UnliftedFFITypes quite a lot in the IO library in the base package, but I just looked and we use it very little now (only for ThreadId#). It does have a brief entry in the flags reference in the user's guide, but no proper documentation. As far as I recall we didn't intend it to be an advertised feature, so I'm not sure why it appears in the docs at all.

So what's going wrong here is that we have a foreign call that takes a ThreadId#. The foreign call is marked unsafe, because (as you note, Simon) it couldn't work if it was safe. But GHCi only knows how to compile safe foreign calls - it ignores unsafe - so there's no way GHCi can compile this code such that it works.

What I would like to do is get rid of UnliftedFFITypes and use foreign import prim instead. The example above is a good use for foreign import prim, and indeed we should change GHC.Conc.Sync to do it that way (where the example above was copy/pasted from, incidentally).

But there are some places where UnliftedFFITypes is really useful, e.g.:

foreign import ccall unsafe "memcpy"
    memcpy_freeze :: MutableByteArray# s -> MutableByteArray# s -> CSize
           -> IO (Ptr a)

from the array package. To do this with foreign import prim would mean another function call.

So I think we have little choice here.

  • disallow passing boxed-but-unlifted types to safe foreign calls, except for arrays. This error would also trigger in GHCi for an unsafe call, because GHCi compiles unsafe calls as safe calls. Hence the above code would be rejected by GHCi, but accepted by GHC.
  • document UnliftedFFITypes, and explain the pitfalls: not fully supported by GHCi, and be careful passing arrays to safe calls (they must be pinned).

comment:4 Changed 5 years ago by errge

I met this today and came up with the following temporary workaround:

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnliftedFFITypes #-}
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Concurrent
import GHC.Conc.Sync
import Foreign.C
import GHC.Base

foreign import ccall unsafe "rts_getThreadId" getThreadId# :: Addr# -> CInt

getThreadId :: ThreadId -> CInt
{-# INLINE getThreadId #-}
getThreadId (ThreadId tid) = getThreadId# (unsafeCoerce# tid)

threadId :: IO Int
{-# INLINE threadId #-}
threadId = do
  mtid <- myThreadId
  return $ fromIntegral $ getThreadId mtid

main = do
  print =<< threadId
  forkIO $ print =<< threadId
  threadDelay 10000

Seems to work with ghci and 32/64-bit compiled and 32/64-bit optimized code.

Total noob in this area, please tell me if this is dangerous somehow.

comment:5 in reply to:  3 Changed 5 years ago by simonpj

Summary: I think we have a plan but there are niggly details. Help needed from Simon M.

Replying to simonmar:

We used to use UnliftedFFITypes quite a lot in the IO library in the base package, but I just looked and we use it very little now (only for ThreadId#). It does have a brief entry in the flags reference in the user's guide, but no proper documentation. As far as I recall we didn't intend it to be an advertised feature, so I'm not sure why it appears in the docs at all.

I don't mind it being advertised as dangerous, but whoever is using it needs to know what the spec it. If the spec isn't in the user manual it should be on the wiki. But in the user manual is better (more findable, more robust), surrounded with caveats perhaps.

Ditto foreign import prim. This does have a user manual entry (8.1.3), but it refers to an unspecified place in the wiki. I believe that the intended documentation is here, but again I'd prefer the docs to be in the user manual; then it stays with that particular version of the compiler.

What I would like to do is get rid of UnliftedFFITypes and use foreign import prim instead. The example above is a good use for foreign import prim, and indeed we should change GHC.Conc.Sync to do it that way (where the example above was copy/pasted from, incidentally).

Why is foreign import prim better? The documentation (such as it is) says nothing about how things are marshaled, or what types are legal. Perhaps all types are legal and are passed without any sort of conversion or marshaling. (Eg a MutableByteArray# would be passed as a pointer to the array object, not a pointer to the data payload of the object, as happens with foreign import unsafe.

Is foreign import prim by-construction unsafe? That is, it does not generate any save/restore for STG registers etc, and the Cmm code should never block or cause GC. It's just a "fat machine instruction". Is that right? What if you say foreign import prim safe?

Presumably to make GHC.Conc.Sync use foreign import prim we'd need to call a function (with a different name) defined in some .cmm file, not rts_getThreadId defined in Threads.c?

But there are some places where UnliftedFFITypes is really useful, e.g.:

foreign import ccall unsafe "memcpy"
    memcpy_freeze :: MutableByteArray# s -> MutableByteArray# s -> CSize
           -> IO (Ptr a)

from the array package. To do this with foreign import prim would mean another function call.

Not only that, but presumably it is also sometimes convenient to pass an Int# because that is what is in your hand, rather than box it up only for the FFI to unbox it?

And it's fine to use Int# as an argument to a safe foreign call. But this is plain wrong:

foreign import ccall safe f :: ByteArray# -> IO Int

It should only work for unsafe. Or are you suggesting that it should be accepted, but may fail badly at runtime if the ByteArray# is not pinned, which is not statically checkable?

You propose:

  • disallow passing boxed-but-unlifted types to safe foreign calls, except for arrays. This error would also trigger in GHCi for an unsafe call, because GHCi compiles unsafe calls as safe calls. Hence the above code would be rejected by GHCi, but accepted by GHC.

Alternatives:

  • Disallow passing boxed-but-unlifted types to safe foreign calls altogether. If you want to do that, use foreign import prim (which is by-construction unsafe).
  • Disallow passing boxed-but-unlifted types to safe foreign calls, except for arrays (on the grounds that arrays may be pinned).

You suggest the latter, but I'd prefer the former, because it's more clear-cut. Moreover the above only covers safe foreign calls. I think the rules for unsafe foreign calls should be the same as foreign import prim. Do you agree?

  • document UnliftedFFITypes, and explain the pitfalls: not fully supported by GHCi, and be careful passing arrays to safe calls (they must be pinned).

Yes. And document foreign import prim better.

Plus

  • Use foreign import prim for rts_getThreadId in GHC.Conc.Sync

I'd really like to get this squared away. Simon M: it might be quicker for you to just do this than to explain to me how to do it. Or perhaps you can answer all my stupid questions and someone else can do it.

Thanks

Simon

comment:6 Changed 4 years ago by thomie

Milestone: 7.10.1

comment:7 Changed 4 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:8 Changed 3 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:9 Changed 3 years ago by phadej

This happens with 8.0.1-rc1 still.

comment:10 Changed 3 years ago by thomie

Priority: normalhigh

comment:11 Changed 3 years ago by simonpj

It would be great if someone picked this up. There is helpful diagnosis above, but it does need someone to take a careful look and propose something concrete.

comment:12 Changed 3 years ago by bgamari

Milestone: 8.0.18.2.1

Not happening for 8.0.1.

comment:13 Changed 2 years ago by bgamari

Milestone: 8.2.18.4.1

Nor will this happen for 8.2.

comment:14 Changed 21 months ago by winter

I'd really like to see some documents on how UnliftedFFITypes interact with GC and an unsafe FFI call.

According to my understanding, passing ByteArray# is safe here because GC can't happen during an unsafe call. Am i right on this? Or perhaps RTS do some thing else to prevent GC move the ByteArray# during FFI when UnliftedFFITypes enable?

comment:15 Changed 21 months ago by simonmar

@winter it's always wrong to pass an unpinned ByteArray# to a foreign call, regardless of whether the call is annotated as safe or unsafe, because GHC is free to implement an unsafe call as a safe call. Indeed we do this in GHCi.

comment:16 Changed 21 months ago by rwbarton

Wow, that's news to me! Does that apply to foreign import prim as well?

comment:17 Changed 21 months ago by simonmar

No, it doesn't apply to foreign import prim. Think of foreign import prim as declaring a primop: in principle we could implement all our primops using foreign import prim (except that we want to optimise by generating code directly for some of them).

comment:18 in reply to:  15 Changed 19 months ago by winter

Replying to simonmar:

@winter it's always wrong to pass an unpinned ByteArray# to a foreign call, regardless of whether the call is annotated as safe or unsafe, because GHC is free to implement an unsafe call as a safe call. Indeed we do this in GHCi.

But lots of packages(text, for example) seems to be relying on this assumption. Like your example suggest, text also implemented its copy with FFI memcpy rather than a copyByteArray#, and small texts are no way pinned.

Last edited 19 months ago by winter (previous) (diff)

comment:19 Changed 18 months ago by bgamari

text is used nearly everywhere at this point; the fact that it gets this wrong is a rather scary revelation; we really need to make this expectation better known.

However, I am a bit worried that the inability to pass ByteArray#s to foreign calls may put us in a bit of a pickle performance-wise. For instance, currently text uses a C helper, hs_text_decode_utf8, to implement equality on Text (see Data.Text.Array.equal). Not only would it be harder to write the equivalent C-- implementation, but you would also be limited to the optimisation capabilities of the GHC backend, which might hurt for a tight loop such as this.

What is the rationale for allowing GHC to implement unsafe calls as safe calls? It seems like this puts library authors is a tough spot.

Last edited 18 months ago by bgamari (previous) (diff)

comment:20 Changed 18 months ago by simonmar

@bgamari I understand the concerns and I agree.

To answer your question about the rationale, the idea is that "safe" is the default and "unsafe" relaxes the compiler's obligations only. It doesn't add new obligations (such as the requirement not to interrupt the call with a GC). Implementing that obligation in GHCi might be possible, but we haven't done it.

One way around this would be to add a new annotation to foreign calls that requires the call not to be interrupted by GC, e.g. foreign import unsafenogc ... or something. This would not be implemented by GHCi (yet?) so compilation would fail if you tried to load text into GHCi.

Incidentally it's a bad idea to make one of these calls that might run for an arbitrarily long time, because if a GC strikes everything is blocked until the call returns. This has been a rich source of performance bugs in our system at Facebook. I don't think we've encountered problems with text, but we've had to fix other libraries (e.g. regex-pcre) to turn unsafe calls into safe calls. It's possible that the unsafe calls in Text would cause problems when working with very long strings.

comment:21 Changed 18 months ago by simonpj

I see that the docmentation on foreign import unsafe is thin in the extreme. The user manual section does not mention unsafe; while the relevant section of the Haskell 2010 report has only a single cryptic sentence about "call backs into the Haskell system". It would be good to document it better, perhaps in the user manual.

A good example is this thread: can you pass an unpinned bytearry to a foreign call? I don't think we have a shred of documentation about this.

At one point we thought of unsafe foreign calls as a "fat machine instruction". GHC's obligations are simpler because GC cannot be invoked by the call, or thread-switching. Nor, I think, should it block (because then GC might happen while it was blocked). So it could be used for things like cosine that might be implemented out of line, but morally are like a machine instruction.

I think Simon's unsafenogc gets closer to the "fat machine instruction" idea. It really must be used only for short-running calls.

comment:22 Changed 18 months ago by winter

This is a very important technique to write high performance array code(we can't put everything related to ByteArray into ghc-prim after all). I'd like to get it fixed in GHCi and document it(include caveats) . what is stopping GHCi supporting unsafe FFI calls?

comment:23 Changed 18 months ago by winter

This is a very important technique to write high performance array code(we can't put everything related to ByteArray into ghc-prim after all). I'd like to get it fixed in GHCi and document it(include caveats) . what is stopping GHCi supporting unsafe FFI calls?

Version 0, edited 18 months ago by winter (next)

comment:24 Changed 18 months ago by bgamari

what is stopping GHCi supporting unsafe FFI calls?

If I understand Simon correctly, the answer is probably "nothing". However, it's more a question of whether we want to change the semantics of "unsafe" foreign calls (something which would require the involvement of the Haskell Prime committee, if to be done properly). I can see the argument for unsafe not placing any new obligations on the compiler and think there's little reason to push for such a change.

I do think, however, that it is important that we have a unsafenogc call type for the reason you describe.

comment:25 in reply to:  24 Changed 18 months ago by winter

Replying to bgamari:

what is stopping GHCi supporting unsafe FFI calls?

If I understand Simon correctly, the answer is probably "nothing". However, it's more a question of whether we want to change the semantics of "unsafe" foreign calls (something which would require the involvement of the Haskell Prime committee, if to be done properly). I can see the argument for unsafe not placing any new obligations on the compiler and think there's little reason to push for such a change.

I do think, however, that it is important that we have a unsafenogc call type for the reason you describe.

That's a reasonable solution. But as simon pointed out, we're too vague on unsafe FFI calls before, it's not a big deal to give a more clear semantics, is it? After all we have been relying that semantics for years, it's more costly to add new FFI keyword and change all the libraries.

comment:26 Changed 18 months ago by bgamari

But as simon pointed out, we're too vague on unsafe FFI calls before, it's not a big deal to give a more clear semantics, is it?

I absolutely agree that the documentation should be improved either way. It's possible that #13730 is also being bit by this same confusion.

comment:27 Changed 18 months ago by simonmar

I suggested to @bgamari today that it might be possible to support unsafe foreign calls in GHCi with the same behaviour as compiled code (i.e. no GC during the call guaranteed) by avoiding the suspendThread() / resumeThread() calls that the interpreter makes in the bci_CCALL byte code implementation. We'd need to include a flag with the byte code to indicate that the call is unsafe, but there's already a flag being passed for interruptible so there's room for another flag in the same word.

comment:28 Changed 17 months ago by bgamari

Differential Rev(s): Phab:D3619
Status: newpatch

There is a currently-untested patch for this in Phab:D3619.

comment:29 Changed 17 months ago by Ben Gamari <ben@…>

In 9ef909d/ghc:

Allow bytecode interpreter to make unsafe foreign calls

Reviewers: austin, hvr, erikd, simonmar

Reviewed By: simonmar

Subscribers: rwbarton, thomie

GHC Trac Issues: #8281, #13730.

Differential Revision: https://phabricator.haskell.org/D3619

comment:30 Changed 17 months ago by winter

Thank you! I suggest to add some document in manual as well.

comment:31 Changed 17 months ago by bgamari

How does Phab:D3682 look?

comment:32 Changed 17 months ago by Ben Gamari <ben@…>

In 7de2c07d/ghc:

users-guide: Document FFI safety guarantees

Test Plan: Read it

Reviewers: austin

Subscribers: simonmar, rwbarton, thomie

GHC Trac Issues: #13730, #8281

Differential Revision: https://phabricator.haskell.org/D3682

comment:33 Changed 17 months ago by bgamari

Status: patchnew

The GHCi FFI safety issue has been resolved (for 8.4) with comment:29, but I believe the original issue remains unsolved.

comment:34 Changed 17 months ago by winter

How does ​Phab:D3682 look?

It's good, thank you Ben!

comment:35 Changed 13 months ago by angerman

Cc: angerman added

comment:36 Changed 10 months ago by bgamari

Milestone: 8.4.18.6.1

Bumping to 8.6.

comment:37 Changed 8 months ago by bgamari

Milestone: 8.6.1
Priority: highnormal

Bumping down in priority and removing milestone as this has been open for several years with no progress.

comment:38 Changed 8 months ago by bgamari

The plan:

  • Work out if and how the issue can now be reproduced
  • Document UnliftedFFITypes

comment:39 Changed 2 months ago by hvr

Fwiw, re documenting UnliftedFFITypes for unpinned ByteArray#s I (re)stumbled over this old authoritative sounding email (https://mail.haskell.org/pipermail/haskell-cafe/2014-June/114761.html) from Johan Tibell which stated

There is a way to pass an unpinned ByteArray# (or MutableByteArray#, but the former seems right in your case) to a foreign call, using the UnliftedFFITypes language extension. The ByteArray# is guaranteed to not to be moved for the duration of the call. The code should treat the ByteArray# argument as if it was a pointer to bytes. You will need to do any address offset computations on the C side (i.e. pass any offsets you need as extra argument to your C function).

...which might explain why there's a lot of code out there (including my own) which relies on that guarantee to be upheld (including for safe FFI calls).

Note: See TracTickets for help on using tickets.