#13237 closed feature request (fixed)

Extend TH with addModCStub

Reported by: facundo.dominguez
Owned by: facundo.dominguez
Priority: normal
Milestone: 8.2.1
Component: Template Haskell
Version: 8.0.1
Keywords: inline-c
Cc: mboes, simonpj, goldfire, nh2, bitonic
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #13269
Differential Rev(s): Phab:D3106
Wiki Page:

Description (last modified by facundo.dominguez)

inline-c could benefit from the ability to tell the compiler to include some code in the object file of the current module. https://github.com/fpco/inline-c/issues/21

This way, a module FFI.hs using inline-c doesn't need to produce a file FFI.c with C code; instead, the code can be built and included in the object file of the module.

To this end, the following TH function would be needed:

addModCStub :: String -> Q ()

which would indicate to the compiler that the C code in the given string needs to be built and included in the object file of the current module.
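
As a rough illustration of the intended use, here is a minimal sketch. It does not use the proposed name addModCStub. Instead it assumes addForeignSource and ForeignSrcLang from Language.Haskell.TH.Syntax, which appear to be the form this capability takes in later template-haskell releases; treat that mapping as an assumption. The C function th_stub_add is hypothetical.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Foreign.C.Types (CInt (..))
-- Assumption: addForeignSource stands in for the proposed addModCStub.
import Language.Haskell.TH.Syntax (ForeignSrcLang (LangC), addForeignSource)

-- Top-level declaration splice: registers C source to be compiled and
-- linked into this module's object file. It adds no Haskell declarations.
$(do
    addForeignSource LangC $ unlines
      [ "int th_stub_add(int a, int b) {"  -- hypothetical C function
      , "  return a + b;"
      , "}"
      ]
    return [])

-- Ordinary FFI import of the function defined in the registered C source.
foreign import ccall unsafe "th_stub_add"
  c_add :: CInt -> CInt -> CInt

main :: IO ()
main = print (c_add 2 3)
```

No separate .c file has to be listed in the Cabal file for this module; the C source travels with the Haskell module that registers it.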

Change History (20)

comment:1 Changed 21 months ago by facundo.dominguez

Description: modified (diff)

comment:2 Changed 21 months ago by facundo.dominguez

Cc: simonpj goldfire added
Owner: set to facundo.dominguez

comment:3 Changed 21 months ago by facundo.dominguez

Differential Rev(s): Phab:D3106
Status: new → patch

comment:4 Changed 21 months ago by facundo.dominguez

Description: modified (diff)

comment:5 Changed 21 months ago by nh2

Cc: nh2 added

comment:6 Changed 21 months ago by mboes

To clarify the context, implementing this feature would remove a long-standing usability problem for inline-c. That package generates C code, which then has to be linked with all the other object code from a project. Since GHC doesn't know about this, we have to lie to Cabal by listing the generated C file as an extra source file.

But that's a very gross hack, only meant to work around the fact that GHC has no direct support for appending arbitrary object code to the output of the current compilation unit. In essence, what inline-c is doing is no different from what GHC already needs to do when compiling modules with foreign exports or foreign import wrappers: it creates C stubs whose object code must be included with the rest of the object code for the module.

There are now at least a dozen open source libraries that depend on inline-c, and likely many more closed source ones. But the current hack explained above causes problems: listing a generated file as a source file tends to confuse GHC's and Stack's recompilation checking. Worse, in some cases the generated C source file causes the build to fail. More details here: https://github.com/fpco/inline-c/issues/21.

Being able to programmatically register C stubs in the current compilation unit would solve both problems at once.

comment:7 Changed 21 months ago by bitonic

Cc: bitonic added

comment:8 Changed 21 months ago by nh2

This would be a very welcome change!

comment:9 Changed 21 months ago by nh2

One question about the implementation:

Do we have to be careful with recompilation avoidance / reproducible builds / interface hashes here? E.g. if the String passed to addModCStub is different between two builds, should that invalidate any modules further down? Or could, through recursive tracing of includes, the behaviour of a module change when the module is loaded from TemplateHaskell, so that e.g. the entire preprocessor-expanded C code should make its way into an interface hash?

(Note I haven't actually analysed if it's possible for the generated C code to be used in TemplateHaskell later in the same compilation pass, just some ideas that went through my head.)

comment:10 Changed 21 months ago by facundo.dominguez

I'd say that if the string changes, programs and libraries would need to be relinked. But I don't see now why a downstream module would need to be rebuilt.

comment:11 Changed 20 months ago by nh2

What I had in mind is the following:

You have a module that exposes some function f that is implemented in C. You have another module that contains a TH function t that calls f at compile time. And a third module that uses $(t) to generate some code.

Then the third module has to be recompiled when the C code or its includes change.

Of course this question depends on whether the scenario above is even already possible. To my knowledge with inline-c it is not currently possible, but I might be wrong.

comment:12 Changed 20 months ago by rwbarton

Even if that scenario is not possible already, it will be possible with this change, so it should be addressed.

Anyways, I think it's an equivalent scenario to one in which the implementation of an ordinary function has changed, but in a way that doesn't affect any information that appears in the interface file (e.g. its type or unfolding); and then that function is used in a TH splice in another module. I assume that one of the interface file hashes takes this into account, so it should also take the addModCStub calls into account. (Or possibly we don't handle this scenario correctly at present.)

comment:13 Changed 20 months ago by facundo.dominguez

As far as I've tested, changing spaces in irrelevant places forces recompilation of files which import the module when the files use TH.

I'd say in the current state of affairs there is no danger of ignoring a change, but rather unnecessary recompilation might happen.

Regarding includes in the C code that change: such a change won't be detected, just as a change to a file read with runIO isn't detected. For these cases, there is addDependentFile to let GHC know of the dependency.

I'm afraid it is not easy to call addDependentFile in addCStub because identifying the relevant files is difficult if they are conditionally included (i.e. between #if and #endif). On the other hand, it would be ok to ask the frameworks producing the C code to call addDependentFile as necessary.
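
A framework that generates the C code could follow that suggestion roughly as sketched below. The helper addCStubWithDeps is hypothetical, and addForeignSource is assumed to stand in for the addCStub discussed here. addDependentFile is the existing function from Language.Haskell.TH.Syntax.

```haskell
module CStubDeps (addCStubWithDeps) where

import Language.Haskell.TH.Syntax
  (ForeignSrcLang (LangC), Q, addDependentFile, addForeignSource)

-- | Hypothetical helper: the caller lists the headers it knows the C code
-- includes. Each one is registered as a dependency of the module being
-- compiled, and then the C source itself is registered.
addCStubWithDeps :: [FilePath] -> String -> Q ()
addCStubWithDeps headers src = do
  mapM_ addDependentFile headers   -- ask GHC to track these files
  addForeignSource LangC src       -- assumption: stand-in for addCStub
```

A module using it would contain a top-level splice such as `addCStubWithDeps ["header.h"] "#include \"header.h\"\n" >> return []`. The list of headers is supplied by the caller, so conditionally included files are still the caller's responsibility.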

Last edited 20 months ago by facundo.dominguez

comment:14 Changed 20 months ago by nh2

I had an idea in mind on how we might do better for addModCStub than we can for runIO, but it depends on what GHC actually does with the C sources eventually (I don't know what it does):

If GHC invokes a plain gcc or cc or CC or whatever, then all bets are off and we can't do better.

But if it can separate the C preprocessor invocation and the C compiler invocation, then we could fully preprocess the C code (which expands all includes) and hash that into the interface file, allowing us to notice when included header files change. This is better than having to write that logic ourselves in TH based on addDependentFile, because one cannot in TH query what C preprocessor and flags GHC will invoke.

comment:15 Changed 20 months ago by facundo.dominguez

Suppose we compile this module.

module A where

do addCStub "#include \"header.h\""
   return []

How does GHC know to recompile A.hs when header.h changes?

Yes, the hash in A.hi includes the old contents in header.h, but that doesn't seem to cause GHC to notice that header.h changed.

For the use case of inline-c, addCStub doesn't need to do more. The user won't call it directly. But if you know of other use cases it would be great to learn of them.

comment:16 in reply to:  15 Changed 20 months ago by nh2

Replying to facundo.dominguez:

How does GHC know to recompile A.hs when header.h changes?

This could be added with the approach I tried to describe in my last comment:

  • GHC sees addCStub mystring
  • Instead of passing mystring to e.g. gcc to generate an object file, it could pass it to gcc -E to generate a fully preprocessed C source code string mycppdstring that has all #includes expanded. We'd add the hash of mycppdstring to the .hi file.

Please let me know if this makes clearer what I mean.

That's outside of the design goals of your ticket though, so probably this enhancement should go into a different ticket.
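
The preprocess-then-hash idea can be sketched outside GHC as follows. This is only an illustration under stated assumptions: it calls a gcc found on PATH with -E (GHC's actual C toolchain and flags may differ), it uses fingerprintString from base as the hash, and the names preprocess and expandedFingerprint are hypothetical. Storing the result in the .hi file is not shown.

```haskell
module Main where

import GHC.Fingerprint (Fingerprint, fingerprintString)
import System.Process (readProcess)

-- | Run the C preprocessor over a C source string and return the expanded
-- text. Assumes `gcc` is on PATH; "-x c -" reads C from standard input and
-- "-P" suppresses line markers in the output.
preprocess :: String -> IO String
preprocess src = readProcess "gcc" ["-E", "-P", "-x", "c", "-"] src

-- | Fingerprint of the fully preprocessed source, so that a change in any
-- included header changes the result even when the input string does not.
expandedFingerprint :: String -> IO Fingerprint
expandedFingerprint src = fingerprintString <$> preprocess src

main :: IO ()
main = do
  fp <- expandedFingerprint "#define N 3\nint n = N;\n"
  print fp
```

Hashing the expanded text rather than the original string is what lets a header change show up in the hash, which is the property the comment above is after.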

comment:17 Changed 20 months ago by facundo.dominguez

We'd add the hash of mycppdstring to the .hi file.

I follow this far. Now that the hash is in the .hi file, how does GHC know to recompile A.hs the next time header.h changes?

Making a new ticket sounds good.

comment:18 in reply to:  17 Changed 20 months ago by nh2

Replying to facundo.dominguez:

I follow this far. Now that the hash is in the .hi file, how does GHC know to recompile A.hs the next time header.h changes?

Ah, now I get what you're thinking about. No, GHC cannot notice when it has to recompile A.hs when includes change; it would have to be the CPP to notice that (or ask the CPP what files it read, or run the CPP unconditionally). I was suggesting (but you're right, I didn't actually state it) that *when* we recompile A.hs, and notice mystring is unchanged (which would usually result in further compilation being avoided), we can invalidate *downstream* modules of A.hs based on the includes, even when the unexpanded argument mystring of addModCStub doesn't change. Sorry for the confusion.

The value this would add is that you can touch (or save the buffer of) the module that contains the Haskell string to force a proper non-avoided build when (you know that) includes changed, as opposed to e.g. having to stack/cabal clean the entire project.

comment:19 in reply to:  17 Changed 20 months ago by nh2

Replying to facundo.dominguez:

Making a new ticket sounds good.

Will do.

comment:20 Changed 20 months ago by facundo.dominguez

Resolution: fixed
Status: patch → closed