Generalize hi-boot/hi for signatures, to manage intermediate merged interfaces
The problem
In some situations, we need to output multiple interface files for what is morally the same module name.
Example 1: Merging external and home signatures
unit a-sig where
signature A
unit p where
include a-sig
signature A
Compiling p/A.hsig
produces an interface file which contains just
the definitions declared in p
. However, someone including p
should see the merge of the interface of p/A.hsig
AND a-sig/A.hsig
(which was included.)
Example 2: Merging two home signatures
unit p where
signature A
signature B where
import A
...
signature A where
import B
...
What should we do if a signature is specified multiple times in the same
unit? The compilation of each produces a distinct interface, and the
public interface we want to expose is the merge of the two. (And by the
way, what's the source file name of A
, if we are not using the inline syntax?)
Example 3: Merging a signature and a module
unit p where
signature A
module B where
import A
...
module A where
import B
...
A
and B
are mutually recursive, and we want to use a signature file to
break the gap. The signature produces an interface file, only to be
overwritten when we actually define the module proper.
But wait! We have a solution for this example already: the first interface
file for A
is not saved to A.hi
, but A.hi-boot
...
The proposal
I want to take the A.hi-boot
versus A.hi
distinction and
generalize it: we should be able to name intermediate interface
files A.1.hi, A.2.hi, ... and finally A.hi (which
is publically visible outside the unit.) This naming convention applies
to Haskell files too.
User-visible consequences
Every signature file is numbered, and every import of a signature file refers to a specific number. This number is unique among all other modules in a unit which share the same name. For backwards compatibility, some number/file name extensions are treated specially:
-
.hs
files compile to.hi
(implicitly numbered 0) -
.hs-boot
files compile to.hi-boot
(implicitly numbered 1) -
.hsig
files compile to.hi-boot
(implicitly numbered 1) -
.n.hsig
files compile to.n.hi-boot
(numbered n, where n is greater than 1)
- *Flex point:** We could give
.hsig
files their own file extension
for interface files; just would require some more work to distinguish
between hs-boot
and hsig
as well as record the numbering.
To import, the {-# SOURCE n #-}
pragma can be used (with {-# SOURCE #-}
being equivalent {-# SOURCE 1 #-}
.)
Inline Backpack files can omit numbering, since we can figure it out based on the ordering of declarations (numbering in REVERSE order of occurrence). Example 2 can be numbered as follows:
signature {-# SOURCE 2 #-} A
signature {-# SOURCE 1 #-} B where
import {-# SOURCE 2 #-} A
...
signature {-# SOURCE 1 #-} A where
import {-# SOURCE 1 #-} B
...
Internal consequences
In many places in the code today, we record a boolean indicating if
we depended on the boot interface hi-boot
or the normal interface hi
.
We now replace this marker with an integer which records the numbering.
The primary affected components are dependency recording in interfaces,
interface loading code in GHC, and the implementation of --make
.
Interaction with signature merging
Unlike hs-boot
files, hsig
files can be included from external
units, in which case the semantics are that all signatures in scope
are merged together. The key rule is that we **generate an hi
file for each partial merge**; this means that whenever we want
to typecheck a module, there is exactly one interface file per
module we import. Consider this example:
unit a-sig where
signature A
unit a-sig2 where
signature A
unit p where
include a-sig
module B
include a-sig2
module C
signature A
module D
When compiling this, we generate four interface files for A
:
unit p where
include a-sig
-- Produces A.3.hi-boot (a-sig)
module B -- uses A.3.hi-boot
include a-sig2
-- Produces A.2.hi-boot (a-sig + a-sig2)
module C -- uses A.2.hi-boot
signature A
-- Produces A.hi-boot (everything)
module D -- uses A.hi-boot
-- At the end, A.hi-boot copied to A.hi to be publically visible
Can we do anything simpler?
There are a few barriers to doing something simpler:
- We can avoid generating extra interface files if we instead merge them on-the-fly when we use them. However, this forces later instances of GHC to do repeated work remerging interface files, so it seems desirable from a performance perspective to merge before writing. Another scheme is that we could merge on use for signatures in the home package, and then write out a unified file at the very end, trading off performance for less written interface files.
- The Backpack language is defined in a way that allows modules, signatures and includes to be ordered in a semantically meaningful way. For example:
unit q where
signature M
signature A where
f :: Int -> Int
...
unit p where
signature A where
data T
module M where
import A -- should get T but not f
...
include q -- fill in M
module S where
import A -- should get T and f
This means that even within a unit, the interface of a signature file may differ. We could rule this out, but we would have to work out how to explain this limitation to users. (For example, we could solve the example above by saying that units which define modules do not bring their signatures into scope for a package which imports them; but this is a pretty ad hoc rule! And you still have to deal with repeated signatures, or a signature importing a module importing a signature. There are a lot of cases.)
- This problem cannot be avoided at all if you are truly doing recursive modules, since you need the intermediate interface file to do compilation at all prior to getting the real implementation.
Trac metadata
Trac field | Value |
---|---|
Version | 7.11 |
Type | Task |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Package system |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |