It occurred to me that rather than moving just these instances to a new module, we could move the large tuples themselves to a new module Data.LargeTuple and put the instances there. The Prelude would reexport the large tuples, so there would be no user-visible change. According to my experiments, GHC should never have to read the Data.LargeTuple interface file unless a program actually mentions a large tuple type, which is presumably rare. We could then also extend the existing instances for Eq, Show, etc., which are currently only provided through 15-tuples.
A nontrivial aspect of this change is that tuples are wired-in types, and they currently all live in the ghc-prim package. I'm actually not sure why they need to be wired-in rather than ordinary types with a funny-looking name. In any case I need to look into this further, but the difficulties here don't seem to be insurmountable.
Trac metadata
Trac field
Value
Version
8.1
Type
Task
TypeOfFailure
OtherFailure
Priority
normal
Resolution
Unresolved
Component
libraries/base
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
So I have a branch for this that builds successfully. However if I add instances to GHC.LargeTuple, then the instances aren't visible automatically. Using :i brings the instances into scope somehow. A sample ghci session:
Prelude> let x = (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) in x == x<interactive>:3:46: error: • Ambiguous type variable ‘p0’ arising from a use of ‘x’ prevents the constraint ‘(Num p0)’ from being solved. Probable fix: use a type annotation to specify what ‘p0’ should be. These potential instances exist: instance Num Integer -- Defined in ‘GHC.Num’ instance Num Double -- Defined in ‘GHC.Float’ instance Num Float -- Defined in ‘GHC.Float’ ...plus two others (use -fprint-potential-instances to see them all) • In the first argument of ‘(==)’, namely ‘x’ In the expression: x == x In the expression: let x = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) in x == x<interactive>:3:46: error: • No instance for (Eq (a0, b0, c0, d0, e0, f0, g0, h0, i0, j0, k0, l0, m0, n0, o0, p0)) arising from a use of ‘==’ • In the expression: x == x In the expression: let x = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) in x == x In an equation for ‘it’: it = let x = ... in x == xPrelude> :i (,,,,,,,,,,,,,,,)data (,,,,,,,,,,,,,,,) a b c d e f g h i j k l m n o p = (,,,,,,,,,,,,,,,) a b c d e f g h i j k l m n o p -- Defined in ‘GHC.LargeTuple’instance [safe] (Bounded p, Bounded o, Bounded n, Bounded m, Bounded l, Bounded k, Bounded j, Bounded i, Bounded h, Bounded g, Bounded f, Bounded e, Bounded d, Bounded c, Bounded b, Bounded a) => Bounded (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) -- Defined in ‘GHC.LargeTuple’instance [safe] (Eq p, Eq o, Eq n, Eq m, Eq l, Eq k, Eq j, Eq i, Eq h, Eq g, Eq f, Eq e, Eq d, Eq c, Eq b, Eq a) => Eq (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) -- Defined in ‘GHC.LargeTuple’instance [safe] (Ord p, Ord o, Ord n, Ord m, Ord l, Ord k, Ord j, Ord i, Ord h, Ord g, Ord f, Ord e, Ord d, Ord c, Ord b, Ord a) => Ord (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p) -- Defined in ‘GHC.LargeTuple’Prelude> let x = (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) in x == xTrue
Simon, do I need to apply a similar refactoring as you did in ffc21506 ("Refactor tuple constraints")? I see ConstraintTuple is treated differently in tcTupleTyCon, which looks relevant.
Well that is indeed odd. I'd expect to see a checkWiredInTyCon in all the places in the typechecker where an explicit tuple occurs, namely
In the ExplicitTuple case of TcExpr. (Don't forget the not tup-args-all-present case.)
In the HsTypleTy case of TcHsType.tc_hs_type. (I think the best place would be in finish_tuple.)
How does it work at the moment? It looks as if the instances for tuples are defined with the relevant classes (e.g Eq in GHC.Classes). And that is loaded whenever the class is loaded, so we won't see any missing instances for tuples. That looks like a fluke to me.
Incidentally, you might want to do the same big-tuple thing for the constraint tuples in GHC.Classes, which is almost invariably loaded. (Use -ddump-if-trace to see.)
I have this nearly working following the the approach you outlined above, but there's one point I'm not sure about having to do with type family instance checking.
Originally I had Prelude importing GHC.LargeTuple, but since GHC.LargeTuple defines type family instances (for Rep) it was marked as a "family instances" import of Prelude; which meant that any time the user loaded an interface file with their own type family instances, GHC would read the interface file for GHC.LargeTuple too to make sure there were no conflicts. Not good.
So I figured out that Prelude doesn't actually need to import GHC.LargeTuple, since the type checker will load the latter when needed with checkWiredInTyCon. Now everything works great. However, I'm not sure about the type family soundness situation. Consider that a user can write an instance that overlaps with the ones from GHC.LargeTuple without mentioning a tuple, such as
instanceGeneric(abcdefghijklmnopqrst)wheretypeRep(abcdefghijklmnopqrst)=X-- overlaps with a 19-tuple.-- Can even replace `b` with a data type defined in this module,-- so it wouldn't be an orphan
I think it's still impossible to use both instances in the same program without mentioning a large tuple constructor and therefore loading the GHC.LargeTuple interface file and finding the conflict; but I'm not 100% sure (what if the uses of the two type family instances are in separate packages, etc.) Thoughts?
any time the user loaded an interface file with their own type family instances, GHC would read the interface file for GHC.LargeTuple too to make sure there were no conflicts
I don't think so. If M contains a type family instance, that instance should have been checked for consistency with LargeTuple. See FamInst:
Note [Checking family instance consistency]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~For any two family instance modules that we import directly or indirectly, wecheck whether the instances in the two modules are consistent, *unless* we canbe certain that the instances of the two modules have already been checked forconsistency during the compilation of modules that we import.
So M and LargeTuple have already been checked; so importers of M won't need to check. Right?
To be sure it's a bit of a pain having to load LargeTuple for M. But you suggested solution sounds ad-hoc and fragile to me.
One possibility might be this: rather than a "family instance module" being a boolean flag, keep a list of the type functions it has instances of. Now if A defines instances of FA and B defines instances of FB we don't need to check them against each other.
I think I figured out the answer to my question. The reason why GHC needed to load the interface file for GHC.LargeTuple for family instance checking when Prelude imported it is because I might have written a function in Prelude that relies on the family instance. Since I actually didn't do that, it's fine to not eagerly check GHC.LargeTuple for type family instance conflicts when loading Prelude.
Anyone who does use the GHC.LargeTuple type family instances (even indirectly through another module) will have GHC.LargeTuple as a "family instances" import of their module (worth actually testing!) and so the necessary consistency checks will be done when we import that module in another module.
I might have written a function in Prelude that relies on the family instance
But that's true in every module. I don't understand. Are you proposing some kind of special case for LargeTuple? Or are you just saying that you were mistaken? Or what?
I don't think so. If M contains a type family instance, that instance should have been checked for consistency with LargeTuple. See FamInst:
Note [Checking family instance consistency]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~For any two family instance modules that we import directly or indirectly, wecheck whether the instances in the two modules are consistent, *unless* we canbe certain that the instances of the two modules have already been checked forconsistency during the compilation of modules that we import.
So M and LargeTuple have already been checked; so importers of M won't need to check. Right?
Interesting. If it's supposed to work this way in this scenario, it doesn't (and hasn't since 7.8.4 at least). Any user module that imports another user module that defines a type family instance forces reading the interface files in base that define type family instances. I'll take a closer look.
I might have written a function in Prelude that relies on the family instance
But that's true in every module. I don't understand. Are you proposing some kind of special case for LargeTuple? Or are you just saying that you were mistaken? Or what?
Okay, let me step back a bit.
Originally I added an import GHC.LargeTuple () to Prelude because I thought it was necessary and sufficient for GHC to know about the instances (both type family and class instances) in GHC.LargeTuple in programs that import Prelude. It turned out to be neither sufficient, nor necessary. It was not sufficient because the type checker was just pulling the TyCon for a large tuple out of thin air, in tcExpr. So, writing a large tuple expression did not cause GHC.LargeTuple to be read, and so the instances were not available.
Adding the checkWiredInTyCon call to tcExpr fixed that. In fact since GHC has wired-in knowledge that large tuples live in GHC.LargeTuple, the checkWiredInTyCon call makes it read the GHC.LargeTuple interface file, which is conveniently where the instances are as well. So, now the import GHC.LargeTuple () in Prelude served no actual purpose. Its only apparent effect was to mark GHC.LargeTuple as a "family instances" import of Prelude.
What I was arguing above is that receiving the type family instances for large tuples should not be thought of as an effect of importing Prelude. Rather, it is as though any module which mentions a large tuple implicitly has an import GHC.LargeTuple added. There's no more need for Prelude to export the large tuple instances than there is for it to export instances defined by any other module in base that Prelude does not import. It just feels a bit odd at first because it seems natural to think of the large tuples as being imported from Prelude, rather than another place.
So M and LargeTuple have already been checked; so importers of M won't need to check. Right?
Interesting. If it's supposed to work this way in this scenario, it doesn't (and hasn't since 7.8.4 at least). Any user module that imports another user module that defines a type family instance forces reading the interface files in base that define type family instances. I'll take a closer look.
How do we know which pairs of modules have already been checked? Any pair ofmodules where both modules occur in the `HscTypes.dep_finsts' set (of the`HscTypes.Dependencies') of one of our directly imported modules must havealready been checked. Everything else, we check now. (So that we can becertain that the modules in our `HscTypes.dep_finsts' are consistent.)
Should we also be assuming that if module A is one of our directly imported modules and A's dep_finsts are [B,C,D] that A has been checked against each of B, C, D?
Should we also be assuming that if module A is one of our directly imported modules and A's dep_finsts are [B,C,D] that A has been checked against each of B, C, D?
Yes I think so. Otherwise compilation of A should fail! Make a test case and check that it really does!
Should we also be assuming that if module A is one of our directly imported modules and A's dep_finsts are [B,C,D] that A has been checked against each of B, C, D?
Yes I think so. Otherwise compilation of A should fail! Make a test case and check that it really does!
It currently doesn't. I have a patch D2947 to fix this, but it needs to be cleaned up and a test added.
Anyone who does use the GHC.LargeTuple type family instances (even indirectly through another module) will have GHC.LargeTuple as a "family instances" import of their module
turns out not to be true. The family instances imports of a module are calculated from its imports, and at that point we don't know whether the module will mention any large tuples.
For the same reason putting type family instances in a module where a wired-in thing lives is a dangerous thing to do in general, since we might read the wired-in module and learn about its instances and then use them without tracking their origin. (I found #13102 while writing the previous sentence.) I don't know whether there are existing occurrences of this in base, but it's exactly what I wanted to add.
Note that adding an import GHC.LargeTuple () to Prelude is not a complete solution either, since a module using large tuples and their Generic instance could have -XNoImplicitPrelude. So I don't have a solution that I'm really happy with here.