#9200 closed bug (fixed)
Milner-Mycroft failure at the kind level
Reported by: | ekmett | Owned by: | goldfire |
---|---|---|---|
Priority: | normal | Milestone: | 7.10.1 |
Component: | Compiler (Type checker) | Version: | 7.8.2 |
Keywords: | Cc: | ekmett@…, dimitris@…, sweirich@… | |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | GHC rejects valid program | Test Case: | polykinds/T9200 polykinds/T9200b |
Blocked By: | Blocking: | ||
Related Tickets: | #9427 | Differential Rev(s): | |
Wiki Page: |
Description
This is a reduction of a problem that occurs in real code.
{-# LANGUAGE PolyKinds #-} class D a => C (f :: k) a class C () a => D a
Typechecking complains:
The first argument of ‘C’ should have kind ‘k’, but ‘()’ has kind ‘*’ In the class declaration for ‘D’
This program should be accepted, but we're not generalizing enough.
A slightly less reduced version of the problem arises in practice in:
{-# LANGUAGE RankNTypes #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE PolyKinds #-} import Control.Category class (Category c, Category d, Category e) => Profunctor (p :: x -> y -> z) (c :: x -> x -> *) (d :: y -> y -> *) (e :: z -> z -> *) | p -> c d e -- lens-style isomorphism families in an arbitrary category type Iso (c :: i -> i -> *) (s :: i) (a :: i) = forall (p :: i -> i -> *). Profunctor p c c (->) => p a a -> p s s class Category e => Cartesian (e :: z -> z -> *) where type P e :: z -> z -> z associate :: Iso e (P e (P e a b) c) (P e a (P e b c))
This typechecks, but if I replace the line
class (Category c, Category d, Category e) => Profunctor
with
class (Category c, Category d, Cartesian e) => Profunctor
to say that you can only enrich a category over a monoidal category (using Cartesian in the approximation here), then it fails because a more baroque version of the same kind of cycle as the minimal example above as Profunctor references Cartesian which references an Iso which mentions a Profunctor.
I'm supplying explicit kind variables in the signature of the class, so we should be able to use those like we do with function signatures a universe down. =/
Change History (46)
comment:1 Changed 18 months ago by ekmett
- Cc ekmett@… added; ekmett removed
comment:2 Changed 18 months ago by goldfire
comment:3 Changed 18 months ago by simonpj
That's very interesting Richard. I had not followed this particular aspect of your implementation of closed type families.
The new proposed strategy
I would explain the strategies rather differently to you.
Baseline strategy (BASELINE), originally due to Mark Jones. Here is the strategy that we currently follow for ordinary, recursive term-level functions, and for recursive data types. I'll describe it for data types, with this example:
data S f a b = MkS (T a f) (S f a b) data T (a::k) (f::k -> *) :: * where MkT :: f a -> S f a Maybe -> S f a Int -> T a f
- Identify which type constructors have Complete User Type Signatures (CUSK). In this example, T does. Extend the environment with these, fixed, kinds:
T :: forall k. k -> (k->*) -> *
- Perform strongly-connected component analysis on the non-CUSK decls, ignoring dependencies on a type constructor with a CUSK. In our example, we get a single recursive component, containing S.
- For each s-c component in turn:
- Bind the type constructor to a fresh meta-kind variable:
S :: kappa0
- Kind-check all the declarations of the component in this environment. This will generate some unifications, so in the end we get
kappa0 ~ (kappa1 -> *) -> kappa1 -> kappa2 -> *
The kappa1 arises from instantiating T at its call site in S - Generalise. So we get
S :: forall k1 k2. (k1->*) -> k1 -> k2 -> *
- Bind the type constructor to a fresh meta-kind variable:
- Extend the environment with these generalised kind bindings, and kind-check the CUSK declarations.
The Key Point is that we can kind-check S without looking at T's definition at all, because we completely know T's kind. That in turn means that we can exploit inferred polymorphism for S when kind-checking T. As we do here: T uses S in two different ways (S f a Maybe) and (S f a Int).
Richard's strategy (RICHARD). The key idea is that all polymorphism is declared, so nothing gets to be kind-polymorphic unless you say so. But the payoff is that you can give partial kind signatures. I'm not going to call it NonParametric becuase I think that's a terrible name. Here's the strategy.
- Sort the declarations into strongly-connected components. No special treatment for CUSKs.
- For each declaration, extend the environment with a kind binding that has a forall for each explicit user-written kind variable, but meta-kind variables otherwise. For example
data Foo (a :: k1 -> k1) b c = ...
would get a kind bindingFoo :: forall k1. (k1->k1) -> kappa1 -> kappa2 -> *
Our earlier example would giveT :: forall k. k -> (k->*) -> * S :: kappa3 -> kappa4 -> kappa5 -> *
- Kind-check the declartions in this environment. At a call of Foo, say, we'd instantiate the forall k1 with a fresh meta-kind variable, but would share kappa1, kappa2 among all calls to Foo.
- Default any unconstrained meta kind variables to *
That's it! No generalisation step. The only polymorphism is that declared by the user.
So our earlier S/T example would be rejected because it relies on S being polymorphic in its third parameter. If you want the S/T example to work you could write
data S f (a::k1) (b::k2) = MkS (T a f) (S f a b) data T (a::k) f where MkT :: f a -> S f a Maybe -> S f a Int -> T a f
That's enough to ensure that S and T's kinds start with
S :: forall k1 k2. ...blah... T :: forall k. ...blah...
Type signatures
Another place that we currently (i.e. using (BASELINE)) do kind generalisation is in type signatures. If you write
f :: m a -> m a f = ...
then the type signature is kind-generalised thus:
This user-written signature f :: m a -> m a means this (BASELINE) f :: forall k (a:k) (m:k->*). m a -> m a
And f's RHS had better be that polymorphic.
Under (RICHARD) it would be consistent to say this:
This user-written signature f :: m a -> m a means this (RICHARD) f :: forall (a:*) (m:k->*). m a -> m a
If you want the kind-polymorphic one, you'd have to write thus
This user-written signature f :: forall k (a:k) (m:k->*). m a -> m a means this (RICHARD) f :: forall k (a:k) (m:k->*). m a -> m a
Declarative typing rules
Happily (RICHARD) has a nice declarative typing rule.
Here is what the conventional declarative typing rule, in the absence of polymorphism for a single self-recursive function looks like:
G, f:t |- e:t G, f:t |- b:t' --------------------------- G |- letrec f = e in b : t'
Here the "t" is a monotype (no foralls) that the declarative typing rules clairvoyantly conjures up out of thin air.
Once you add Hindley-Milner style polymorphism, the rule gets a bit more complicated
G, f:t |- e:t G, f:gen(G,t) |- b:t' --------------------------- G |- letrec f = e in b : t'
where 'gen' is generalising.
The (RICHARD) rule might look like this:
t = forall vs. sig[t1..tn/_] G, f : t |- e : forall vs.t G, f : t |- b:t' --------------------------- G |- letrec f :: forall vs. sig; f = e in b : t'
Here I'm expressing the user-specified knowledge as a signature forall vs.sig, with '_' for bits you don't want to specify.
f :: forall a. _ -> a -> _
Then the rule intantiates each '_' with a clairvoyantly guessed monotype (which might mention the 'vs', or 'a' in this example), and off you go.
Reflection
I think we could reasonably switch to (RICHARD) throughout.
As Richard's comments in TcHsType point out, we don't want maximal polymorphism. His example is:
type family F a where F Int = Bool F Bool = Char
We could generate
F :: forall k1 k2. k1 -> k2
so that (F Maybe) is well-kinded, but stuck. But that's probably not what we want. It would be better to get F :: * -> *
But what about
type family G a f b where G Int f b = f b G Bool f b = Char -> f b
You could just about argue that the programmer intends
F :: forall k. * -> (k->*) -> k -> *
It's quite similar to this:
data PT f a = MkPT (f a)
which today, using (BASELINE), we infer to have kind
PT :: forall k. (k->*) -> k -> *
But I'd be perfectly happy if PT got a monomorphic inferred kind, which is what (RICHARD) would do:
PT :: (*->*) -> * -> *
If you want the poly-kinded PT, use a signature:
-- Any of these would do data PT f (a :: k) = MkPT (f a) data PT (f :: k -> *) a = MkPT (f a) data PT (f :: k -> *) (a :: k) = MkPT (f a)
One oddity is that we'd do (BASELINE) for terms and (RICHARD) for types. But perhaps that's ok. They are different.
- Terms ought to be as polymorphic as possible but arguably not types. Examples above. Also, since kind polymorphism is still in its infancy, maybe it's no bad thing if all kind polymorphism is explicitly signalled every time a kind-polymorphic binder is introduced.
- Terms have well-established separate type signatures, but we don't have a syntax for separate kind signatures of types and classes.
If we moved from (BASELINE) to (RICHARD), some programs that work now would fail:
- the original S/T example above
- a data type like PT where the user did actually want the kind-polymorphic version.
But that might be a price worth paying for the simplicity, uniformity, and predictability you'd get in exchange.
comment:4 Changed 18 months ago by ekmett
Just a thought, in our compiler for Ermine we use a strategy very close to Mark P Jones' strategy with CUSKs.
However when we have a type that is not fully what you'd call a CUSK because at least one of the type variables are unannotated we actually allow it to participate in a binding group SCC like any other type.
We instantiate just those unannotated positions in the type with fresh kind variables not all.
Then we run through the binding group checking subsumption for each, letting those checks accrete constraints on the unannotated types before we generalize over the whole SCC.
A universe level down the code looks like:
https://github.com/ermine-language/ermine/blob/master/src/Ermine/Inference/Type.hs#L150
At the kind level the code is in a local branch and I'll push it out to github soon.
This lets us fit somewhere in the middle between Mark P Jones' strategy and Richard's.
Unannotated variables participate in cycles and unify out to whatever they must, while kind variables supplied explicitly annotated types can't be forced to unify with them without yielding skolem escapes.
The rule then becomes that if you annotate some types with a kind involving explicit kind variable in a class declaration all kinds that unify with that kind variable will need to be annotated, but other kinds can be left untouched.
In exchange you don't get more specific kinds than you ask for #9201, we can handle partially annotated polymorphic recursion like occurs here, and existing code so long as it doesn't rely on #9201 would still typecheck.
comment:5 Changed 18 months ago by goldfire
I think Simon got one key detail in (RICHARD) wrong: I do propose to generalize after kind-checking all the declarations, not defaulting to *.
The way I see it is this: there is "good" polymorphism and "bad" polymorphism.
Here is "good":
type family IsJust x where IsJust Nothing = False IsJust (Just x) = True
Without any annotation, I would want to infer forall k. Maybe k -> * as the kind for IsJust. This goes against Simon's conclusion that all kind polymorphism should be annotated. I actually would argue that the genie about this sort of polymorphism is out of the bottle, and that Simon's proposal would be too big a breaking change.
Here is "bad" polymorphism:
type family F x where F Bool = True F Maybe = False
It is perfectly reasonable to infer forall k. k -> * for the kind of F. Indeed, this could even be implemented without too much trouble. (I think I did implement this behavior along the way to the current behavior for closed type families!) But, I think most users would prefer to get a kind error in this case. If they want this polymorphism, use an annotation.
What's the difference between "good" and "bad" polymorphism? "Good" is parametric and "bad" is non-parametric. Hence the name NonParametricKinds.
The examples above come from type families. Extending these ideas to datatypes is awkward because datatypes don't (currently) support kind-indexing (essentially, being GADT-like in kind parameters) in the right way. The closest we can really get is to think about polymorphic recursion, which is where this all started. Thus, in the context of datatypes and classes, I agree that the name NonParametricKinds is bogus.
What this all leads to is this: we want to allow "good" polymorphism without an annotation but to allow "bad" polymorphism only if specifically requested. This is exactly the story with closed type families today. And, if I'm understanding all the details correctly, in the datatype/class world, this means that we get non-polymorphic recursion to work all the time, but polymorphic recursion only when requested. After all, if you cross your eyes, polymorphic recursion can look a little like non-parametric polymorphism: both are cases when a type parameter is used inconsistently in a definition.
I think the strategy I'm outlining is not unlike the behavior Edward describes in Ermine. The only difference I can spot is that my strategy allows unification variables to unify with skolems, where Ermine's seems to avoid this. For example:
type family G x (y :: k) :: Constraint where G x y = (x ~ y)
Here, the kind of x must be k, but that's acceptable. Would that fail in Ermine? I could certainly see that failure here could be reasonable.
As a small aside, I wonder how any of this interacts with the coming partial type signatures at the term level. How do those interact with polymorphic recursion?
comment:6 Changed 18 months ago by ekmett
Currently we'd reject that, but if you can come up with a story that allows it I'd happily switch.
We can (and do) use it for partial type signatures by combining it with HMF style "some" annotations, but since we disallow the partial types from referencing the quantified ones (since some can only occur in an Annot, outside of the type) it is perhaps less useful than a design that can do what you want could be.
comment:7 Changed 18 months ago by goldfire
What's insufficient about the algorithm I've described here? The G above is accepted by today's GHC. I don't see where a skolem escape would fail.
comment:8 Changed 18 months ago by dolio
Ermine's algorithm for this effectively has an additional quantifier, some. Your G definition is effectively equivalent to:
u : a -> a -> T g : some a. forall b. a -> b -> T g x y = u x y
'some' is effectively a quantifier for unification variables. So for instance, if I wrote:
h : some a b. a -> b -> T h x y = u x y
checking would succeed, with h : a -> a -> T. Also, some cannot occur arbitrarily in a type. It can only occur outer-most in a type ascription.
The reason ermine rejects g right now is that the metavariable a is at an outer level of binding. They cannot be generalized over until the entire SCC is done being checked, while g is expecting to be polymorphic in b. This is the kind of check that would fail for an example like:
g : some a. forall b. a -> b -> T g x y = h x y h y x = g x y
We could try to be more intelligent, and do additional generalization as long as skolems wouldn't flow into other types in the binding group, or something, but we haven't gotten that fancy yet.
comment:9 Changed 18 months ago by dolio
Having thought about this some, I realized that there's no reason to fail due to skolems 'escaping' into other types of the binding group, since we're about to generalize all the types anyway. So I've tweaked our algorithm to not care about that. Now the equivalent of your G example checks, and so does my example above:
g : some a. forall b. a -> b -> T g x y = h x y h y x = g x y
yields:
g : forall b. b -> b -> T h : forall b. b -> b -> T
This seems sensible to me, and sounds like your proposal.
comment:10 Changed 18 months ago by simonpj
- Cc dimitris@… sweirich@… added
- I have started a wiki page to record these alternative designs.
- I'm afraid that I do not understand how you distinguish "good" from "bad" polymorphism. Examples are not enough!
- I'm afraid I do understand Ermine's algorithm.
Could I ask you to write a section on the wiki page that describes as clearly, as you possibly can, the designs you have in mind? That would make sure we are all talking about the same thing.
By the same token, if you don't understand anything in the designs I have described there, please say so and I'll clarify.
I'm adding Stephanie and Dimitrios in cc
Simon
comment:11 Changed 18 months ago by goldfire
Added some comments (delineated with Richard: ... End Richard) to the wiki page. I believe the (PARGEN) option you describe there agrees with my proposal. I also believe that my proposed algorithm now (with dolio's changes to Ermine) matches Ermine's.
comment:12 Changed 18 months ago by dolio
Having thought some more, I feel like there is potential weirdness with the changes I mentioned earlier, even if it works nicely for some simple examples. For instance, consider:
u : a -> a -> a f : some a b. a -> a -> b f x y = u (g x y) (h x y) g : some c b. forall a. c -> a -> b g x y = f x y h : some c b. forall a. c -> a -> b h x y = f x y
Now, since f takes two arguments of the same type, this requires each of g and h to do the same. But, to ensure g and h are sufficiently polymorphic, we need to skolemize. These skolems then flow into the c unification variables, which in turn are unified due to f. So this will cause a skolem mismatch.
So, I'm not sure it's a good idea for these sorts of examples to work, even if they could in some cases. It seems like it could easily be confusing when similar examples happen to work or not work depending on where skolems flow. So I guess I'm back to thinking that it's better for your G example to not check:
u : a -> a -> T g : some c. forall k. c -> k -> T g x y = u x y
simply because it's due to a less confusing rule.
Ermine has, of course, always generalized unification variables that are still free after this process; that is what is expected for types. I think it's handy for kinds as well (probably excepting the 'bad polymorophism' examples).
comment:13 Changed 18 months ago by dolio
Having conferred with Ed, I guess GHC uses a different strategy that doesn't involve (proper) skolem variables, so it might not be subject to my complaint. Using some kind of after-the-fact distinctness check instead of proper skolems seems not to arbitrarily rule out certain definitions in a confusing manner, which was my main concern.
comment:14 Changed 18 months ago by simonpj
Richard, I have tried to address your questions on the wiki page. Better now?
I still don't know what the difference between bad and good polymorphism is, but maybe it doesn't matter now, since you say that it is "tangential".
You say "Because open type families do not have a body, they would still need their own kind inference story, where unconstrained meta-variables default to *." but I don't understand that. Add a subsection for it? (Open type families simply have their own, complete, kind signature, no?
I'm happy with (PARGEN) if you are happy with the typing rule including the side condition.
Simon
comment:15 Changed 18 months ago by goldfire
Happily, I think everyone here has converged on (PARGEN) with the critical side condition that we do not unify a meta-variable with a kind containing a skolem. I realized yesterday that the side condition was necessary but didn't have the chance to write it up.
In any case, I think we can end the design phase here and move onto the implementation phase. Most of (PARGEN) should be easy, as (PARGEN) is exactly what is implemented by NonParametricKinds, except for the side condition. To implement the side condition, I propose a new disjunct for MetaInfo named NonSkolemTv which is like TauTv but cannot unify with a type containing a skolem. The unifier, in TcUnify, would just check this condition at the appropriate point. Sound reasonable?
comment:16 Changed 18 months ago by goldfire
And, yes, open type families keep their complete kind signature.
comment:17 Changed 18 months ago by simonpj
I'm not quite ready to declare victory. Some technical questions:
- See T1a and T4 under (PARGEN). These are unexpected failures. The tricky side condition means that you must declare all the polymorphism involving k, or none, but nothing in between.
- What is the rule for non-recursive bindings? Specifically:
data NR f (a::k) = MkNR (f a)
Is this accepted (as now), or rejected by the "tricky side condition". If the former that gives an uncomfortable difference between recursive and non-recursive bindings.
- I'd like to see a rule written down for closed type families, and a sketch of the accompanying algorithm.
And finally, at a non-technical level, I'm not convinced that (PARTIAL) is untenable.
- Yes, people using kind polymorphism might need to add some kind signatures, but that would arguably improve their code.
- We could issue a warning for one release cycle.
- The tricky side condition becomes simple: all kind polymorphism must be declared.
- And sometimes kind polymorphism might even trip you up:
foo :: forall f a. f a -> f a foo = ... where g :: f Int -> Int
This will, I think, fail. Because foo's signature actually meansfoo :: forall k. forall (f:k->*) (a:k). f a -> f a
so the (f Int) application is ill-kinded.
Edward is an example of a high-end kind-polymorphism user. How bad would (PARTIAL) be?
Implementation. Re NonSkolemTv, I'd hate to pollute MetaInfo for such a corner case. We can simply test the side condition at the end, just as the rule suggests. Generating a helpful error message is quite a challenge!
comment:18 Changed 18 months ago by goldfire
- I agree that (PARGEN) is a breaking change. But, I'll quote Simon's comment from the wiki page, which I fully agree with:
So moving from (BASELINE) to (PARGEN) would be a breaking change, but only in rather obscure circumstances. I am intensely relaxed about that particular backward-compatibility problem!
- Non-recursive bindings should behave the same as recursive ones. This means that NR would be rejected due to the side condition.
Closed type families
(PARGEN) (including the side condition) should work out just fine. Here would be the declarative typing rule:
k2 = forall fkv(k1). k1[k'1 .. k'n/_] fkv(k1) \not\in k'j forall i: G, F:k2 |- (F ps_i = t_i) ok k3 = gen(G, k2) --------------------------------------------------- G |- type family F :: k1 where { F ps_i = t_i } : k3
This is very similar to the declarative typing rule for letrec given on the wiki page. Here, I am ignoring issues with arity/saturation and using a syntax where the kind signature of a type family is given as k1 with blanks, instead of the tyvarbndr syntax used in source code.
This is in fact an improvement over the current state of affairs, which is essentially this rule without the side condition. Because of the omitted side condition, we don't have principal types! For example,
type family X (a :: k) where X True = False
X could be X :: k -> k or X :: k -> Bool. Neither is more general than the other. GHC chooses X :: k -> Bool, but it's not principled. This is what I get for implementing kind inference for closed type families without writing out declarative rules! In any case, the solution to this problem (closed type families) seems to be the same as the solution for datatypes and classes, quite happily: add the side condition.
Using (PARTIAL)
I'm still quite strongly against (PARTIAL). Let's ignore the backward compatibility problem, for now. With (PARTIAL), we have a very different story about kind polymorphism than we do about type polymorphism. GHC tries to be as polymorphic as possible with types: writing foo x = x gives foo :: forall a. a -> a, not foo : () -> (). It seems very odd to me to not do the same in kinds. Yet, (PARTIAL) would say that data Foo x = MkFoo (Foo x) would get Foo :: * -> * instead of Foo :: forall k. k -> k. Especially as we're inching toward dependent types, this seems like a step backward.
And, though I defer to Edward's experience, I tend to think the backward compatibility problem would be annoying for what may seem like a dubious benefit (polymorphic recursion in the presence of partial kind signatures).
Conclusion
I am worried that (PARGEN) has a strange surface area, and I think that any error messages that the side condition produces should include a URL for further reading. (I actually think that links to further reading would helpful in several error messages!) But, if this surface area is deemed too bumpy, I personally would prefer to go back to (BASELINE) than to go to (PARTIAL).
comment:19 Changed 18 months ago by goldfire
Hold the phone! This is all wrong!
Interaction with partial type signatures
I just ended a great discussion with Iavor and with David Darais about this whole issue, and we came to the conclusion that (PARGEN) is a bad idea. The problem isn't that (PARGEN) doesn't work (I still think it does work) but that it paints us into a corner.
We looked at PartialTypeSignatures, which describes an upcoming feature to allow wildcards in type signatures of terms, in case the programmer does not want to write down an entire type. With partial type signatures, wildcards can unify with lexically-quantified variables, contrary to the side condition debated in this ticket. And, the feature looks useful, as types get more complicated and the burden of writing out a full signature increases.
There is no discussion (that I saw) of polymorphic recursion w.r.t. partial type signatures, so I don't know what the term-level plan is on that issue.
A change of heart
Looking at all of this (now back at the type level), I see that we have two, mutually exclusive options: 1) Allow polymorphic recursion in the presence of a partial kind signature via (PARGEN) + side condition. 2) Allow more flexible partial kind signatures.
I think (2) is of greater benefit to users than is (1). As a further bonus, (2) is much simpler than (1) -- indeed, it is what we have today.
A new proposal
Thus, I propose a radically different solution to the original problem: come up with a syntax for a CUSK for classes. Call this proposal (NEWCUSK). Specifically:
A class or datatype is said to have a CUSK if and only if all of its type variables are annotated. Otherwise, like (BASELINE).
As with (BASELINE), polymorphic recursion is possible only in the presence of a CUSK. Under (NEWCUSK), Edward's original example will not work, but adding an annotation to a (presumably, (a :: k2)) would make it work.
(NEWCUSK) means that a declaration like data Foo a b :: * will not have a CUSK. This would be a breaking change if a programmer used :: * to force other variables to have kind * and/or used polymorphic recursion. Note that in the case where a user puts the :: right after the datatype name, there are no type variables, so every one is vacuously annotated -- no breakage here.
If we are opposed to making this breaking change, the (NEWCUSK) proposal still works while also having a top-level :: indicate a CUSK. I just think that the top-level :: syntax is a strange bump in the language design. (Though, to be fair, I couldn't think of something better at the time.)
Relationship to GADT pattern-matching
In continuing to think about all this, I have come to realize that (PARGEN) is quite strange, indeed. Let's consider some more familiar territory:
data G a where GBool :: G Bool foo GBool = True
GHC rightly rejects the code above saying, essentially, that foo does not have a principal type. foo :: G a -> a and foo :: G a -> Bool are both very reasonable candidates and are not comparable. (The error message, of course, discusses untouchables... but it's really talking about principal types.)
But, if we apply the (PARGEN) philosophy to type-checking GADT pattern-matches, we would decide that foo :: G a -> Bool is the principal type. This is because the (PARGEN) philosophy would say to prevent the inferred result type from mentioning any type variables refined in the pattern match. I am playing a little fast and loose here talking about the un-articulated "(PARGEN) philosophy", but I'm hoping readers can understand my concern. A perhaps tighter way of saying this is: I conjecture that the over-permissiveness of the type system in Fig. 10 (p. 25) of OutsideIn (as discussed in Sec. 4.4) might be fixed by adding a side condition to the Case rule, essentially saying that any type variables refined in a local equality assumption cannot then appear in other types discovered by using constraints generated by the case. To make this meaningful, the type system would have to be re-written in a bidirectional manner, so we can know which types come "from" the case vs. which types go "to" the case. I leave the exact formulation as future work, but I'd be very curious to read a response to this conjecture.
Conclusion
What do others think of (NEWCUSK) and the points I've brought up here? It seems to solve the original problem (no polymorphic recursion available in classes) while not upsetting too much else. And, it seems to have nicer theoretical properties and is easier to explain to users.
comment:20 Changed 18 months ago by dolio
This example:
foo :: forall f a. f a -> f a foo = ... where g :: f Int -> Int
just seems like bad policy to me. The top signature is a partial signature, the missing pieces (the kinds) should be solved, and it is not sufficient to only look at the signature of foo to do that solving. I can construct similar examples in Ermine, although it is a bit convoluted due to our currently meager tooling:
:type let { f : forall f a. f a -> f a ; f x = x } in f forall {k} (f : k -> *) (a : k). f a -> f a
:type let { f : forall f a. f a -> f a ; f x = x where { z = g x ; g : some f a b. f a -> f b -> b ; g _ _ = "" } } in f forall (f : * -> *) (a : *). f a -> f a
Effectively, this signature is the same as (for us)
some k1 k2. forall (f : k1) (a : k2). f a -> f a
although we don't currently handle some-bound kinds; unspecified kinds act similarly, though.
The problem with this example seems to be deciding on things too early, not generalizing. For instance, doesn't PARTIAL error if you apply the same kind of policy to:
foo :: forall f a. f a -> f a foo = ... where g :: f Maybe -> f Maybe
f will be defaulted to * -> *, but that is wrong for application to Maybe.
comment:21 Changed 18 months ago by simonpj
OK, so (NEWCUSK) is precisely (BASELINE) with a slightly expanded definition of CUSK (c.f. the current definition in the user manual). I'm all for that; I quite liked (BASELINE): it's pretty simple and lines up precisely with the term-level story (a la Mark Jones).
Just to check, though, let's be sure we agree:
data T1 f a = MkT1 (f a) -- No CUSK -- Gets T1 :: forall k. (k->*) -> k -> * -- The generalisation takes place in the third bullet of step 3 of (BASELINE) data T2 a b :: * -- Presumably a nullary data type -- No CUSK -- Gets T2 :: forall k1 k2. k1 -> k2 -> * -- Very similar to T1
I have no idea what you mean by "2) Allow more flexible partial kind signatures" in comment:19.
Could you update the wiki page with:
- (NEWCUSK) (= (BASELINE) + revised defn of CUSK)
- the typing rules for closed type families
Thanks!
Simon
comment:22 Changed 18 months ago by goldfire
By "2) Allow more flexible partial kind signatures", I meant that, under (NEWCUSK), the kind of un-annotated type variables can unify with lexically-scoped kind variables, allowing the following to type-check:
data Q f (a :: k) where MkQ :: f a -> Q f a -- fails under (PARTIAL) and (PARGEN) -- succeeds under (BASELINE) and (NEWCUSK)
If partial type signatures on terms are so useful (and I think they will be), then partial kind signatures, like the one on Q, would be, too.
Yes, I am agreed with the examples in comment:21.
(NEWCUSK) description was already on wiki page, and I have now added rules for closed type families. These rules are a breaking change from today's closed type families, but a "good" breakage, in that it tightens up the specification from the current wonky state of affairs. See the example from comment:18, which would no longer type-check.
comment:23 follow-up: ↓ 24 Changed 18 months ago by dolio
I don't have a particular dislike for NEWCUSK, I think. Anything that allows you to do polymorphic recursion with sufficient annotation is at least tolerable.
However, I don't really understand what the problem with PARGEN is that caused the added restrictions that make it less desirable. Is the entire problem closed, partially annotated type families that are non-parametric, and therefore act like GADT definitions and lack principal types? And is the motivation to then have a uniform rule for both these and everything else that doesn't require detecting situations that would fail to have a principal type?
You (goldfire) and I discussed what we're doing in Ermine, which is now quite a lot like the (original) PARGEN strategy. It seems to work for the sort of partially specified type signatures we have, and it works for polymorphic recursion, including polymorphic recursion with partial signatures. For instance:
let { u : a -> a -> a ; u x _ = x ; f : some b c. forall a. a -> b -> c ; f x y = let { z = u x y } in g x y ; g _ y = f () y } in f
Without the signature on f the inferred type is () -> () -> c, but with it the inferred type is b -> b -> c.
Anyhow, I'm not particularly adamant about your choice of (original) PARGEN vs. NEWCUSK. But I'd like to understand what exactly the objections were, and whether they're tied to GADTs and type families.
comment:24 in reply to: ↑ 23 Changed 18 months ago by goldfire
Replying to dolio:
Is the entire problem closed, partially annotated type families that are non-parametric, and therefore act like GADT definitions and lack principal types?
Not just these. Also partially annotated datatypes and classes will lack principal kinds. Here is the example:
data W f (a :: k) where MkW :: W Maybe Int -> W f a
Do we have W :: (k -> *) -> k -> * or W :: (* -> *) -> k -> *, among others? Both seem reasonable, though the latter is certainly easier to infer. To eliminate ambiguity, we have to add the restriction to (PARGEN). This makes it clear that the second kind is the only allowable one.
And is the motivation to then have a uniform rule for both these and everything else that doesn't require detecting situations that would fail to have a principal type?
Yes, that's my understanding.
You (goldfire) and I discussed what we're doing in Ermine, which is now quite a lot like the (original) PARGEN strategy.
From my understanding, your original implementation was like (PARGEN) + side condition. Only after discussing did you (effectively) remove the side condition.
It seems to work for the sort of partially specified type signatures we have, and it works for polymorphic recursion, including polymorphic recursion with partial signatures. For instance:
let { u : a -> a -> a ; u x _ = x ; f : some b c. forall a. a -> b -> c ; f x y = let { z = u x y } in g x y ; g _ y = f () y } in fWithout the signature on f the inferred type is () -> () -> c, but with it the inferred type is b -> b -> c.
I believe the original (PARGEN) "works", in that it is sound, for an appropriate definition of sound. However, it can infer non-principal types, which seems bad. I don't think it's very wrong for Ermine to adopt (PARGEN) - side condition, but I have to say I don't think it's very right, either.
I believe that the example you give has a principal type, which is inferred correctly.
Anyhow, I'm not particularly adamant about your choice of (original) PARGEN vs. NEWCUSK. But I'd like to understand what exactly the objections were, and whether they're tied to GADTs and type families.
I would say that the problems are wholly independent of GADTs and type families. There is a problem with closed type families that is closely related, but the fundamental problem is independent.
comment:25 Changed 18 months ago by dolio
This example:
data W f (a :: k) where MkW :: W Maybe Int -> W f a
is just another Milner-Mycroft problem. The inferred kind is (* -> *) -> k -> *.
The principal kind is k0 -> k -> *, but it can't be inferred. Just like
data W f a = W (W Maybe Int)
will have inferred kind (* -> *) -> * -> *, even though k0 -> k -> * is valid and strictly more general.
From my understanding, your original implementation was like (PARGEN) + side condition. Only after discussing did you (effectively) remove the side condition.
Yes, but this is because we were using skolem variables, which can lead to confusing rejection of partially annotated mutual definitions. When we started basing our algorithm on something closer to GHC, using what I think you called SigTvs, plus a per-signature distinctness check, that objection was no longer valid.
I believe the original (PARGEN) "works", in that it is sound, for an appropriate definition of sound. However, it can infer non-principal types, which seems bad.
Hindley-Milner infers non-principal types (so does GHC, of course).
I don't think it's very wrong for Ermine to adopt (PARGEN) - side condition, but I have to say I don't think it's very right, either.
I still don't think we have an example that doesn't involve GADTs or type families.
comment:26 Changed 18 months ago by simonpj
I have edited the wiki page so that the beginning part says what we actually plan to do. See if you agree with it.
Simon
comment:28 Changed 16 months ago by simonpj
On Tuesday I promised Richard I'd write up the two variations of our proposed consensus, on the wiki page. I have now done so.
A couple of questions labelled Simon are in the text. Over to Richard.
Simon
comment:29 Changed 16 months ago by simonpj
See also #9151 which will be nailed by this change.
Simon
comment:30 Changed 16 months ago by goldfire
Comments on wiki page addressed.
comment:31 Changed 16 months ago by simonpj
Responses from Simion
comment:32 Changed 16 months ago by goldfire
Response from Richard.
comment:33 Changed 16 months ago by simonpj
I've added a reply, as below. Is it ok to remove the para about "non-parametric" equations?
Simon: I agree that, since type syononyms can't be recursive except via a data type, if you use "A possible variation" below, then you can get away with saying that type synonyms cannot have CUSKs. It seems a bit non-orthogonal; and could occasionally be tiresome. Imagine
data T1 a b c = ...S... data T2 p q r = ...S... data T3 x y q = ...S... type S f g h = ...T1...T2...T3...
Then, since you can't decorate S with a CUSK, you might need to annotate all of T1, T2 and T3 with CUSKs.
It would not be hard to say what a CUSK for a type synonym was: just wrap the whole RHS in a kind signature:
type T (a::k1) (b::k1->k2) = b a :: k2
I think I'd argue mildy for that, just to make the design orthogonal.
comment:34 Changed 16 months ago by goldfire
What about
type S (a :: k) = Proxy (T (Just a)) data T a = MkT (S a)
Does that S have a CUSK? If it does, the code should be accepted, I think. Or, would a user have to put in a totally-superfluous :: * at the end of S's RHS? Regardless of the kind of T, a proxy is always in kind *.
comment:35 Changed 16 months ago by simonpj
Is it ok to remove the para about "non-parametric" equations?
Re your question, no that S doesn't have a CUSK, any more than
data T a b = MkT (T a b) a b
does. It's "obvious" that S's RHS has kind *, but only because you have mentally done kind inference on S's right hand side.
Of course, the example you give would be accepted in fact. Neither has a CUSK, but normal inference and generalisation suffices. There is no polymorphic recursion.
Simon
comment:36 Changed 16 months ago by goldfire
OK. wiki page updated, including type synonym CUSKs as described. I still find that a little wonky, but just in the concrete syntax, not the semantics.
For what it's worth, my example in comment:34 does seem to require at least one CUSK -- at least, it does in 7.8.
In any case, I think we're in agreement here about what to do, no?
comment:37 Changed 16 months ago by simonpj
Yes, I think we're in agreement. Over to you when you have a moment.
Simon
comment:38 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:39 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:40 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:41 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:42 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:43 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:44 Changed 16 months ago by Richard Eisenberg <eir@…>
comment:45 Changed 16 months ago by goldfire
- Milestone set to 7.10.1
- Resolution set to fixed
- Status changed from new to closed
- Test Case set to polykinds/T9200 polykinds/T9200b
The above patches implement much of what was discussed here. They do not implement the "possible variation" part from the wiki page, as I ran out of time and the plumbing became a little challenging. See ticket #9427 to track the implementation of that piece.
This is a user-facing, breaking change (in obscure circumstances) and should not be merged for 7.8.4.
Yes, this absolutely should be able to work but it currently doesn't, perhaps by design. Happily, I believe this one is actually quite easy to solve. See this comment. I believe if we changed the kind-checking strategy for classes from ParametricKinds to NonParametricKinds, this problem would be solved. I also believe that this change would allow only more programs to type-check, and would thus be fully backward compatible. Indeed, in my branch where I'm implementing a merged type/kind language, I've gotten rid of ParametricKinds.
Tracing this design through history, I think it came from work on the patch described in #6093. There, Simon proposed (and heard no complaints about) allowing polymorphic recursion if and only if a "complete user-supplied kind signature" ("cusk") is given. See section 7.8.3 of the manual. However, a cusk is not possible for classes, giving Edward no way to proceed here.
At the end of the comment trail of #6093, Simon asks for a better design. Here goes: allow recursive instantiations of a kind variable to vary if and only if that kind variable is mentioned in a type. I'm being quite literal here in my meaning of mentioned -- the kind variable must simply appear syntactically in the code. By appearing in the source, GHC can know to generalize over the kind variable "early on" and then can deal with the polymorphic recursion. With this design, the need for cusks goes away.
I think the initial reason that cusks came about in the first place is from a perceived need to either allow polymorphic recursion or to disallow it. In terms, this is straightforward: there is either a type signature or there isn't. In types, however, Haskell syntax allows for various forms of partial kind signatures. But, we had the desire for a simple "yes" or "no" answer about polymorphic recursion, and so the cusk was born. What I'm proposing here is to embrace the gray area between "yes" and "no" and permit some polymorphic recursion -- specifically, over those kind variables that are mentioned in the source.
Is this history accurate, from those involved (that is, mostly, Simon)? Do we see code that would be accepted under my proposal that would be surprising to users? Happily, implementing my proposal is dead easy -- just twiddle kind-inference strategies, as my proposal is called NonParametricKinds there.
As some context, here is an example from closed type families (the one place where NonParametricKinds is used) that is confusing:
The kind of LooksLikeId surely looks like that of an identity function: k -> k. But, because type families are not parametric (hence the name NonParametricKinds), we can inspect the kind and branch on it. Note that the equations above do not overlap, as they are distinguished by their kind parameters. The definition above would fail to type-check without the k annotation, as that is necessary for GHC to know to generalize over the kind instead of just solve for it.
Thoughts on any of this? Am I deeply misunderstanding some properties of type inference here?