Milner-Mycroft failure at the kind level

changed weight to 5

added Tbug Trac import typechecker labels

Trac metadata

Trac field	Value
CC	ekmett → ekmett@gmail.com

Yes, this absolutely should be able to work but it currently doesn't, perhaps by design. Happily, I believe this one is actually quite easy to solve. See this comment. I believe if we changed the kind-checking strategy for classes from ParametricKinds to NonParametricKinds, this problem would be solved. I also believe that this change would allow only more programs to type-check, and would thus be fully backward compatible. Indeed, in my branch where I'm implementing a merged type/kind language, I've gotten rid of ParametricKinds.

Tracing this design through history, I think it came from work on the patch described in #6093 (closed). There, Simon proposed (and heard no complaints about) allowing polymorphic recursion if and only if a "complete user-supplied kind signature" ("cusk") is given. See section 7.8.3 of the manual. However, a cusk is not possible for classes, giving Edward no way to proceed here.

At the end of the comment trail of #6093 (closed), Simon asks for a better design. Here goes: allow recursive instantiations of a kind variable to vary if and only if that kind variable is mentioned in a type. I'm being quite literal here in my meaning of mentioned -- the kind variable must simply appear syntactically in the code. By appearing in the source, GHC can know to generalize over the kind variable "early on" and then can deal with the polymorphic recursion. With this design, the need for cusks goes away.

I think the initial reason that cusks came about in the first place is from a perceived need to either allow polymorphic recursion or to disallow it. In terms, this is straightforward: there is either a type signature or there isn't. In types, however, Haskell syntax allows for various forms of partial kind signatures. But, we had the desire for a simple "yes" or "no" answer about polymorphic recursion, and so the cusk was born. What I'm proposing here is to embrace the gray area between "yes" and "no" and permit some polymorphic recursion -- specifically, over those kind variables that are mentioned in the source.

Is this history accurate, from those involved (that is, mostly, Simon)? Do we see code that would be accepted under my proposal that would be surprising to users? Happily, implementing my proposal is dead easy -- just twiddle kind-inference strategies, as my proposal is called NonParametricKinds there.

As some context, here is an example from closed type families (the one place where NonParametricKinds is used) that is confusing:

type family LooksLikeId (x :: k) :: k where
  LooksLikeId x = Bool
  LooksLikeId x = Maybe

The kind of LooksLikeId surely looks like that of an identity function: k -> k. But, because type families are not parametric (hence the name NonParametricKinds), we can inspect the kind and branch on it. Note that the equations above do not overlap, as they are distinguished by their kind parameters. The definition above would fail to type-check without the k annotation, as that is necessary for GHC to know to generalize over the kind instead of just solve for it.

Thoughts on any of this? Am I deeply misunderstanding some properties of type inference here?

That's very interesting Richard. I had not followed this particular aspect of your implementation of closed type families.

The new proposed strategy

I would explain the strategies rather differently to you.

Baseline strategy (BASELINE), originally due to Mark Jones. Here is the strategy that we currently follow for ordinary, recursive term-level functions, and for recursive data types. I'll describe it for data types, with this example:

   data S f a b = MkS (T a f) (S f a b)
   data T (a::k) (f::k -> *) :: * where
      MkT :: f a -> S f a Maybe -> S f a Int -> T a f

Identify which type constructors have Complete User Type Signatures (CUSK). In this example, T does. Extend the environment with these, fixed, kinds:

       T :: forall k. k -> (k->*) -> *

Perform strongly-connected component analysis on the non-CUSK decls, ignoring dependencies on a type constructor with a CUSK. In our example, we get a single recursive component, containing S.
For each s-c component in turn:

* Bind the type constructor to a fresh meta-kind variable:

        S :: kappa0

Kind-check all the declarations of the component in this environment. This will generate some unifications, so in the end we get

        kappa0 ~ (kappa1 -> *) -> kappa1 -> kappa2 -> *

The kappa1 arises from instantiating T at its call site in S

Generalise. So we get

        S :: forall k1 k2. (k1->*) -> k1 -> k2 -> *

Extend the environment with these generalised kind bindings, and kind-check the CUSK declarations.

The Key Point is that we can kind-check S ''without looking at T's definition at all'', because we completely know T's kind. That in turn means that we can exploit inferred polymorphism for S when kind-checking T. As we do here: T uses S in two different ways (S f a Maybe) and (S f a Int).

'''Richard's strategy*' (RICHARD). The key idea is that *all polymorphism is declared'', so nothing gets to be kind-polymorphic unless you say so. But the payoff is that you can give partial kind signatures. I'm not going to call it !NonParametric becuase I think that's a terrible name. Here's the strategy.

Sort the declarations into strongly-connected components. No special treatment for CUSKs.
For each declaration, extend the environment with a kind binding that has a forall for each explicit user-written kind variable, but meta-kind variables otherwise. For example

      data Foo (a :: k1 -> k1) b c = ...

would get a kind binding

      Foo :: forall k1. (k1->k1) -> kappa1 -> kappa2 -> *

Our earlier example would give

      T :: forall k. k -> (k->*) -> *
      S :: kappa3 -> kappa4 -> kappa5 -> *

Kind-check the declartions in this environment. At a call of Foo, say, we'd instantiate the forall k1 with a fresh meta-kind variable, but would share kappa1, kappa2 among all calls to Foo.
Default any unconstrained meta kind variables to *

That's it! No generalisation step. The only polymorphism is that declared by the user.

So our earlier S/T example would be rejected because it relies on S being polymorphic in its third parameter. If you want the S/T example to work you could write

   data S f (a::k1) (b::k2) = MkS (T a f) (S f a b)
   data T (a::k) f where
      MkT :: f a -> S f a Maybe -> S f a Int -> T a f

That's enough to ensure that S and T's kinds start with

     S :: forall k1 k2. ...blah...
     T :: forall k. ...blah...

Type signatures

Another place that we currently (i.e. using (BASELINE)) do kind generalisation is in type signatures. If you write

f :: m a -> m a 
f = ...

then the type signature is kind-generalised thus:

This user-written signature 
  f :: m a -> m a 
means this (BASELINE)
  f :: forall k (a:k) (m:k->*). m a -> m a

And f's RHS had better be that polymorphic.

Under (RICHARD) it would be consistent to say this:

This user-written signature 
  f :: m a -> m a 
means this (RICHARD)
  f :: forall (a:*) (m:k->*). m a -> m a

If you want the kind-polymorphic one, you'd have to write thus

This user-written signature 
  f :: forall k (a:k) (m:k->*). m a -> m a
means this (RICHARD)
  f :: forall k (a:k) (m:k->*). m a -> m a

Declarative typing rules

Happily (RICHARD) has a nice declarative typing rule.

Here is what the conventional declarative typing rule, in the absence of polymorphism for a single self-recursive function looks like:

        G, f:t |- e:t
        G, f:t |- b:t'
      ---------------------------
        G |- letrec f = e in b : t'

Here the "t" is a monotype (no foralls) that the declarative typing rules clairvoyantly conjures up out of thin air.

Once you add Hindley-Milner style polymorphism, the rule gets a bit more complicated

        G, f:t |- e:t
        G, f:gen(G,t) |- b:t'
      ---------------------------
        G |- letrec f = e in b : t'

where 'gen' is generalising.

The (RICHARD) rule might look like this:

        t = forall vs. sig[t1..tn/_]
        G, f : t |- e : forall vs.t
        G, f : t |- b:t'
      ---------------------------
        G |- letrec f :: forall vs. sig; f = e in b : t'

Here I'm expressing the user-specified knowledge as a signature forall vs.sig, with '_' for bits you don't want to specify.

       f :: forall a. _ -> a -> _

Then the rule intantiates each '_' with a clairvoyantly guessed monotype (which might mention the 'vs', or 'a' in this example), and off you go.

Reflection

I think we could reasonably switch to (RICHARD) throughout.

As Richard's comments in TcHsType point out, we don't want maximal polymorphism. His example is:

    type family F a where
      F Int = Bool
      F Bool = Char

We could generate

   F :: forall k1 k2. k1 -> k2

so that (F Maybe) is well-kinded, but stuck. But that's probably not what we want. It would be better to get F :: * -> *

But what about

    type family G a f b where
      G Int  f b = f b
      G Bool f b = Char -> f b

You could just about argue that the programmer intends

   F :: forall k. * -> (k->*) -> k -> *

It's quite similar to this:

  data PT f a = MkPT (f a)

which today, using (BASELINE), we infer to have kind

  PT :: forall k. (k->*) -> k -> *

But I'd be perfectly happy if PT got a monomorphic inferred kind, which is what (RICHARD) would do:

  PT :: (*->*) -> * -> *

If you want the poly-kinded PT, use a signature:

  -- Any of these would do
  data PT f             (a :: k) = MkPT (f a)
  data PT (f :: k -> *) a        = MkPT (f a)
  data PT (f :: k -> *) (a :: k) = MkPT (f a)

One oddity is that we'd do (BASELINE) for terms and (RICHARD) for types. But perhaps that's ok. They are different.

Terms ought to be as polymorphic as possible but arguably not types. Examples above. Also, since kind polymorphism is still in its infancy, maybe it's no bad thing if all kind polymorphism is explicitly signalled every time a kind-polymorphic binder is introduced.
Terms have well-established separate type signatures, but we don't have a syntax for separate kind signatures of types and classes.

If we moved from (BASELINE) to (RICHARD), some programs that work now would fail:

the original S/T example above
a data type like PT where the user did actually want the kind-polymorphic version.

But that might be a price worth paying for the simplicity, uniformity, and predictability you'd get in exchange.

Just a thought, in our compiler for Ermine we use a strategy very close to Mark P Jones' strategy with CUSKs.

However when we have a type that is not fully what you'd call a CUSK because at least one of the type variables are unannotated we actually allow it to participate in a binding group SCC like any other type.

We instantiate just those unannotated positions in the type with fresh kind variables not all.

Then we run through the binding group checking subsumption for each, letting those checks accrete constraints on the unannotated types before we generalize over the whole SCC.

A universe level down the code looks like:

https://github.com/ermine-language/ermine/blob/master/src/Ermine/Inference/Type.hs#L150

At the kind level the code is in a local branch and I'll push it out to github soon.

This lets us fit somewhere in the middle between Mark P Jones' strategy and Richard's.

Unannotated variables participate in cycles and unify out to whatever they must, while kind variables supplied explicitly annotated types can't be forced to unify with them without yielding skolem escapes.

The rule then becomes that if you annotate some types with a kind involving explicit kind variable in a class declaration all kinds that unify with that kind variable will need to be annotated, but other kinds can be left untouched.

In exchange you don't get more specific kinds than you ask for #9201 (closed), we can handle partially annotated polymorphic recursion like occurs here, and existing code so long as it doesn't rely on #9201 (closed) would still typecheck.

I think Simon got one key detail in (RICHARD) wrong: I do propose to generalize after kind-checking all the declarations, not defaulting to *.

The way I see it is this: there is "good" polymorphism and "bad" polymorphism.

Here is "good":

type family IsJust x where
  IsJust Nothing = False
  IsJust (Just x) = True

Without any annotation, I would want to infer forall k. Maybe k -> * as the kind for IsJust. This goes against Simon's conclusion that all kind polymorphism should be annotated. I actually would argue that the genie about this sort of polymorphism is out of the bottle, and that Simon's proposal would be too big a breaking change.

Here is "bad" polymorphism:

type family F x where
  F Bool = True
  F Maybe = False

It is perfectly reasonable to infer forall k. k -> * for the kind of F. Indeed, this could even be implemented without too much trouble. (I think I did implement this behavior along the way to the current behavior for closed type families!) But, I think most users would prefer to get a kind error in this case. If they want this polymorphism, use an annotation.

What's the difference between "good" and "bad" polymorphism? "Good" is parametric and "bad" is non-parametric. Hence the name NonParametricKinds.

The examples above come from type families. Extending these ideas to datatypes is awkward because datatypes don't (currently) support kind-indexing (essentially, being GADT-like in kind parameters) in the right way. The closest we can really get is to think about polymorphic recursion, which is where this all started. Thus, in the context of datatypes and classes, I agree that the name NonParametricKinds is bogus.

What this all leads to is this: we want to allow "good" polymorphism without an annotation but to allow "bad" polymorphism only if specifically requested. This is exactly the story with closed type families today. And, if I'm understanding all the details correctly, in the datatype/class world, this means that we get non-polymorphic recursion to work all the time, but polymorphic recursion only when requested. After all, if you cross your eyes, polymorphic recursion can look a little like non-parametric polymorphism: both are cases when a type parameter is used inconsistently in a definition.

I think the strategy I'm outlining is not unlike the behavior Edward describes in Ermine. The only difference I can spot is that my strategy allows unification variables to unify with skolems, where Ermine's seems to avoid this. For example:

type family G x (y :: k) :: Constraint where
  G x y = (x ~ y)

Here, the kind of x must be k, but that's acceptable. Would that fail in Ermine? I could certainly see that failure here could be reasonable.

As a small aside, I wonder how any of this interacts with the coming partial type signatures at the term level. How do those interact with polymorphic recursion?

Currently we'd reject that, but if you can come up with a story that allows it I'd happily switch.

We can (and do) use it for partial type signatures by combining it with HMF style "some" annotations, but since we disallow the partial types from referencing the quantified ones (since some can only occur in an Annot, outside of the type) it is perhaps less useful than a design that can do what you want could be.

What's insufficient about the algorithm I've described here? The G above is accepted by today's GHC. I don't see where a skolem escape would fail.

Ermine's algorithm for this effectively has an additional quantifier, some. Your G definition is effectively equivalent to:

u : a -> a -> T

g : some a. forall b. a -> b -> T
g x y = u x y

'some' is effectively a quantifier for unification variables. So for instance, if I wrote:

h : some a b. a -> b -> T
h x y = u x y

checking would succeed, with h : a -> a -> T. Also, some cannot occur arbitrarily in a type. It can only occur outer-most in a type ascription.

The reason ermine rejects g right now is that the metavariable a is at an outer level of binding. They cannot be generalized over until the entire SCC is done being checked, while g is expecting to be polymorphic in b. This is the kind of check that would fail for an example like:

g : some a. forall b. a -> b -> T
g x y = h x y

h y x = g x y

We could try to be more intelligent, and do additional generalization as long as skolems wouldn't flow into other types in the binding group, or something, but we haven't gotten that fancy yet.

Having thought about this some, I realized that there's no reason to fail due to skolems 'escaping' into other types of the binding group, since we're about to generalize all the types anyway. So I've tweaked our algorithm to not care about that. Now the equivalent of your G example checks, and so does my example above:

g : some a. forall b. a -> b -> T
g x y = h x y
h y x = g x y

yields:

g : forall b. b -> b -> T
h : forall b. b -> b -> T

This seems sensible to me, and sounds like your proposal.

I have started a wikipagetorecordthesealternativedesigns.
I'm afraid that I do not understand how you distinguish "good" from "bad" polymorphism. Examples are not enough!
I'm afraid I do understand Ermine's algorithm.

Could I ask you to write a section on the wiki page that describes as clearly, as you possibly can, the designs you have in mind? That would make sure we are all talking about the same thing.

By the same token, if you don't understand anything in the designs I have described there, please say so and I'll clarify.

I'm adding Stephanie and Dimitrios in cc

Simon

Added some comments (delineated with Richard: ... End Richard) to the [GhcKinds/KindInference wiki page]. I believe the (PARGEN) option you describe there agrees with my proposal. I also believe that my proposed algorithm now (with dolio's changes to Ermine) matches Ermine's.

Having thought some more, I feel like there is potential weirdness with the changes I mentioned earlier, even if it works nicely for some simple examples. For instance, consider:

u : a -> a -> a

f : some a b. a -> a -> b
f x y = u (g x y) (h x y)

g : some c b. forall a. c -> a -> b
g x y = f x y

h : some c b. forall a. c -> a -> b
h x y = f x y

Now, since f takes two arguments of the same type, this requires each of g and h to do the same. But, to ensure g and h are sufficiently polymorphic, we need to skolemize. These skolems then flow into the c unification variables, which in turn are unified due to f. So this will cause a skolem mismatch.

So, I'm not sure it's a good idea for these sorts of examples to work, even if they could in some cases. It seems like it could easily be confusing when similar examples happen to work or not work depending on where skolems flow. So I guess I'm back to thinking that it's better for your G example to not check:

u : a -> a -> T

g : some c. forall k. c -> k -> T
g x y = u x y

simply because it's due to a less confusing rule.

Ermine has, of course, always generalized unification variables that are still free after this process; that is what is expected for types. I think it's handy for kinds as well (probably excepting the 'bad polymorophism' examples).

Having conferred with Ed, I guess GHC uses a different strategy that doesn't involve (proper) skolem variables, so it might not be subject to my complaint. Using some kind of after-the-fact distinctness check instead of proper skolems seems not to arbitrarily rule out certain definitions in a confusing manner, which was my main concern.

Richard, I have tried to address your questions on the wiki page. Better now?

I still don't know what the difference between bad and good polymorphism is, but maybe it doesn't matter now, since you say that it is "tangential".

You say "Because open type families do not have a body, they would still need their own kind inference story, where unconstrained meta-variables default to *." but I don't understand that. Add a subsection for it? (Open type families simply have their own, complete, kind signature, no?

I'm happy with (PARGEN) if you are happy with the typing rule including the side condition.

Simon

Happily, I think everyone here has converged on (PARGEN) with the critical side condition that we do not unify a meta-variable with a kind containing a skolem. I realized yesterday that the side condition was necessary but didn't have the chance to write it up.

In any case, I think we can end the design phase here and move onto the implementation phase. Most of (PARGEN) should be easy, as (PARGEN) is exactly what is implemented by NonParametricKinds, except for the side condition. To implement the side condition, I propose a new disjunct for MetaInfo named NonSkolemTv which is like TauTv but cannot unify with a type containing a skolem. The unifier, in !TcUnify, would just check this condition at the appropriate point. Sound reasonable?

And, yes, open type families keep their complete kind signature.

Trac field	Value
Version	7.8.2
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler (Type checker)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC	ekmett
Operating system
Architecture

Milner-Mycroft failure at the kind level

Child items 0

Activity

The new proposed strategy

Type signatures

Declarative typing rules

Reflection

Milner-Mycroft failure at the kind level

Relates to

Activity

The new proposed strategy

Type signatures

Declarative typing rules

Reflection