|Version 10 (modified by 8 years ago) (diff),|
A Kind System for GHC
Currently thinking about adding a more expressive Kind System to GHC. This document is currently very WIP and does feature mistakes...
Haskell has a very powerful and expressive static type system. The problem is when you come to do any programming at the type level, that is typed by an unsatisfactorily inexpressive kind system. We propose to make it just a little bit stronger.
Note: the aim here initially is to implement a small useful extension to Haskell, and not to retrofit the entirety of e.g. Omega into GHC yet ;))
Consider the simple example of lists parameterised by their lengths. There are many variations on this theme in the Haskell lore, this authors favourite follows thusly:
data Zero data Succ n data List :: * -> * -> * where Nil :: List a Zero Cons :: a -> List n a -> List a (Succ n)
There are many eugh moments in this code:
- We first declare two new types (
Succ), and, thanks to the EmpyDataDecls extension, say that they are uninhabited by values (except bottom/error).
* -> *, so it is perfectly valid to create a haskell function with a signature:
foo :: Zero -> Succ Zero -> Bool
Really the programmers intent is that
Succare in a disjoint namespace from *-kinded types, and thus this function signature should be disallowed.
* -> *, whereas really the programmer wants to enforce that the argument to
Succwill only ever consist of
Succs. i.e. the
* -> *kind given to
Succis far to relaxed.
- We then declare a new data type to hold lists parameterised by their lengths.
* -> * -> *, which really doesn't tell us anything other than its arity. An alternative definition could have been:
data List item len where ..., although this adds only pedagogical information, and nothing new that the compiler can statically check.
Consconstructor actually has a mistake in it. The second argument (
List n a) has the names to the type parameters flipped. The compiler cannot detect this, and the error will become apparent at use sites which are at a distance from this declaration site.
- Nothing stops a user creating the silly type
List Int Inteven though the intention is that the second argument is structured out of
We propose to add new base kinds other than
* using a simple notation. The above example could become:
data kind Nat = Zero | Succ Nat data List :: * -> Nat -> * where Nil :: List a Zero -- Cons :: a -> List n a -> List a (Succ n) -- Compiler detects error Cons :: a -> List a n -> List a (Succ n)
- We first declare a new kind
Nat, that is defined by two types,
Succare types, they do not classify any haskell values (including undefined/bottom). So the
foo :: Zero -> Succ Zero -> Booltype signature from earlier would be rejected by the compiler.
- We then declare the type
List, but we now say the second argument to
Listhas to be a type of kind
Nat. With this extra information, the compiler can statically detect our erroneous
Consdeclaration and would also reject silly types like
List Int Int.
The idea would be to mirror existing Haskell data declarations. There is a clear analogy as we are now creating new kinds consiting of type constructors as opposed to new types consisting of data constructors.
To destinguish kind declarations from data declarations we can either add a new form of kind declaration:
kind Bool = True | False
However this steals
kind as syntax with the usual problems of breaking existing programs.
Alternatively (preferably), we can add a modifier to data declarations to indicate that we mean a kind declaration:
data kind Bool = True | False
Interaction with normal functions
Functions cannot have arguments of a non * kind. So the following would be disallowed:
bad :: Zero -> Bool -- Zero has kind Nat
This follows straighforwardly from the kind of (->) in GHC already:
?? -> ? -> *, see IntermediateTypes
Type variables may however be inferred to have non-* kinds. E.g.
data NatRep :: Nat -> * where ZeroRep :: NatRep Zero SuccRep :: (NatRep n) -> NatRep (Succ n) tReplicate :: forall n a . NatRep n -> a -> List a n ...
In the above,
n would be inferred to have kind
a would have kind
Interaction with GADTs
GADTs can already be annotated with a mixture of names with optional explicit kind signatures and just kind signatures. These kind signatures would now be able to refer to the newly declared, non-* kinds. However the ultimate kind of a GADT must still be
data Ok a (b :: Bool) :: Nat -> * where OkC :: Ok Int True Zero OkC' :: Ok String False (Succ Zero) data Bad a :: Nat -> Nat where -- result kind is not * ...
In the above example, there is the question of what kind we should assign to
Ok. Currently it would be inferred to be
*. That inference engine would need to be improved to include inference of other kinds.
GADT constructors must only accept arguments of kind
* (as per the restrictions on (->) described below), but may also collect constraints for the kind inference system.
TODO are there real ambiguous cases? _Assuming_ data types have their kind signatures inferred before functions are type checked and must be monomorphic in their kinds, I don't see how there could be unless a variable is totally unconstrained (i.e. not mentioned)
foo :: forall a . Int
However this is accepted (6.8.3), although ghci drops the 'a'. Even if it was used in a scoped setting (TODO example of where that makes sense without a type class grounding it), the moment it is used it'll get a kind constraint. Do PolymorphicKinds break this assumption?
Interaction with Type Classes
Type classes are currently indexed by variables with a fixed kind. Type classes could now be indexed by variables with non-value kinds. E.g.
class LessThanOrEqual (n1 :: Nat) (n2 :: Nat) -- ok instance LessThanOrEqual Zero Zero instance LessThanOrEqual n m => LessThanOrEqual n (Succ m)
This example is ill-kinded though:
class Bad x -- Defaults to x::* instance Bad Int -- OK instance Bad Zero -- BAD: ill-kinded
By default declaration arguments are inferred to be of kind
* if there is nothing in the class declaration (member functions or explicit kind signature) to change this. This seems sensible for backward-compatibility.
Interaction with Type Synonym Families
TODO Also see: ClosedTypeFamilies
Interaction with Data Type Families
Also see PolymorphicKinds which this would build upon...
Data kinds could also be parameterised by kinds in the same way that data types can be parameterised by types. This will require polymorphic kinds, see below:
data kind Maybe k = Nothing | Just k
Maybe has sort
* -> *,
Nothing has kind
forall k . Maybe k and
Just has kind
forall k . k -> Maybe k.
A detour of Sorts
GHC currently allows users to specify simple kind signatures. By allowing declaration of other kinds, and parameterisation of kinds, we will require kinds to have sorts. Initially we may want to push everything up one layer, so our language of sorts is generated by the sort that classifies kinds
*, or functions
sort -> sort.
This means we could allow explicit sort-signatures on kind arguments, e.g.:
TODO think really hard about this example.
data kind With (k :: * -> *) = WithStar (k *) | WithNat (k Nat) data Blah :: * -> With Maybe -> * where B1 :: Int -> Blah (WithStar (Just Int)) B2 :: Int -> Blah (WithNat Nothing) -- type error!
Alt formulation of With using GADK syntax. Does this help?
data kind With :: forall (k :: * -> *) . k -> * where WithStar :: (k *) -> With k WithNat :: (k Nat) -> With k
It might also be nice to support GADK (Generalized Algebraic Data Kind) syntax for declaring kinds, ala:
data kind Maybe :: * -> * where Nothing :: Maybe k Just :: k -> Maybe k
Again, note that
Maybe above is decorated with a
data kind Maybe k where Nothing :: Maybe k Just :: k -> Maybe k
At the moment I haven't thought about existential kinds or kind-refinement that GADK syntax makes natural. Probably beyond the scope of this work, but should be open for someone to add in the future. TODO think about motivating examples.
Note: I don't think it's worth having existential kinds without kind-refinement as we don't have kind-classes, and so no user could ever make use of them. Kind refinement does allow existential kinds to make sense however (or at least be usable). The question then is when does kind-refinement come into play - pattern matches. TODO generate some examples to motivate this.
With bits stolen from IntermediateTypes
There are already 2 sorts,
CO, for kinds that classify types, and for coercions that classify evidence.
Previously I've called sort
* now has many overloaded meanings in this document:
- As a term, it is an infix function
- It has no meaning as a type
- As a kind it is the kind of all lifted types
- As a sort it is the sort of all kinds that parameterise types
Option 1 : Just use
Since all these meanings are in different namespaces, they arn't ambiguous and can be left as-is.
- Short, neat and syntactically light
- Simple to lex
- Probably confusing without familiarity to the namespaces
- Not future compatible if we were to support arbitary stratification
Option 2 : Pick new symbols
Explicit, new names. E.g.
* for terms,
@ for kinds,
& for sorts (insert your own choices).
- Hopefully simpler to lex
- Even more symbols to learn
Option 3 : Levelled Stars
Omega style level names. So:
* is a term,
*1 is a kind,
*2 is a sort etc.
Possibly with some form of aliases to make the common levels more memorable, e.g.
*k for kinds,
*s for sorts.
- Future compatible for arbitary stratification
- Aliases are mnemonic
- Possibly tricky to parse/lex
Of course, if we are going to worry about
* at different levels, the same could apply to other machinary that is applied at different levels (I'm looking at you (->)).
With reference to: TypeNaming
Strictly, the new kinds that have been introduced using
data kind syntax inhabit a new namespace. Mostly it is unambiguous when you refer to a type and when you refer to a kind. However there are some corner cases, particularly in module import/export lists.
Option 1 : Collapse Type and Kind namespace
- Follows behaviour of type classes, type functions and data type functions.
- Inconsistent. It would allow the user to create
Falseas types, but not to be able to put them under kind
Bool. (You'd need to name your kind a
Option 2 : Fix ambiguities
- As more extensions are put into the language, it'll have to happen sooner or later
- Will involve creating a whole new namespace
- Several corner cases
Auto Promotion of Types to Kinds
Many simple data declarations it would be convinient to also have at the type level. Assuming we resolve the TypeNaming and ambiguity issues above, we could support automatically deriving the data kind based on the data.
There are some other issues to be wary of (care of Simon PJ):
- Auto lifting of:
data Foo = Foo Int
Automated lifting this would try and create a kind
Foowith an associated type
Foo. But we've just declared a type
Fooin the data declaration.
- Automatic lifting of GADTs / existentials and parametric types is tricky until we have a story for them.
- Automatic lifting of some non-data types could be problematic (what types parameterise the kind
- We have no plan to auto-lift term *functions* to become type functions. So it seems odd to auto-lift the type declarations which are, after all, the easy bit.
Syntactically however, there are some options for how to do this in cases when it is safe to do:
Option 0: Always promote [when safe]
data Foo = Bar | Baz
will impliclty create a kind
Foo and types
Option 1: Steal the
This has an advantage of allowing standalone deriving for those data types that are declared elsewhere but not with Kind equivalents
data Maybe a = Nothing | Just a deriving (Kind) deriving instance (Kind Bool)
Option 2: Add an extra flag to the
data and kind Maybe a = Nothing | Just a
This has the problems of verbosity and is hard to apply after the fact to an existing data type.
Simple type synonyms have a natural analogy at the kind level that could be a useful feature to provde. Depending on whether we keep the kind and type namespaces separate (above) we could just abuse the current
type Foo = Either Baz Bang syntax to also allow creating
kind synonyms, or if we need to invent some new syntax.
kind Foo = Either Baz Bang would seen natural, or perhaps more safely
type kind Foo = Either Baz Bang.
newkind doesn't make sense to add as there is no associated semantics to gain at the type level that
data kind doesn't already provide.
TODO clean up, delete or something
data kind List k = Nil | Cons k (List k)
This lets us represent a list of types of any kind, and it seems natural to write a function to give us it's length:
type family ListLength :: List k -> Nat type instance ListLength Nil = Zero type instance ListLength (Cons _ rest) = Succ (ListLength rest)
So here we want to be able to index type families by a polykind.
TODO I'm not so sure about the motivation of this anymore...
However any data-level instantiations of the
List k must be monomorphic. E.g. Lists parameterised by the types of their elements:
data SafeList :: (List *) -> * where Nil :: SafeList Nil Cons :: a -> SafeList a rest -> SafeList (Cons a rest) safeHead :: SafeList (Cons a rest) -> a safeHead (Cons a _) = a
Or the value reflection of a bit list:
data StaticBitString :: List Bool -> * where BSTrue :: StaticBitString rest -> StaticBitString (Cons True rest) BSFalse :: StaticBitString rest -> StaticBitString (Cons False rest)
data Blah :: forall_kind k . List k -> Nat (Length k) -> * WitnessEmpty :: Blah Nil Zero WitnessSucc :: Blah rest n -> Blah (Cons ? rest) (Succ n)
What is the ? above?