Defining kinds without an associated datatype
This page tracks feature requests for declaring closed data kinds without associated data types (#6024 (closed)) and declaring open data kinds that can be freely extended after they are declared (#11080). What comes below is a design proposal that is not yet implemented (as of Jan 2015). Main person responsible for working on the implementation is Jan Stolarek (JS).
#6024 (closed))
Motivation for closed data kinds (When using -XDataKinds
GHC automatically promotes every datatype to a kind,
and its constructors to types. This forces us to declare a datatype for every
kind. However, sometimes we are not interested in the datatype at all, only on
the kind. Consider the following data kind that defines a small universe for
generic programming:
data Universe star = Sum (Universe star) (Universe star)
| Prod (Universe star) (Universe star)
| K star
This universe comes with an associated interpretation:
data Interpretation :: Universe * -> * where
L :: Interpretation a -> Interpretation (Sum a b)
R :: Interpretation b -> Interpretation (Sum a b)
Prod :: Interpretation a -> Interpretation b -> Interpretation (Prod a b)
K :: a -> Interpretation (K a)
In this case, having to declare a datatype for Universe
has two disadvantages:
-
We cannot use kinds (such as
*
) while defining a datatype, so we are forced to makeUniverse
a parametrised datatype, and later always instantiate this parameter to*
(like in the kind ofInterpretation
). Note: this is no longer the case - see below. -
We lose constructor name space, because the datatype constructor names will be taken, even though we will never use them to construct terms. So
Prod
andK
cannot be used as constructors ofInterpretation
as above, because those are also constructors ofUniverse
.
#11080)
Motivation for open data kinds (Users might want to create type-level symbols for the purpose of indexing types.
In the past one way of doing this was by using -XEmptyDataDecls
. But symbols
created in this way were always placed in *
and that does not allow to use
kinds to limit what types are admitted as indices. -XDataKinds
allows to
create symbols that are assigned a kind other than *
but these kinds are
closed and adding new symbols is not possible. Thus:
- We want a way of defining open kinds that can be later extended with new inhabitants.
Solution
I (JS) propose that the mechanism for declaring closed and open data kinds
becomes part of -XDataKinds
. The proposal is backwards compatible.
Closed kinds
Starting with GHC 8.0 users can use -XTypeInType
extension to write:
data Universe = Sum Universe Universe
| Prod Universe Universe
| K (*)
This addresses disadvantage (1) but still leaves us with disadvantage (2). So the idea behind #6024 (closed) is to let users define things like:
-- closed kind using H98 syntax
data kind Universe = Sum Universe Universe
| Prod Universe Universe
| K (*)
-- closed kind using GADTs syntax
data kind Universe where
Sum :: Universe -> Universe -> Universe
Prod :: Universe -> Universe -> Universe
K :: * -> Universe
By using data kind
, we tell GHC that we are only interested in the Universe
kind, and not the datatype. Consequently, Sum
, Prod
, and K
will be valid
only in types only, and not in terms.
Open kinds
Open data kinds would be declared using following syntax:
-- open kind
data kind open Universe
data kind member Sum :: Universe -> Universe -> Universe
data kind member Prod :: Universe -> Universe -> Universe
data kind member K :: * -> Universe
Note that open kinds can be parametrized just like closed kinds:
data kind open Dimension :: *
data kind member Length :: Dimension
data kind open Unit :: Dimension -> *
data kind member Meter :: Unit 'Length
data kind member Foot :: Unit 'Length
Caveats
Kind and Type Namespaces
Currently GHC has separate namespaces for types and data constructors. We have
a simple rule: all data constructors go into data namespace. With -XDataKinds
promoted data constructors still live in data constructor namespace and there is
a hack in the renamer: when renaming types it first looks for a symbol in type
namespace and if that fails then it searches for the symbol in the data
namespace.
Assume we have:
data kind Foo = MkFoo
In order to resolve disadvantage (2), ie. not pollute data constructor namespace
with MkFoo
, we would have to put MkFoo
in the type namespace. This means
that our simple rule "data constructors go into data namespace" would have to be
broken.
Richard Eisenberg argues
that this is bad and in the case of above declaration
MkFoo
should go into data namespace. But that does not solve disadvantage (2)
and thus misses the point of #6024 (closed) (given that disadvantage (1) is already
solved by -XTypeInType
). Richard also argues that members of an open data
kind should also be placed in data namespace. Putting MkFoo
into data
namespace will also allow us to have quite good error messages from the
typechecker, rather than cryptic error messages from the renamer about things
being out of scope.
Non-promotable data types?
Let's assume for a moment that we decide to place kind-only constructors in the
type namespace (ie. not follow Richard's proposal). Consider again the example
of Universe
kind and Interpretation
data type. Enabling -XTypeInType
makes GADTs promotable. This means that data constructors K
and Prod
if
Interpretation
data type could be validly used in types. But this would lead
to name collission with K
and Prod
constructors of Universe
kind. There
would be no way of disambiguating whether K
refers to constructor of
Universe
or promoted constructor of Interpretation
. We don't want to end up
in a situation where some of the data constructors can be promoted (L
, R
)
and some can't (K
, Prod
). So we would need to make Interpretation
data
type unpromotable. But detecting that seems Real Hard.
Recursive Groups
We need to be careful about recursive groups. For example, this is valid:
data S = S T
data T = T S
but this is not:
data kind S = S T
data T = T S
Future-proofing the design
GHC is growing more and more type level symbols. These symbols vary in their properties like generativity, injectivity, matchability or being open/closed - see 9840#comment:6 for an overview. Here we propose adding yet another way of defining symbols. Can we introduce more order into world of type-level symbols? Can we have some unifying syntax? Can we anticipate what kind of symbols we might want to have in the future?
Alternative Notations
- Use
data only
instead ofdata type
. - Use
'data
instead ofdata kind
- Use
type data
instead ofdata kind
- Use
data constructor
instead ofdata kind member
- Use
data extension Unit where { Meter :: Unit; Foot :: Unit }
instead ofdata kind member