|Version 3 (modified by chak, 9 years ago) (diff)|
Type Functions: Desugaring
Desugaring indexed data types
The kind signature of an indexed data type
data T (a1::<kind1>) .. (an::<kindn>) :: <kind>
turns into an FC type function declaration
type T_n : <kind1> -> .. -> <kindn> -> <kind>
A member of an indexed data type
data T t1 .. tn b1 .. bm = <constructors>
turns into an equality axiom and a vanilla data declaration
axiom cTinst : (forall c1..cr. T_n t1 .. tn) :=: (forall c1..cr. Tinst c1 .. cr) data Tinst c1 .. cr b1 .. bm = <constructors>
where the ci are the free variables of the tj. Moreover, we morally replace all occurences of T in the rest of the program by T_n. No such replacement is required in the actual implementation as the arity index at FC type functions is just a formal device used in the formal development. In the implementation, it is perfectly fine to retain the original name and maintain the arity information separately.
Neverthless, we need to generate a new name for the vanilla data types representing family members (i.e., Tinst above). We use a similar mechanism as for the generation of the dictionary type constructors of type classes. In particular, we generalise the field algTcClass of the internal representation for datatypes, TyCon.AlgTyCon, to be three valued: none, Class for data types representing dictionaries, and <which structure?> for data types representing members of a family.
To ensure that the FC code generated by the above desugaring still type checks, we need to introduce cast expressions using cTinst to move between the indexed type T_n and the representation types, such as Tinst, of its members. The simplicity of type checking and desugaring indexed data types - as opposed to general type functions - is due to the locations where these casts need to be added being well defined. More precisely, there are two different kinds of locations corresponding to the introduction and elimination of indexed data types:
- Wrappers for data constructors introduce indexed types.
- Case expressions scrutinising indexed types eliminate them.
Wrappers for indexed data types
The wrapper of a data constructor acts as an impedance matcher between the source-level signatures of the constructor and its actual representation; in particular, it evaluates strict arguments and unboxes flattened arguments. In the case of a constructor for an indexed data type, it additionally has to apply the coercion between the type function representing the source type and its representation as a vanilla data type. So, for example, if we have (continuing the example from above)
data T t1 .. tn b1 .. bm = C s1 .. sk
then we generate a wrapper
C = /\c1..cr b1..bm -> \x1..xk -> Con C [c1,..,cr,b1,..,bm] [x1,..,xk] |> sym (cTinst@c1..@cr b1 .. bm)
The generation of constructor wrappers is performed by MkId.mkDataConIds.
Case expressions for indexed data types
When we scrutinise an indexed type in a case expression, we need to first cast it to the vanilla data type representing the family member from which the constructors guarding the alternatives are drawn. (This implies that we cannot have any case expression mixing constructors from two or more family members. In fact, if we had that capability, we would have open GADT definitions in the Löh/Hinze sense.)
So, whether we need to cast the scrutinee of a case expression depends on the constructors appearing in the alternatives, which are type checked by TcPat.tcConPat. This function uses TcUnify.boxySplitTyConApp to match the type of the scrutinee against the result type of the data constructor. In the case of GADTs and indexed types, this is not just a matter of extracting the arguments from the type constructor application, but we need to match against type patterns. This matching is already conveniently performed by the code for GADTs.
If the data constructor is from an indexed type, we need to propagate a coercion (to be applied to the scrutinee) outwards. For this, GHC also already has a mechanism, namely the variant CoPat of HsPat.Pat. It enables us to attach a coercion function, of type HsBinds.ExprCoFun, to a pattern, which the desugarer will pick up in Match.matchCoercion and apply to the match variable of the case expression.
ExprCoFun represents, besides coercions due to type instantiation, also type equality coercions of type Coercion.Coercion. We use them for coercions that are exactly the converse of the coercion used in the wrapper of the data constructor of the current case alternative. (There is also an equivalent of CoPat for expressions, namely HsCoerce of HsExpr.HsExpr.)
Representation of type functions after type checking
Type functions have a number of properties in common with class instances; in particular, they require a machinery for matching type patterns against types, as instance heads do during context simplification. Hence, we probably want some structure similar to InstEnv.Instance for type functions - for instances this is maintained in the field iSpec of TcEnv.InstInfo (for type functions we don't need anything like iBinds as they are pure type-level entities). If possible, it would be ideal if we can reuse (or generalise) some of the matching machinery for instance heads.
The essentials of a module after type checking are in HscTypes.ModGuts; in particular, we have two fields mg_insts :: [Instance] and mg_binds :: [CoreRule] containing all instance heads and all rewrite rules respectively. Similarly, we now want something like mg_tyequa :: [TyEqua] to represent all type equations.
Refined idea: Instead of duplicating the InstInfo/Instance infrastructure for instances of indexed types, we could just add a second variant to InstInfo. This has the advantage that functions, such as tcInstDecls1, still only have to return a list of InstInfo and not two different lists.