Version 43 (modified by chak, 8 years ago) (diff)

--

## Closure conversion without indexed types

The following scheme approaches the problem of mixing converted and unconverted code from the point of view of GHC's Core representation, avoiding the use of classes as much as possible. In particular, the scheme gracefully handles any declarations that themselves cannot be converted, but occur in a converted module. The two essential ideas are that (1) we move between converted and unconverted values/code using a conversion isomorphism and (2) we treat unconverted declarations differently depending on whether or not they involve arrows; e.g., the definition of Int by way of unboxed values (which we cannot convert) doesn't prevent us from using Ints as is in converted code.

### Conversion status

To all TyCons, DataCons, and Ids, we add a value of type

data StatusCC a
= NoCC      -- Declaration has not been converted
| ConvCC a  -- Here is the converted version


For example, Id gets a field of type StatusCC Id. A declaration thisDecl can be in one of three categories:

• NoCC: We did not convert that declaration, either because it was declared in an unconverted module or because it uses some feature that prevents conversion.
• ConvCC thisDecl: Original and converted declaration coincide (e.g., type declarations not involving arrows directly or indirectly).
• ConvCC convDecl: The variant convDecl is the closure-converted form of thisDecl.

An example of a feature that prevents conversion are unboxed values. We cannot make a closure from a function that has an unboxed argument, as we can neither instantiate the parametric polymorphic closure type with unboxed types, nor can we put unboxed values into the existentially quantified environment of a closure.

TODO We won't really store StatusCC in TyCons, DataCons, and Ids, but instead have three finite maps maintaining the same information.

### Conversion pairs

Conversion functions come in pairs, which we wrap with the following data type for convenience:

data a :<->: b = (:<->:) {to :: a -> b, fr ::b -> a}


The functions witness the isomorphism between the two representations, as usual.

### Converting type declarations

#### Preliminaries

The alternatives of TyCon.TyCon get a new field tyConCC :: StatusCC TyCon. This field is NoCC for data constructors for which we have no conversion and ConvCC T_CC if we have a conversion, where the converted declaration T_CC may coincide with T. In the latter case, there is also a conversion constructor isoT between values inhabitating types formed from the original and converted constructor. The type of a conversion constructor is determined by the kind of the converted type, as follows:

isoTy (t::k1->k2) = forall a a_CC.
isoTy (a::k1) -> isoTy (t a::k2)
isoTy (t::*)      = t :<->: t^


where type conversion t^ is defined below.

As an example, consider

data T (f::*->*) = T1 (f Int) | T2 (f Bool)


The type of the conversion constructor is as follows (using more meaningful type variable names):

isoTy (T::(*->*)->*) =
forall f f_CC.
(forall a a_CC.
(a :<->: a_CC) -> (f a :<->: f_CC a_CC)) ->
T f :<->: T_CC f_CC


The conversion constructor might be implemented as

isoT isof = toT :<->: frT
where
toT (T1 x) = T1 (to (tff tfInt ) x)
toT (T2 y) = T2 (to (tff tfBool) y)
frT (T1 x) = T1 (fr (tff tfInt ) x)
frT (T2 y) = T2 (fr (tff tfBool) y)


where isoInt and isoBool are the conversion constructors for Ints and Bools.

Moreover, we represent closures - the converted form of function arrows - as follows:

data a :-> b = forall e. !(e -> a -> b) :$e isoArr :: a :<->: a_CC -- argument conversion -> b :<->: b_CC -- result conversion -> (a -> b) :<->: (a_CC :-> b_CC) isoArr (toa :<->: fra) (tob :<->: frb) = toArr :<->: frArr where toArr f = const (tob . f . fra) :$ ()
frArr (f :$e) = frb . f e . toa  So, the function array constructor (->)::*->*->* has a StatusCC value of ConvCC ((:->) with isoArr as the conversion constructor. Closure application is defined as ($:) :: (a :-> b) -> a -> b
(f :$e)$: x = f e x


#### Conversion rules

If a type declaration for constructor T occurs in a converted module, we need to decide whether to convert the declaration of T. We decide this as follows:

1. If the declaration of T mentions another algebraic type constructor S with tyConCC S == NoCC, we cannot convert T and set its tyConCC field to NoCC as well.
2. If all algebraic type constructors S that are mentioned in T's definiton have tyConCC S == ConvCC S, we do not convert T, but set its tyConCC field to ConvCC T and generate a suitable conversion constructor isoT. (NB: The condition implies that T does not mention any function arrows.)
3. If the declaration of T uses any features that we cannot (or for the moment, don't want to) convert, we set its tyConCC field to NoCC - except if Case 2 applies.
4. Otherwise, we generate a converted type declaration T_CC together a conversion constructor isoT, and set tyConCC to ConvCC T_CC. Conversion proceeds by converting all data constructors (see below).

Moreover, we handle other forms of type constructors as follows:

• FunTyCon: It's StatusCC value was defined above. We handle any occurence of the function type constructor like that of an algabraic type constructor with the StatusCC value given above, but we may not want to explcitly store that value in a field of FunTyCon, as (:->) would then probably need to go into TyWiredIn in.
• TupleTyCon: The StatusCC value of a tuple constructor T is ConvCC T; i.e., we don't need converted tuple type constructors. However, we need a suitable conversion functions isoT for all supported tuple types somewhere. Unfortunately, there are many tuple types, and hence, many conversion functions. An alternative might be to special case tuples during conversion generation and just inline the needed case construct.
• SynTyCon: Closure conversion operates on coreView; hence, we will see no synonyms. (Well, we may see synonym families, but will treat them as not convertible for the moment.)
• PrimTyCon: We essentially ignore primitive types during conversion. We assume their converted and unconverted form are identical, which implies that they never inhibit conversion and that they need no conversion constructors.
• CoercionTyCon and SuperKindTyCon: They don't categorise values and are ignored during conversion.

#### Converting data constructors

We convert data constructors by converting their argument types and their representation DataCon gets a new filed dcCC :: StatusCC DataCon. In particular, the signature of the worker is converted. However, in contrast to other functions, we only convert the argument and result types; the arrows tying them together are left intact. For example, if the original wrapper has the type signature

MkT :: (Int -> Int) -> Int


the converted wrapper is

MkT_CC :: (Int :-> Int) -> Int


As a consequence, whenever we convert a partial wrapper application in an expression, we need to introduce a closure on the spot.

We do not specially handle wrappers of data constructors. They are converted just like any other toplevel function.

#### Examples

For example, when we convert

data Int = I# Int#


the tyConCC field of Int is set to ConvCC Int and we have

isoInt :: Int :<->: Int
isoInt = toInt :<->: frInt
where
toInt (I# i#) = I# i#
frInt (I# i#) = I# i#


As another example, the tyConCC field of

data Maybe a = Nothing | Just a


has a value of ConvCC Maybe and we have

isoMaybe :: (a :<->: a_CC) -> (Maybe a :<->: Maybe a_CC)
isoMaybe isoa = toMaybe :<->: frMaybe
where
toMaybe isoa Nothing  = Nothing
toMaybe isoa (Just x) = Just (to isoa x)
frMaybe isoa Nothing  = Nothing
frMaybe isoa (Just x) = Just (fr isoa x)


### Converting classes and instances

We don't alter class and instance declarations in any way. However, the dictionary type constructors and dfuns are processed in the same way as other data types and value bindings, respectively; i.e., they get a StatusCC field and we generate converted versions and conversion constructors as usual.

As an example, assume Num Int were defined as

class Num a where
(+)    :: a -> a -> a
negate :: a -> a
instance Num Int where
negate = primNegateInt


with the Core code being

data Num a =
Num {
(+)    :: a -> a -> a,
negate :: a -> a
}
dNumInt = Num Int


Then, closure conversion gives us

data Num_CC a =
Num_CC {
(+_CC)    :: a :-> a :-> a,
negate_CC :: a :-> a
}
dNumInt_CC :: Num_CC Int  -- Int \equiv Int_CC
dNumInt_CC = Num_CC
(to isoIntToInt primNegateInt)
where
isoIntToIntToInt = isoArr isoInt isoIntToInt
isoIntToInt      = isoArr isoInt isoInt


### Converting type terms

We determine the converted type t^ of t as follows:

T^            = T_CC , if tyConCC T == ConvCC T_CC
= T    , otherwise
a^            = a_CC
(t1 -> t2)^   = t1 -> t2   , if kindOf t1 == #
or kindOf t2 == #
= t1^ :-> t2^, otherwise
(t1 t2)^      = t1^ t2^
(forall a.t)^ = forall a.t^


Here some examples,

(Int -> Int)^           = Int :-> Int
(forall a. [a] -> [a])^ = [a] :-> [a]
([Int -> Int] -> Int)^  = [Int :-> Int] :-> Int
(Int# -> Int# -> Int#)^ = Int# -> Int# -> Int#
((Int -> Int) -> Int#)^ = (Int -> Int) -> Int#
(Int -> Int -> Int#)^   = Int :-> (Int -> Int#)


Why do we use (t1 -> t2)^ = t1 -> t2 when either argument type is unboxed?

### Converting value bindings

#### Toplevel

When converting a toplevel binding for f :: t, we generate f_CC :: t^.

The alternatives GlobalId and LocalId of Var.Var get a new field idCC :: StatusCC Id whose values, for a declaration f, we determine as follows:

• If Id's declaration uses any features that we cannot (or currently, don't want to) convert, set idCC to NoCC.
• If all type constructors involved in f's type are marked NoCC or AsIsCC, we set f's idCC field to AsIsCC.
• Otherwise, convert f and set its ifCC field to ConvCC f_CC.

chak: revision front

### Converting core terms

Apart from the standard rules, we need to handle the following special cases:

• We come across a value variable v where idCC v == NoCC whose type is t: we generate convert t v (see below).
• We come across a case expression where the scrutinised type T has tyConCC T == NoCC: we leave the case expression as is (i.e., unconverted), but make sure that the idCC field of all variables bound by patterns in the alternatives have their idCC field as NoCC. (This implies that the previous case will kick in and convert the (unconverted) values obtained after decomposition.)
• Whenever we have an FC cast from or to a newtype T, where tyConCC T == NoCC, we need to add a convert tau or trevnoc tau, respectively. We can spot these casts by inspecting the kind of every coercion used in a cast. One side of the equality will have the newtype constructor.
• We come across a dfun: If its idCC field is NoCC, we keep the selection as is, but apply convert t e from it, where t is the type of the selected method and e the selection expression. If idCC is ConvCC d_CC, and the dfun's class is converted, d_CC is fully converted. If it's class is not converted, we also keep the selection unconverted, but have a bit less to do in convert t e. TODO This needs to be fully worked out.

### Generating conversions

Whenever we had convert t e above, where t is an unconverted type and e a converted expression, we need to generate some conversion code. This works roughly as follows in a type directed manner:

convert T          = id   , if tyConCC T == NoCC or AsIsCC
= to_T , otherwise
convert a          = id
convert (t1 t2)    = convert t1 (convert t2)
convert (t1 -> t2) = createClosure using (trevnoc t1)
and (convert t2) on argument and result resp.


where trevnoc is the same as convert, but using from_T instead of to_T.

The idea is that conversions for parametrised types are parametrised over conversions of their parameter types. Wherever we call a function using parametrised types, we will know these type parameters (and hence can use convert) to compute their conversions. This fits well, because it is at occurences of Ids that have idCC == NoCC where we have to perform conversion.

The only remaining problem is that a type parameter to a function may itself be a type parameter got from a calling function; so similar to classes, we need to pass conversion functions with every type parameter. So, maybe we want to stick fr and to into a class after all and requires that all functions used in converted contexts have the appropriate contexts in their signatures.

### TODO

#### Examples

Have an example with two modules one unconverted, where the converted imports the unconverted.

Also have an example that motivates why we have to vectorise/CC declarations such as Int.

#### Conversion functions

Similar to HasGenerics and instead of storing Id of conversion constructors, we can derive from the name of the TyCon.

#### Data constructors

How to exactly handle the worker and wrapper? Can we replace arrows by closure types in the worker? Or do we always have to add a wrapper?

Simpler''' Don't try to make a complete cloned data constructor. By the time of CC, its all just Core and so wrappers are just like any other global function.

#### Original functions

The previous story was that when vectorising f and generating f_CC, we now define

f :: tau
f = trevnoc tau f_CC


Now, with the approximate conversion scheme above, we may not have trevnoc tau. In this case, we still generate f_CC, but also leave the rhs of f alone (i.e., compile the original functions).

When we give up on converting a complete right-hand side, we still want to convert all subexpressions that we can convert.