Changes between Version 6 and Version 7 of Roles


Ignore:
Timestamp:
Dec 17, 2013 4:18:54 PM (21 months ago)
Author:
goldfire
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Roles

    v6 v7  
    11= Roles =
    22
    3 The idea of ''roles'' comes from the paper [http://www.seas.upenn.edu/~sweirich/papers/popl163af-weirich.pdf Generative Type Abstraction and Type-level Computation], published at POPL 2011. The implementation of roles in GHC, however, is somewhat different than stated in that paper. This page focuses on the user-visible features of roles. RolesImplementation talks about the implementation in GHC. See also [http://typesandkinds.wordpress.com/2013/08/15/roles-a-new-feature-of-ghc/ Richard's blog post about roles].
     3The idea of ''roles'' comes from the paper [http://www.seas.upenn.edu/~sweirich/papers/popl163af-weirich.pdf Generative Type Abstraction and Type-level Computation], published at POPL 2011. The implementation of roles in GHC, however, is somewhat different than stated in that paper. This page focuses on the user-visible features of roles. RolesImplementation talks about the implementation in GHC. See also [http://typesandkinds.wordpress.com/2013/08/15/roles-a-new-feature-of-ghc/ Richard's blog post about roles]. (Note: some aspects of that blog post are out of date, as of December 17, 2013.)
    44
    55== The problem we wish to solve ==
    66
    7 GHC has had a hole in its type system for several years, as documented in #1496, #4846, #5498, and #7148. The common cause behind all of this is the magic behind `-XGeneralizedNewtypeDeriving`. Here is an example:
     7GHC has had a hole in its type system for several years, as documented in #1496, #4846, #5498, and #7148. The common cause behind all of this is the magic behind `-XGeneralizedNewtypeDeriving` (GND). Here is an example:
    88
    99{{{
     
    2727A newtype is a new algebraic datatype that wraps up exactly one field (in our example, of type `Int`). Yet, the semantics of Haskell makes a guarantee that wrapping and unwrapping a value (with `MkAge` or `unAge`) has no runtime cost. Thus, internally, we must consider `Age` to be wholly equivalent to `Int`.
    2828
    29 The problem with this idea comes with type families. (There are other ways to tickle the bug, but one example is enough here.) A type family can branch on ''Haskell'' type, and of course, in Haskell (unlike in the internals of a compiler), `Age` is most certainly ''not'' `Int`. (If it were, newtypes would be useless for controlling instance selection, a very popular use case.) So, in our example, we see that `Inspect Age` is `Int`, but `Inspect Int` is `Bool`. Now, note the type of `bad`, the method in class `BadIdea`. When passed an `Int`, `bad` will return a `Bool`. When passed an `Age`, `bad` will return an `Int`. What happens on the last line above, when we use !GeneralizedNewtypeDeriving? Internally, we take the existing instance for `Int` and just transmogrify it into an instance for `Age`. But, this transformation is very dumb -- because `Age` and `Int` are the same, internally, the code for the `Age` instance and the code for the `Int` instance are the same. This means that when we call `bad (MkAge 5)`, we run `5` through the existing implementation for `bad`, which produces a `Bool`. But, of course, the type of `bad (MkAge 5)` is `Int`, and so we have effectively converted a `Bool` to an `Int`. Yuck.
     29The problem with this idea comes with type families. (There are other ways to tickle the bug, but one example is enough here.) A type family can branch on ''Haskell'' type, and of course, in Haskell (unlike in the internals of a compiler), `Age` is most certainly ''not'' `Int`. (If it were, newtypes would be useless for controlling instance selection, a very popular use case.) So, in our example, we see that `Inspect Age` is `Int`, but `Inspect Int` is `Bool`. Now, note the type of `bad`, the method in class `BadIdea`. When passed an `Int`, `bad` will return a `Bool`. When passed an `Age`, `bad` will return an `Int`. What happens on the last line above, when we use GND? Internally, we take the existing instance for `Int` and just transmogrify it into an instance for `Age`. But, this transformation is very dumb -- because `Age` and `Int` are the same, internally, the code for the `Age` instance and the code for the `Int` instance are the same. This means that when we call `bad (MkAge 5)`, we run `5` through the existing implementation for `bad`, which produces a `Bool`. But, of course, the type of `bad (MkAge 5)` is `Int`, and so we have effectively converted a `Bool` to an `Int`. Yuck.
    3030
    3131== The solution ==
     
    5959}}}
    6060
    61 is parametric in its first parameter but not its second. For reasons that will soon become clear, we say that a parametric type variable has role representational and a non-parametric one has role nominal.
     61is parametric in its first parameter but not its second. We say that a parametric type variable has a representational role and a non-parametric one has a nominal role.
    6262
    63 With this notion in place, we simply need the following rule: '''!GeneralizedNewtypeDeriving works only with classes whose last type parameter is at role representational.'''
     63== `Coercible` ==
     64
     65The libraries with GHC 7.8 offer a new class
     66
     67{{{
     68class Coercible a b where
     69  coerce :: a -> b
     70}}}
     71
     72The idea is that a `Coercible` instance exists allowing coercions between any two types that are representationally equal. A programmer can then use `coerce` to get between the types. The instances themselves are magically generated as necessary; it is not allowed for programmers to declare their own `Coercible` instances. So, we have `Coercible Age Int` but never `Coercible Bool Int`.
     73
     74The reason we need roles is to describe how these representational equalities (or, equivalently, `Coercible` instances) "lift" through other types. For example, is `[Age]` representationally equal to `[Int]`? Sure. But, is `GADT Age` representationally equal to `GADT Int`? I hope not!
     75
     76The rule is this: we have `instance Coercible a b => Coercible (T a) (T b)` if and only if the first parameter has a representational role. Thus, we have `instance Coercible a b => Coercible [a] [b]` but not `instance Coercible a b => Coercible (GADT a) (GADT b)`. This generalizes straightforwardly when there are multiple parameters, and it's worth noting that `Coercible` is always reflexive, even when nominal roles are involved.
     77
     78== !GeneralizedNewtypeDeriving implemented using `coerce` ==
     79
     80Now that we have all of this `Coercible` machinery, we can define the behavior of GND in terms of it -- we simply `coerce` each method of the derived class. For example:
     81
     82{{{
     83newtype RestrictedIO a = MkRIO { unRIO :: IO a }
     84  deriving Monad
     85}}}
     86
     87generates
     88
     89{{{
     90instance Monad RestrictedIO where
     91  return = coerce (return :: a -> IO a) :: forall a. a -> RestrictedIO a
     92  (>>=) = coerce ((>>=) :: IO a -> (a -> IO b) -> IO b) :: forall a b. RestrictedIO a -> (a -> RestrictedIO b) -> RestrictedIO b
     93  fail = coerce (fail :: String -> IO a) :: forall a. String -> RestrictedIO a
     94}}}
     95
     96Note that each of these is just a call to `coerce` over the method in the instance for the newtype's representation type (in this case, `IO a`). All those type annotations are necessary to make sure that the type checker does the right conversion (and that scoped type variables are bound appropriately).
     97
     98Putting all of this together, GND works exactly when each of the methods being derived is `Coercible` into the new type.
    6499
    65100== Phantom parameters ==
    66101
    67 It turns out that a third role is also useful (though unnecessary for type soundness): the phantom role. It is often the case that programmers use type variables simply to constrain the type checker, not to make any statement about the runtime representation of a type. For example `data Phant a = MkPhant Int`. Because `a` doesn't appear on the right-hand side, we say that `a` is at role phantom. Why is this nice? Because it allows us to say that, say, `Phant Int` and `Phant Bool` are representationally equal, because they really do have the same representation.
     102It turns out that a third role is also useful (though unnecessary for type soundness): the phantom role. It is often the case that programmers use type variables simply to constrain the type checker, not to make any statement about the runtime representation of a type. For example `data Phant a = MkPhant Int`. Because `a` doesn't appear on the right-hand side, we say that `a` is at role phantom. Why is this nice? Because it allows us to say that, say, `Phant Int` and `Phant Bool` are representationally equal, because they really do have the same representation. Thus, there would be `instance Coercible (Phant a) (Phant b)` for any `a` and `b`.
    68103
    69104== Role inference ==
    70105
    71106How do we know what role a type parameter should have? We use role inference! We start with a few base facts: `(->)` has two representational parameters; `(~)` has two nominal parameters; and all type families' parameters are nominal. Then, we just propagate the information. By defaulting parameters to role phantom, any parameters unused in the right-hand side (or used only in other types in phantom positions) will be phantom. Whenever a parameter is used in a representational position (that is, used as a type argument to a constructor whose corresponding variable is at role representational), we raise its role from phantom to representational. Similarly, when a parameter is used in a nominal position, its role is upgraded to nominal. We never downgrade a role from nominal to phantom or representational, or from representational to phantom. In this way, we infer the most-general role for each parameter.
     107
     108The exception to the above algorithm is for classes: all parameters for a class default to a nominal role. This is because we generally consider, say, `Ord Age` and `Ord Int` to be quite distinct, even if their representation is the same under the hood. Changing the behavior of type classes is a major use case for newtypes, and we wouldn't want to subvert that!
    72109
    73110== Role annotations ==
     
    97134The other place where role annotations may be necessary are in .hs-boot files, where the right-hand sides of definitions can be omitted. As usual, the types/classes declared in an .hs-boot file must match up with the definitions in the .hs file, including down to the roles. The default role is representational in hs-boot files, corresponding to the common use case. Note that this '''will break code'''. But, the change is necessary to close the type-safety hole discussed above.
    98135
    99 Role annotations will be allowed on type variables in `data`, `newtype`, and `class`, declarations. They will not be allowed on type/data family declarations or in explicit `forall`s in function type signatures.
     136Role annotations are allowed on type variables in `data`, `newtype`, and `class`, declarations. They will not be allowed on type/data family declarations or in explicit `forall`s in function type signatures.
    100137
    101138== Roles and `Traversable` ==
     
    110147According to the rules for roles, the parameter `t` must be at role nominal, as it is used as a parameter to the type variable `f`. We must account for the possibility that `f` will be instantiated with a type whose last parameter is at role nominal, so we force `t` to be at role nominal as well.
    111148
    112 This means that !GeneralizedNewtypeDeriving (hereafter "GND") no longer works with Traversable. But, !DeriveTraversable ''does'' still work. However, GHC previously preferred using GND over !DeriveTraversable when both were available, which assumedly produced the same code. How is this all possible? If GND doesn't work anymore, is there something wrong with !DeriveTraversable? The answer is that GND and !DeriveTraversable make ''different'' instances, contrary to expectations. The reason is that !DeriveTraversable has to use `fmap` to cast the result of `traverse` from the representation type back to the newtype. According to the Functor laws, `fmap`ping this cast should be a no-op (the law of interest is `fmap id == id`). But, if that law is not obeyed, `fmap`ping the cast may change the result of the `traverse`. Contrast this with a GND instance, which magically casts the result, without using `fmap`. If the Functor law is not obeyed, these two instances have different behavior.
     149This means that GND no longer works with Traversable. But, !DeriveTraversable ''does'' still work. However, GHC previously preferred using GND over !DeriveTraversable when both were available, which assumedly produced the same code. How is this all possible? If GND doesn't work anymore, is there something wrong with !DeriveTraversable? The answer is that GND and !DeriveTraversable make ''different'' instances, contrary to expectations. The reason is that !DeriveTraversable has to use `fmap` to cast the result of `traverse` from the representation type back to the newtype. According to the Functor laws, `fmap`ping this cast should be a no-op (the law of interest is `fmap id == id`). But, if that law is not obeyed, `fmap`ping the cast may change the result of the `traverse`. Contrast this with a GND instance, which magically casts the result, without using `fmap`. If the Functor law is not obeyed, these two instances have different behavior.
    113150
    114151Despite this, I believe that using GND with `Traversable` is indeed type-safe. Why? Because of the parametricity guaranteed in `Functor` and `Applicative`. The reason GND is prohibited with `Traversable` is that we are worried `f`'s last parameter will be at role nominal. While it is possible to write `Functor` and `Applicative` instances for such a type, the methods of those classes can't really use the any constructors that force the role to be nominal. For example, consider this: