Version 3 (modified by goldfire, 3 years ago) (diff)


Update to Roles

It has become (somewhat) clear that the Roles mechanism as implemented in GHC 7.8 is insufficient. (See examples below.) This page is dedicated to creating a new design for roles that might fix the problems.

Problem examples

We have three known examples of where current roles are failing us.

Adding join to Monad

As part of the Applicative-Monad Proposal, we wish to add join to Monad, thus:

class Applicative m => Monad m where
  join :: forall a. m (m a) -> m a

This is all well and good, and would work with GeneralizedNewtypeDeriving (GND) most of the time. But, consider this:

newtype T m a = T (m a)
  deriving (Functor, Applicative, Monad)

readRef :: MonadIO m => IORef a -> T m a
readRef = T . liftIO . readIORef

The designer of T will export it abstractly, allowing only the reading of IORefs in the T monad but not other operations.

Sadly, the use of GND fails here, with join in the Monad class. Here is the relevant generated code:

instance Monad m => Monad (T m) where
  join = coerce (join :: m (m a) -> m a) :: forall a. T m (T m a) -> T m a

Thus, GHC must derive Coercible (m (m a) -> m a) (T m (T m a) -> T m a). This requirement reduces (in part) to Coercible (m (m a)) (T m (T m a)). We try to solve this by applying a newtype-unwrapping instance, Coercible x (m a) => Coercible x (T m a). Then, we must solve Coercible (m (m a)) (m (T m a)). And here, we are stuck. We do know Coercible (m a) (T m a), but we don't know the role of m's parameter, and must assume (for safety) that it could be nominal. Thus, we can't solve for Coercible (m (m a)) (m (T m a)).

This problem would occur if join were in Monad and a programmer used GND on any monad transformer. This is a common enough idiom to make us want to fix the situation.

The MVector class

A redesign of the vector package is underway, introducing the Vector class, among other changes. Here is the offending method:

class (...) => Vector v a where
  basicUnsafeIndexM :: Monad m => v a -> Int -> m a

Here, a is the type of the thing stored in the vector, and it is natural to want to coerce a vector of Ints to a vector of Ages. But, GND would not work here, for very similar reasons to the case above -- we won't be able to coerce m Int to m Age, because we don't know enough about m.

The acme-schoenfinkel package

This next example is the one known case of type-safe code that existed before GHC 7.8 that does not work with GHC 7.8's roles. The package acme-schoenfinkel-0.1.1 package (by Ertugrul Söylemez) defines

    WrappedSchoenfinkel {
      unwrapSchoenfinkel :: cat a b
    deriving (Alternative, Applicative, Arrow, ArrowApply,
              ArrowChoice, ArrowLoop, ArrowPlus, ArrowZero,
              Category, Functor)

GND fails on ArrowApply:

class Arrow a => ArrowApply (a :: * -> * -> *) where
  app :: forall b c. a (a b c, b) c

The problem here echoes the join problem quite closely.

The best known solution

Edward Kmett initially described an approach in an email, roughly as follows (names subject to bikeshedding, as always):

Currently, Data.Type.Coercion makes this definition:

data Coercion a b where
  Coercion :: Coercible a b => Coercion a b

Building on that, define a class Rep like this:

class Rep (f :: k1 -> k2) where
  co :: Coercible a b => Coercion (f a) (f b)

The idea is that a type is in the Rep class if its next parameter is representational. Thus, we would have instances Rep Maybe, Rep [], Rep Either, Rep (Either a), etc. We would not have Rep G, where G is a GADT.

Using this, we can define GND over join thus (continuing the example above):

instance (Rep m, Monad m) => Monad (T m) where
  join   = case co :: Coercion (m (m a)) (m (T m a)) of
             Coercion -> coerce (join :: m (m a) -> m a)
           :: forall a. T m (T m a) -> T m a

This compiles without difficulty.

Of course, we need to bake this reasoning into the compiler and the existing Coercible solver, essentially with a Coercible "instance" like

instance (Rep f, Coercible a b) => Coercible (f a) (f b)

We also would want automatic generation of instances of Rep, not unlike the generation of instances for Coercible.

Open user-facing design questions

  1. How far should instance generation go? For example, for the type
newtype ReaderT r m a = ReaderT (r -> m a)

the instance

instance Rep m => Rep (ReaderT r m)

can be written. Should this be inferred? I (Richard) can imagine a beefed up role inference algorithm which could figure this out. But, perhaps there exist harder cases that would not be inferrable.

  1. Should users be able to write instances for Rep by hand? They cannot do so for Coercible.
  1. What should the method in Rep be? Coercion a b -> Coercion (f a) (f b), Coercible a b => Coercion (f a) (f b), and Coercible a b => Coercible (f a) (f b) (definable internally) are all contenders.
  1. Should we do something analogous for phantom roles?

Open implementation questions

  1. Currently, all coercions (including representational ones) are unboxed (and thus take up exactly 0 bits) in a running program. (We ignore -fdefer-type-errors here.) But, Core has no way of expressing functions in the coercion language, and the co method above essentially desugars to a coercion function. Either we have to add functions to the language of coercions, or we have to keep the coercions generated by Rep instances boxed at runtime, taking of the space of a pointer and potentially an unevaluated thunk.
  1. The ReaderT example defined ReaderT as a newtype. The Rep instance shown is indeed writable by hand, right now. But, if ReaderT were defined as a data type, the Rep instance would be impossible to write, as there are no newtype-unwrapping instances. It seems a new form of axiom would be necessary to implement this trick for data types. This axiom would have to be produced at the data type definition, much like how newtype axioms are produced with newtype definitions.
  1. The Coercible solver is getting somewhat involved already (#9117, #9131)