Overlapping type families, segafult

changed weight to 5

If we make the associated data instance an associated type instance, the module fails to compile. Perhaps it should fail too for data families.

main = print (makeD :: D (String, String))

this seems to be ambiguous wrt which D constructor is chosen, right? (at least wrt the instance of class C and thence makeD right?)

Great bug! I know exactly what is going on.

There is a long-standing awkwardness in GHC’s implementation of newtype, described in Note [Newtype eta] in TyCon, which I reproduce below. The core of it comes down to this. Consider

newtype Parser a = MkParser (IO a) derriving( Monad )

Do we have (Monad Parser ~R Monad IO)? These are the types of the Monad dictionaries for Parser and IO resp. In the olden days to implement Generalised Newtype Deriving (GND) we coerced directly between these dictionary types, so we really needed them to be ~R. As a result, the axiom induced by the above newtype had to be

	ax :: Parser ~R IO

and not

	ax a :: Parser a  ~R  IO a

So we had to eta-reduce axioms.

This is a royal pain. And we have to do it for newtype instances too. See Note [Eta reduction for data family axioms] in TcInstDcls. And it turns out that the new, serious bug #9371 (closed) is directly attributable to this royal pain. (Just to record how: FamInstEnv.lookup_fam_inst_env' goes to some trouble to match up arities; but the my_unify function passed to it by lookupFamInstEnvConflicts simply bypasses all this, and as a result misses the conflict.)

Side note: our Safe coercions” paper does not say in Section 2 how to deal with data/newtype instances.

NOW, in our new story for GND, we never coerce between Monad Parser and Monad IO. Instead we coerce the individual methods (Section 7 of the paper). Why? I think it’s because our roles are not higher-kinded, but we didn’t document the change. Does anyone else remember? I think there were actually multiple reasons.

However this change means that we DON’T need (IO ~R Parser) any more! Only (IO ty ~R Parser ty). So the reason for the eta reduction seems to have gone away.

Do you agree? If so I could simplify GHC by removing the complicated eta stuff. I would rather do that than fix #9371 (closed) by adding more complexity

I see that this was actually proposed in #8503 (closed), but the thread diverged onto other matters. Richard’s last #9371 (closed) said “So we don't need eta-reduction as much as maybe we once did, but it's still useful.” but provided no evidence in support of this claim.

I somehow think this is connected to #9123 (closed) but I can’t put my finger on why.

Simon

Note [Newtype eta]
~~~~~~~~~~~~~~~~~~
Consider
        newtype Parser a = MkParser (IO a) derriving( Monad )
Are these two types equal (to Core)?
        Monad Parser
        Monad IO
which we need to make the derived instance for Monad Parser.

Well, yes.  But to see that easily we eta-reduce the RHS type of
Parser, in this case to ([], Froogle), so that even unsaturated applications
of Parser will work right.  This eta reduction is done when the type
constructor is built, and cached in NewTyCon.  The cached field is
only used in coreExpandTyCon_maybe.

Here's an example that I think showed up in practice
Source code:
        newtype T a = MkT [a]
        newtype Foo m = MkFoo (forall a. m a -> Int)

        w1 :: Foo []
        w1 = ...

        w2 :: Foo T
        w2 = MkFoo (\(MkT x) -> case w1 of MkFoo f -> f x)

After desugaring, and discarding the data constructors for the newtypes,
we get:
        w2 :: Foo T
        w2 = w1
And now Lint complains unless Foo T == Foo [], and that requires T==[]

This point carries over to the newtype coercion, because we need to
say
        w2 = w1 `cast` Foo CoT

so the coercion tycon CoT must have
        kind:    T ~ []
 and    arity:   0

Trac metadata

Trac field	Value
CC	→ Conor.McBride@cis.strath.ac.uk, dimitris@microsoft.com, goldfire, nomeata, sweirich@cs.upenn.edu

At first reading, it seemed that ticket:9371#comment:86449 did not relate to this ticket. But, looking closer, I see how. I'll explain for others' benefit:

The line data D x = D1 (Either x ()) becomes somewhat like the two declarations

data DFInst1 x = D1 (Either x ())
type D x = DFInst1 x

We create a proper datatype DFInst1 that has no eta-reduction or anything strange, really, other than the fact that the user can't ever write its name. Data families are internally interpreted somewhat like type families, and thus we create the connection between D and DFInst1. However, as Simon explains, the second of my definitions above is eta-reduced.

When processing data D (x, y) = D2 (x, y), the relevant type-ish instance is D (x, y) = DFInst2 x y, which cannot be eta-reduced. Then, when comparing the instances D = DFInst1 and D (x, y) = DFInst2 x y, the conflict-checker says that the left-hand sides (<empty> and (x,y), respectively) are quite surely apart, because they are of different lengths!

There is a fix here that involves much less theory than Simon's approach: fix the conflict checker to take this into account. There should never be different lengths involved here (absent the eta-reduction), so when there are, we could just note a conflict. Even simpler, we could notice somehow that a data family is involved and spit out a conflict right away, as two data family instances are never compatible.

That said, Simon asks a worthwhile question: do we need eta-reduction given the new implementation of GND? I think "yes":

import Control.Monad.Trans.Identity

newtype MyList a = MkList [a]
class C m where
  x :: IdentityT m Int
instance C [] where
  x = IdentityT [1,2,3]
deriving instance C MyList

This (silly) example requires the eta-reduction to work, and I don't see a compelling reason to reduce expressivity here. Yes, it would simplify the code somewhat, but I still don't think it's worth it.

In answer to Simon's "why not cast Monad Parser to Monad IO": a lack of higher-kinded roles was a big part of the discussion, but there was another: superclasses. It's possible that the GND'd class has superclasses and that the superclass instances for the original type and the newtype are different. If we casted the whole instance, we would get the superclass instances from the original type, which would be quite strange. Indeed, before making any change related to roles, the old GND code also did not cast instances, because of this superclass issue.

"Fix the conflict checker" is clearly an option. It's what I described as "adding more complexity".

Moreover, let us note that GHC's current behaviour is not documented in the user manual or in any paper. If you are arguing to stick with it, we should at least write it up! Any volunteers?

Moreover, doesn't your example work perfectly well without eta reduction? To typecheck the deriving instance we need

  IdentityT [] Int  ~R  IdentityT MyList Int

Now, using the Coercible instances in Fig 2 of the paper, we can reduce that to

  [Int] ~R  MyList Int

and that is true with the non-eta-reduced axiom.

If IdentityT's data constructor was not in scope, then indeed the program will still typecheck -- but arguably doing so exposes something about the representation of the newtype, harming abstraction. You certainly couldn't have written it be hand.

It all seems quite debatable to me. Do we really want a significant cluster of complexity in the implementation, to implement an un-documented feature, which no one is asking for, and whose very existence is debatable?

Replying to [ticket:9371#comment:86455 simonpj]:

Moreover, doesn't your example work perfectly well without eta reduction? To typecheck the deriving instance we need
IdentityT [] Int  ~R  IdentityT MyList Int
Now, using the Coercible instances in Fig 2 of the paper, we can reduce that to
[Int] ~R  MyList Int
and that is true with the non-eta-reduced axiom.

If IdentityT's data constructor was not in scope, then indeed the program will still typecheck -- but arguably doing so exposes something about the representation of the newtype, harming abstraction. You certainly couldn't have written it be hand.

I hadn't considered the newtype-unwrapping axiom when writing the example. You're right -- we don't need eta reduction. But, if IdentityT were a datatype instead of a newtype, we would, without further changes to the example. The "harming abstraction" note is exactly the debate about whether or not we should be able to coerce Map String Int to Map String Age without Map's constructor being in scope.

It all seems quite debatable to me. Do we really want a significant cluster of complexity in the implementation, to implement an un-documented feature, which no one is asking for, and whose very existence is debatable?

It still makes me nervous to reduce expressiveness -- I just don't want users of 7.10 to suddenly lose something, post here that they need it back, and then have it restored for 7.12, leaving a lot of CPP in their code.

OK I agree that the same issue arises if IdentityT was a data type, so my point about newtype unwrapping for IdentityT was a bit of a red herring.

And yes I agree that this "silly example" is indeed a case where eta-reduction of newtype axioms (and data/newtype instance!) axioms is useful. And since the code is implemented, I suppose that the shortest path is to fix the conflict detector.

But it does hurt me that we have NO formalised description of the eta-reduction story in any of our papers.

Simon

assigned to @rae

So, to be concrete, Richard might you do this?

It's not clear that two equations conflict just because their LHSs differ. Eg

   F x = Maybe
   F x Int = Maybe Int

But it's certainly safe to say "there's a conflict", and in fact the lengths only differ for data families in which case each RHS is a fresh data type.

This is somewhat urgent because what we have now is actually unsound.

If you can't do it, yell, and I will.

Simon

mentioned in commit f29bdfbc

mentioned in commit a09508b7

changed milestone to %7.8.4

Trac metadata

Trac field	Value
Test case	→ indexed-types/should_fail/T9371

added 1 deleted label

changed milestone to %7.10.1

Moving (in bulk) to 7.10.4

closed

removed 1 deleted label

changed milestone to %7.8.4

Merged to 7.8.4

Trac metadata

Trac field	Value
Resolution	Unresolved → ResolvedFixed

mentioned in issue #15704 (closed)

added runtime crash label

added Pnormal label

Trac field	Value
Version	7.8.3
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture

Overlapping type families, segafult

Child items ...

Activity

Overlapping type families, segafult

Relates to

Activity