Opened 11 months ago

Closed 7 months ago

#9371 closed bug (fixed)

Overlapping type families, segafult

Reported by: pingu Owned by: goldfire
Priority: normal Milestone: 7.8.4
Component: Compiler Version: 7.8.3
Keywords: Cc: goldfire, dimitris@…, sweirich@…, nomeata, Conor.McBride@…
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case: indexed-types/should_fail/T9371
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

Not entirely sure what's going on here. I don't think this should type check; it appears to segfault whilst calling show on the wrong type.

This is probably not the absolute minimum required to reproduce.

I have reproduced on 7.8.3 and 7.9.20140727.

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE OverlappingInstances #-}
import Data.Monoid

class C x where
    data D x :: *
    makeD :: D x

instance Monoid x => C x where
    data D x = D1 (Either x ())
    makeD = D1 (Left mempty)

instance (Monoid x, Monoid y) => C (x, y) where
    data D (x,y) = D2 (x,y)
    makeD = D2 (mempty, mempty)

instance Show x => Show (D x) where
    show (D1 x) = show x


main = print (makeD :: D (String, String))

This does not segfault if you add:

  instance (Show x, Show y) => Show (D (x,y)) where
      show (D2 x) = show x

Change History (13)

comment:1 Changed 11 months ago by goldfire

If we make the associated data instance an associated type instance, the module fails to compile. Perhaps it should fail too for data families.

comment:2 Changed 11 months ago by carter

main = print (makeD :: D (String, String))

this seems to be ambiguous wrt which D constructor is chosen, right? (at least wrt the instance of class C and thence makeD right?)

comment:3 Changed 11 months ago by simonpj

  • Cc goldfire dimitris@… sweirich@… nomeata Conor.McBride@… added

Great bug! I know exactly what is going on.

There is a long-standing awkwardness in GHC’s implementation of newtype, described in Note [Newtype eta] in TyCon, which I reproduce below. The core of it comes down to this. Consider

newtype Parser a = MkParser (IO a) derriving( Monad )

Do we have (Monad Parser ~R Monad IO)? These are the types of the Monad dictionaries for Parser and IO resp. In the olden days to implement Generalised Newtype Deriving (GND) we coerced directly between these dictionary types, so we really needed them to be ~R. As a result, the axiom induced by the above newtype had to be

	ax :: Parser ~R IO

and not

	ax a :: Parser a  ~R  IO a

So we had to eta-reduce axioms.

This is a royal pain. And we have to do it for newtype instances too. See Note [Eta reduction for data family axioms] in TcInstDcls. And it turns out that the new, serious bug #9371 is directly attributable to this royal pain. (Just to record how: FamInstEnv.lookup_fam_inst_env' goes to some trouble to match up arities; but the my_unify function passed to it by lookupFamInstEnvConflicts simply bypasses all this, and as a result misses the conflict.)

Side note: our Safe coercions” paper does not say in Section 2 how to deal with data/newtype instances.

NOW, in our new story for GND, we never coerce between Monad Parser and Monad IO. Instead we coerce the individual methods (Section 7 of the paper). Why? I think it’s because our roles are not higher-kinded, but we didn’t document the change. Does anyone else remember? I think there were actually multiple reasons.

However this change means that we DON’T need (IO ~R Parser) any more! Only (IO ty ~R Parser ty). So the reason for the eta reduction seems to have gone away.

Do you agree? If so I could simplify GHC by removing the complicated eta stuff. I would rather do that than fix #9371 by adding more complexity

I see that this was actually proposed in #8503, but the thread diverged onto other matters. Richard’s last comment:7 said “So we don't need eta-reduction as much as maybe we once did, but it's still useful.” but provided no evidence in support of this claim.

I somehow think this is connected to #9123 but I can’t put my finger on why.

Simon

Note [Newtype eta]
~~~~~~~~~~~~~~~~~~
Consider
        newtype Parser a = MkParser (IO a) derriving( Monad )
Are these two types equal (to Core)?
        Monad Parser
        Monad IO
which we need to make the derived instance for Monad Parser.

Well, yes.  But to see that easily we eta-reduce the RHS type of
Parser, in this case to ([], Froogle), so that even unsaturated applications
of Parser will work right.  This eta reduction is done when the type
constructor is built, and cached in NewTyCon.  The cached field is
only used in coreExpandTyCon_maybe.

Here's an example that I think showed up in practice
Source code:
        newtype T a = MkT [a]
        newtype Foo m = MkFoo (forall a. m a -> Int)

        w1 :: Foo []
        w1 = ...

        w2 :: Foo T
        w2 = MkFoo (\(MkT x) -> case w1 of MkFoo f -> f x)

After desugaring, and discarding the data constructors for the newtypes,
we get:
        w2 :: Foo T
        w2 = w1
And now Lint complains unless Foo T == Foo [], and that requires T==[]

This point carries over to the newtype coercion, because we need to
say
        w2 = w1 `cast` Foo CoT

so the coercion tycon CoT must have
        kind:    T ~ []
 and    arity:   0
Last edited 11 months ago by simonpj (previous) (diff)

comment:4 Changed 11 months ago by goldfire

At first reading, it seemed that comment:3 did not relate to this ticket. But, looking closer, I see how. I'll explain for others' benefit:

The line data D x = D1 (Either x ()) becomes somewhat like the two declarations

data DFInst1 x = D1 (Either x ())
type D x = DFInst1 x

We create a proper datatype DFInst1 that has no eta-reduction or anything strange, really, other than the fact that the user can't ever write its name. Data families are internally interpreted somewhat like type families, and thus we create the connection between D and DFInst1. However, as Simon explains, the second of my definitions above is eta-reduced.

When processing data D (x, y) = D2 (x, y), the relevant type-ish instance is D (x, y) = DFInst2 x y, which cannot be eta-reduced. Then, when comparing the instances D = DFInst1 and D (x, y) = DFInst2 x y, the conflict-checker says that the left-hand sides (<empty> and (x,y), respectively) are quite surely apart, because they are of different lengths!

There is a fix here that involves much less theory than Simon's approach: fix the conflict checker to take this into account. There should never be different lengths involved here (absent the eta-reduction), so when there are, we could just note a conflict. Even simpler, we could notice somehow that a data family is involved and spit out a conflict right away, as two data family instances are never compatible.

That said, Simon asks a worthwhile question: do we need eta-reduction given the new implementation of GND? I think "yes":

import Control.Monad.Trans.Identity

newtype MyList a = MkList [a]
class C m where
  x :: IdentityT m Int
instance C [] where
  x = IdentityT [1,2,3]
deriving instance C MyList

This (silly) example requires the eta-reduction to work, and I don't see a compelling reason to reduce expressivity here. Yes, it would simplify the code somewhat, but I still don't think it's worth it.

In answer to Simon's "why not cast Monad Parser to Monad IO": a lack of higher-kinded roles was a big part of the discussion, but there was another: superclasses. It's possible that the GND'd class has superclasses and that the superclass instances for the original type and the newtype are different. If we casted the whole instance, we would get the superclass instances from the original type, which would be quite strange. Indeed, before making any change related to roles, the old GND code also did not cast instances, because of this superclass issue.

comment:5 follow-up: Changed 11 months ago by simonpj

"Fix the conflict checker" is clearly an option. It's what I described as "adding more complexity".

Moreover, let us note that GHC's current behaviour is not documented in the user manual or in any paper. If you are arguing to stick with it, we should at least write it up! Any volunteers?

Moreover, doesn't your example work perfectly well without eta reduction? To typecheck the deriving instance we need

  IdentityT [] Int  ~R  IdentityT MyList Int

Now, using the Coercible instances in Fig 2 of the paper, we can reduce that to

  [Int] ~R  MyList Int

and that is true with the non-eta-reduced axiom.

If IdentityT's data constructor was not in scope, then indeed the program will still typecheck -- but arguably doing so exposes something about the representation of the newtype, harming abstraction. You certainly couldn't have written it be hand.

It all seems quite debatable to me. Do we really want a significant cluster of complexity in the implementation, to implement an un-documented feature, which no one is asking for, and whose very existence is debatable?

comment:6 in reply to: ↑ 5 Changed 11 months ago by goldfire

Replying to simonpj:

Moreover, doesn't your example work perfectly well without eta reduction? To typecheck the deriving instance we need

  IdentityT [] Int  ~R  IdentityT MyList Int

Now, using the Coercible instances in Fig 2 of the paper, we can reduce that to

  [Int] ~R  MyList Int

and that is true with the non-eta-reduced axiom.

If IdentityT's data constructor was not in scope, then indeed the program will still typecheck -- but arguably doing so exposes something about the representation of the newtype, harming abstraction. You certainly couldn't have written it be hand.

I hadn't considered the newtype-unwrapping axiom when writing the example. You're right -- we don't need eta reduction. But, if IdentityT were a datatype instead of a newtype, we would, without further changes to the example. The "harming abstraction" note is exactly the debate about whether or not we should be able to coerce Map String Int to Map String Age without Map's constructor being in scope.

It all seems quite debatable to me. Do we really want a significant cluster of complexity in the implementation, to implement an un-documented feature, which no one is asking for, and whose very existence is debatable?

It still makes me nervous to reduce expressiveness -- I just don't want users of 7.10 to suddenly lose something, post here that they need it back, and then have it restored for 7.12, leaving a lot of CPP in their code.

comment:7 Changed 11 months ago by simonpj

OK I agree that the same issue arises if IdentityT was a data type, so my point about newtype unwrapping for IdentityT was a bit of a red herring.

And yes I agree that this "silly example" is indeed a case where eta-reduction of newtype axioms (and data/newtype instance!) axioms is useful. And since the code is implemented, I suppose that the shortest path is to fix the conflict detector.

But it does hurt me that we have NO formalised description of the eta-reduction story in any of our papers.

Simon

comment:8 Changed 11 months ago by simonpj

  • Owner set to goldfire

So, to be concrete, Richard might you do this?

It's not clear that two equations conflict just because their LHSs differ. Eg

   F x = Maybe
   F x Int = Maybe Int

But it's certainly safe to say "there's a conflict", and in fact the lengths only differ for data families in which case each RHS is a fresh data type.

This is somewhat urgent because what we have now is actually unsound.

If you can't do it, yell, and I will.

Simon

comment:9 Changed 11 months ago by Richard Eisenberg <eir@…>

In a09508b792eed24fc4d8a363df2635026bfa2de6/ghc:

Test #9371 (indexed-types/should_fail/T9371)

comment:10 Changed 11 months ago by Richard Eisenberg <eir@…>

In f29bdfbcedda6cb33ab05d884c151f2b31f4e4e0/ghc:

Fix Trac #9371.

This was very simple: lists of different lengths are
*maybe* apart, not *surely* apart.

comment:11 Changed 11 months ago by goldfire

  • Milestone set to 7.8.4
  • Status changed from new to merge
  • Test Case set to indexed-types/should_fail/T9371

comment:12 Changed 9 months ago by thoughtpolice

  • Milestone changed from 7.8.4 to 7.10.1

Moving (in bulk) to 7.10.4

comment:13 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.8.4
  • Resolution set to fixed
  • Status changed from merge to closed

Merged to 7.8.4

Note: See TracTickets for help on using tickets.