Opened 20 months ago

Last modified 8 days ago

#13408 patch feature request

Consider inferring a higher-rank kind for type synonyms

Reported by: goldfire Owned by:
Priority: normal Milestone: 8.8.1
Component: Compiler Version: 8.0.1
Keywords: TypeInType Cc: RyanGlScott
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D5301
Wiki Page:

Description

In terms, a definition comprising one non-recursive equation may have a higher-rank type inferred. For example:

f :: (forall a. a -> a -> a) -> Int
f z = z 0 1

g = f

g gets an inferred type equal to f's. However, this fails at the type level:

type F (z :: forall a. a -> a -> a) = z 0 1

type G = F

If anything is strange here, it's that the term-level definition is accepted. GHC should not be in the business of inferring higher-rank types. But there is an exception for definitions comprising one non-recursive equation.

This ticket proposes expanding this behavior to the type level, allowing G to be accepted.

This is spun off from #13399, but is not tightly coupled to that ticket.

Change History (9)

comment:1 Changed 17 months ago by RyanGlScott

Cc: RyanGlScott added

comment:2 Changed 15 months ago by goldfire

Resolution: wontfix
Status: newclosed

Actually, this is all bogus. In type G = F, F is undersaturated, which is disallowed. And once we write type G z = F z, when we're inferring a higher-rank type for z, we've gone further off the map than I'm comfortable with.

Additionally, getting this right would require tracking recursiveness in type declarations more than we do already, and it's not worth the pain.

So I'm closing as wontfix. Anyone who wants this can always add a kind signature.

comment:3 Changed 2 weeks ago by RyanGlScott

Resolution: wontfix
Status: closednew

I talked with goldfire today about this issue, but in a slightly different context. The issue this time was that I had:

data A :: Type -> Type
data B a :: A a -> Type

And I wanted to define a type synonym alias for B:

type C = B

This time, we came to the opposite conclusion of comment:2 — that is, GHC should be able to accept this. There are two main reasons why:

  • Unlike in the original example, in which the RHS of a type synonym was attempting to partially apply a type synonym (which is problematic), there is no problem with partially applying a data type constructor such as B.
  • The implementation might be easier than what goldfire had originally thought when writing comment:2. In the code that kind-checks type synonym declarations, we would need to check that the type synonym is the sole declaration in a mutually recursive group. (That is to say, no funny business with recursion is going on.) If this holds true, then we could grab the kind from the RHS (even if it's fancy, like the kind of B :: forall a -> A a -> Type) and use that to infer the kind of the whole type synonym.

Having this work is especially important in the context of this particular example since there is currently no way to write the kind forall a -> A a -> Type in source Haskell. (If there were a way to write this, then type C = (B :: forall a -> A a -> Type) would already be a suitable workaround.)

goldfire, please correct me if I got any details wrong here.

comment:4 Changed 2 weeks ago by goldfire

You got it. I do think a little twiddling in kcTyClGroup will get you what you want.

comment:5 Changed 2 weeks ago by RyanGlScott

To be clear, I have no idea what "a little twiddling" constitutes in the slightest. The only clues I have are:

  • The code around Note [Single function non-recursive binding special-case] looks relevant:
                             -- Single function binding,
  | NonRecursive <- is_rec   -- ...binder isn't mentioned in RHS
  , Nothing <- sig_fn name   -- ...with no type signature
  =     -- Note [Single function non-recursive binding special-case]
        -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        -- In this very special case we infer the type of the
        -- right hand side first (it may have a higher-rank type)
        -- and *then* make the monomorphic Id for the LHS
        -- e.g.         f = \(x::forall a. a->a) -> <body>
        --      We want to infer a higher-rank type for f
    setSrcSpan b_loc    $
    do  { ((co_fn, matches'), rhs_ty)
            <- tcInferInst $ \ exp_ty ->
                  -- tcInferInst: see TcUnify,
                  -- Note [Deep instantiation of InferResult]
               tcExtendBinderStack [TcIdBndr_ExpType name exp_ty NotTopLevel] $
                  -- We extend the error context even for a non-recursive
                  -- function so that in type error messages we show the
                  -- type of the thing whose rhs we are type checking
               tcMatchesFun (L nm_loc name) matches exp_ty

        ; ... }

I'm guessing that the type that you get from tcInferInst is important here.

  • This appears to be the relevant code for kind-checking type synonyms:
getInitialKind decl@(SynDecl { tcdLName = L _ name
                             , tcdTyVars = ktvs
                             , tcdRhs = rhs })
  = do  { tycon <- kcLHsQTyVars name TypeSynonymFlavour (hsDeclHasCusk decl)
                                ktvs $
            case kind_annotation rhs of
              Nothing -> newMetaKindVar
              Just ksig -> tcLHsKindSig (TySynKindCtxt name) ksig
        ; return [tycon] }
  where
    -- Keep this synchronized with 'hsDeclHasCusk'.
    kind_annotation (L _ ty) = case ty of
        HsParTy _ lty     -> kind_annotation lty
        HsKindSig _ _ k   -> Just k
        _                 -> Nothing

If I pass in the [LTyClDecl GhcRn] argument from kcTyClGroup, I can figure out if that type synonym is part of a recursive group or not. Similarly, if kind_annotation rhs returns Nothing, then I know that we don't have a CUSK. If those two things simultaneously hold true, then I'm in a situation where we should infer the overall kind.

...except that I'm not sure at all how to go about doing so. Several mysteries have presented itself:

  • Where exactly should tcInferInst go into this? Does the type that it give you in its callback refer to the overall kind of the type synonym? Just the return kind? Something else?
  • Once I have access to the type in tcInferInst, what do I even do with it? Do I pass it to tcLHsKindSig? Is there another function that's better suited to this situation?
  • tcInferInst also returns a type. What should I do with that?
  • Is the tcExtendBinderStack stuff from the term-level code relevant to this situation?

Without some kind of direction, I'm very under-qualified to fix this, I'm afraid.

comment:6 Changed 11 days ago by goldfire

It's just a little twiddling! How hard could it be? :)

You ask good questions. But your first guesses are off:

  • tcInferInst is the right notional idea, but not a function you would actually use here.
  • getInitialKind looks at the "left-hand side" of a type declaration and invents a skeleton for the type's kind. (Example: from data T a (b :: Type -> Type) c, it guesses T :: kappa1 -> (Type -> Type) -> kappa2 -> Type for fresh unification variables kappa1 and kappa2.) Because you don't need to guess the kind, it's not what you want here.

What you need to do is to bring the type variables on the LHS into scope (with either guessed or provided kinds), kind-check the RHS, and then say that the LHS's kind is the same as the RHS's. Here is some direction to how to do so:

  • Have a special case in kcTyClGroup that detects a single type synonym. Type synonyms cannot be recursive, so we don't have to worry about that. (Or maybe we do, because we detect recursion in type synonyms later, perhaps? If that's the case, the renamer may want to either reject recursive type synonyms or remember the recursiveness using a new field in the AST.)
  • kcLHsQTyVars brings the LHS tyvars into scope, with appropriate kinds. Claim that the type does not have a CUSK. (Even if it does have a CUSK, CUSKness can matter only with recursion.)
  • tcLHsType does a nice job of checking the type itself -- but you'll need only the kind here, not the type.
  • You'll need to skolemise and quantify, much like the main branch of kcTyClGroup.

Does that give enough guidance?

I do think that, actually, we can do better here. The plan above would still have GHC take two passes over these special type synonyms, when really we can do it in one pass. But the infrastructure isn't set up for that, so it would take a more significant refactoring. Still I think the plan above is an improvement over the status quo.

comment:7 Changed 10 days ago by RyanGlScott

Thanks, goldfire. With the push in comment:6, I was able to barf out this code:

  • compiler/typecheck/TcTyClsDecls.hs

    diff --git a/compiler/typecheck/TcTyClsDecls.hs b/compiler/typecheck/TcTyClsDecls.hs
    index 5725cfd..438884b 100644
    a b kcTyClGroup :: [LTyClDecl GhcRn] -> TcM [TcTyCon] 
    483483-- Third return value is Nothing if the tycon be unsaturated; otherwise,
    484484-- the arity
    485485kcTyClGroup decls
     486  | [ldecl@(L _ decl@(SynDecl { tcdLName = L _ name
     487                              , tcdTyVars = ktvs
     488                              , tcdRhs = rhs }))] <- decls
     489  , not (hsDeclHasCusk decl)
     490  = do tycon <- kcLHsQTyVars name TypeSynonymFlavour False ktvs $
     491                snd <$> tcLHsType rhs
     492       solveEqualities $ tcExtendKindEnvWithTyCons [tycon] $ kcLTyClDecl ldecl
     493       candidates <- gather1 tycon
     494       _ <- quantifyTyVars emptyVarSet candidates
     495       poly_tycon <- generalise tycon
     496       return [poly_tycon]
     497
     498  | otherwise
    486499  = do  { mod <- getModule
    487500        ; traceTc "---- kcTyClGroup ---- {" (text "module" <+> ppr mod $$ vcat (map ppr decls))

Surprisingly, this actually seems to do the trick—type C = B actually typechecks with this! The code is a little messy in its current state, but it appears to get the job done. Surely this must be too good to be true?

Well, it turns out it is too good to be true. With this patch, there are three test cases whose error messages degrade in quality. They are:

  • ghc-e-fail2:
  • ghc-e/should_fail/ghc-e-fail2.run/ghc-e-fail2.

    diff -uw "ghc-e/should_fail/ghc-e-fail2.run/ghc-e-fail2.stderr.normalised" "ghc-e/should_fail/ghc-e-fail2.run/ghc-e-fail2.run.stderr.normalised"
    old new  
    11
    2 <interactive>:0:1:
    3     Cycle in type synonym declarations:
    4       <interactive>:0:1-10: type A = A
     2<interactive>:0:10:
     3     Type constructor ‘A’ cannot be used here
     4        (it is defined and used in the same recursive group)
     5     In the type ‘A’

I think the reason this happens is because we're now calling tcLHsType before the code that perform the type-synonym cycle detection, and tcLHsType eventually invoked promotionErr when it sees A on the RHS (resulting in the "same recursive group" message).

I think calling tcExtendKindEnvWithTyCons before tcLHsType would avoid promotionErr being invoked, but the problem is that the tycon that we need to pass tcExtendKindEnvWithTyCons is returned by kcLHsQTyVars ... (tcLHsType rhs) itself! So I'm stuck in a Catch-22 situation.

  • T7433 and T10451:
  • polykinds/T7433.run/T7433.

    diff -uw "polykinds/T7433.run/T7433.stderr.normalised" "polykinds/T7433.run/T7433.comp.stderr.normalised"
    old new  
    33     Data constructor ‘Z’ cannot be used here
    44        (perhaps you intended to use DataKinds)
    55     In the type ‘ 'Z’
    6       In the type declaration for ‘T’
  • polykinds/T10451.run/T10451.

    diff -uw "polykinds/T10451.run/T10451.stderr.normalised" "polykinds/T10451.run/T10451.comp.stderr.normalised"
    old new  
    22T10451.hs:22:12:
    33    Constraint tuple arity too large: 64 (max arity = 62)
    44      Instead, use a nested tuple
    5     In the type ‘(Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
     5     In the type ‘(Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
    66                  Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
    77                  Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
    88                  Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
    99                  Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
    10                   Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a)’
    11     In the type declaration for ‘T
     10                    Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a, Eq a,
     11                    Eq a)

I'm not sure yet what is causing this.

comment:8 Changed 10 days ago by goldfire

A few remarks about the code:

  • Put the solveEqualities around the first line, as the tcLHsType can emit constraints, and these will need to be solved.
  • Remove the second line. We don't need to run kcTyClDecl here. kcTyClDecl's purpose in life is to generate constraints on the unification variables generates in getInitialKind. But we're not calling getInitialKind, so no need for kcTyClDecl.
  • Generalization looks about right.

About the changed error messages:

  • The first will be hard to fix, for the reasons you describe. You could probably put some variant of APromotionErr in the env't to get a better message here. No, you can't just put the tycon in the env't, because you're making the tycon. Don't even think about fixM here.
  • The others just want for a addTyConCtxt around the call to generalise, which does a wee bit of validity checking.

Otherwise, looks good!

comment:9 Changed 8 days ago by RyanGlScott

Differential Rev(s): Phab:D5301
Milestone: 8.8.1
Status: newpatch

Alright, I've given it my best shot at Phab:D5301.

Note: See TracTickets for help on using tickets.