Opened 3 months ago

Closed 3 months ago

## #13848 closed bug (duplicate)

# Unexpected order of variable quantification with GADT constructor

Reported by: | RyanGlScott | Owned by: | |
---|---|---|---|

Priority: | normal | Milestone: | |

Component: | Compiler (Type checker) | Version: | 8.0.1 |

Keywords: | TypeApplications | Cc: | |

Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |

Type of failure: | None/Unknown | Test Case: | |

Blocked By: | Blocking: | ||

Related Tickets: | #11721 | Differential Rev(s): | |

Wiki Page: |

### Description

Here's some code:

{-# LANGUAGE DataKinds #-} {-# LANGUAGE ExistentialQuantification #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE KindSignatures #-} {-# LANGUAGE ScopedTypeVariables #-} module Foo where data N = Z | S N data Vec (n :: N) a where VNil :: forall a. Vec Z a VCons :: forall n a. a -> Vec n a -> Vec (S n) a

I want to use `TypeApplications`

on `VCons`

. I tried doing so in GHCi:

GHCi, version 8.2.0.20170523: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/rgscott/.ghci [1 of 1] Compiling Foo ( Foo.hs, interpreted ) Ok, modules loaded: Foo. λ> :set -XTypeApplications -XDataKinds λ> :t VCons @Z @Int 1 VNil <interactive>:1:8: error: • Expected a type, but ‘'Z’ has kind ‘N’ • In the type ‘Z’ In the expression: VCons @Z @Int 1 VNil <interactive>:1:11: error: • Expected kind ‘N’, but ‘Int’ has kind ‘*’ • In the type ‘Int’ In the expression: VCons @Z @Int 1 VNil <interactive>:1:17: error: • Couldn't match type ‘'Z’ with ‘Int’ Expected type: Vec Int 'Z Actual type: Vec 'Z 'Z • In the fourth argument of ‘VCons’, namely ‘VNil’ In the expression: VCons @Z @Int 1 VNil

Huh? That's strange, I would have expected the first type application to be of kind `N`

, and the second to be of kind `*`

. But GHC disagrees!

λ> :set -fprint-explicit-foralls λ> :type +v VCons VCons :: forall a (n :: N). a -> Vec n a -> Vec ('S n) a

That's downright unintuitive to me, since I explicitly specified the order in which the quantified variables should appear in the type signature for `VCons`

.

Similarly, if you leave out the `forall`

s:

data Vec (n :: N) a where VNil :: Vec Z a VCons :: a -> Vec n a -> Vec (S n) a

Then `:type +v`

will also report the same quantified variable order for `VCons`

. This is perhaps less surprising, since the `n`

and `a`

in `data Vec (n :: N) a`

don't scope over the constructors, so GHC must infer the topological order in which the variables appear in each constructor. But I would certainly not expect GHC to do this if I manually specify the order with `forall`

.

### Change History (8)

### comment:1 Changed 3 months ago by

### comment:2 Changed 3 months ago by

The question remains: how could we preserve the order in which the user wrote the `forall`

d variables? The assumption that they should be partitioned into separate universal and existential groups is baked pretty deeply into the GHC AST, as `DataCon`

has separate `dcUnivTyVars`

and `dcExTyVars`

fields, and never the twain shall meet.

One possible solution is to introduce a `dcOrigTyVars :: Maybe [TyVar]`

field, which is `Just`

the `forall`

d variables (in whatever order the user writes) or `Nothing`

otherwise. GHC could then use this information when computing `dataConUserType`

, which I believe GHC uses for determining the order in which `TypeApplications`

happen.

### comment:3 Changed 3 months ago by

Hmm. Fortunately we already have just the hook we need.

Every interesting data constructor (including every GADT data con) has a wrapper; it is built by `MkId.mkDataConRep`

. Look at `wrapper_reqd`

to see which data cons have wrappers.

The wrapper mediates between the nice user view of the data constructor and the internal "representation data constructor" that GHC uses internally. So for example

data T a where MkT :: forall p q. p -> q -> T (p,q)

The representation data con has this type

MkT :: forall r. forall p q. (r ~ (p,q)) => p -> q -> T r

But the wrapper is defined like this

$WMkT :: forall p q. p -> q -> T (p,q) $WMkT = /\p q. \(x::p) (y::q). MkT @(p,q) @p @q <(p,q)> x y

The `<(p,q)>`

is a coercion argument (refl) witnessing `(p,q)~(p,q)`

Now currently `dataConUserType`

(which claims to show the user-written type of the data con) is thus (in `DataCon`

):

dataConUserType (MkData { dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs, dcEqSpec = eq_spec, dcOtherTheta = theta, dcOrigArgTys = arg_tys, dcOrigResTy = res_ty }) = mkForAllTys (filterEqSpec eq_spec univ_tvs) $ mkForAllTys ex_tvs $ mkFunTys theta $ mkFunTys arg_tys $ res_ty

This `filterEqSpec`

business is fishy! But now we understand about wrappers, for data constructors that have wrappers we could insead use the type of the wrapper, thus:

dataConUserTYpe (MkData { dcRep = rep , dcRepType = rep_ty } = case rep of DCR { dcr_wrap_id = wrap_id } -> idType wrap_id NoDataConRep -> rep_ty -- If there is no wrapper, the "rep-type" is the same -- as the "user type"

Whizzo.

Now, to return to the ticket,

- We should ensure that the wrapper type reflects exactly the type the user wrote including type variable order
- That in turn may force to make a wrapper when we don't right now, just to swizzle round the type variables

Make sense?

### comment:4 Changed 3 months ago by

I don't think it's going to be that simple. If you look at the implementation of `mkDataConRep`

, you'll notice that it figures out the type of the `wrap_id`

like this:

wrap_ty = dataConUserType data_con

So obviously we don't want to change `dataConUserType`

's implementation to use `idType wrap_id`

, since that would be a circular definition.

We should ensure that the wrapper type reflects exactly the type the user wrote including type variable order

Sure, but that's easier said than done. The only info you have in `mkDataConId`

to determine the wrapper type is the `DataCon`

itself, and at that point, the type variables have already been carved up into the universal and existential ones, with no way to recover the original order. So I don't see any way to make this work without having a separate `dcOrigTyVars`

fields, as proposed in comment:2.

### comment:5 Changed 3 months ago by

Well it would take a bit more plumbing, I agree. You'd need to take the `qtkvs`

returned by `tcGadtSigType`

in `TcTyClsDecls.tcConDecl`

and pass it on to `buildDataCon`

, which can in turn give them to `mkDataConRep`

to use instead of calling `dataConUserType`

.

I'm not certain that's all, but I think so.

### comment:6 Changed 3 months ago by

OK, I think I'm starting to see the picture more clearly. However, some of the details are still hazy to me. In particular, `tcConDecl`

is far from the only place that uses `buildDataCon`

to construct a `DataCon`

. However, as far as I can tell, `tcConDecl`

is the *only* place that knows exactly what order the user wrote the constructor's type variables in.

This is important because we might write a GADT to an interface file and read it back in tyConToIfaceDecl, which also uses `buildDataCon`

. However, we've lost the order of the type variables used in the wrapper type by that point, since `IfaceData`

doesn't store the wrapper type!

This suggests to me that we need to beef up `IfaceData`

to accommodate this change. Should we be storing the wrapper type as a field of `IfaceData`

as well? Or am I barking up the wrong tree?

### comment:7 Changed 3 months ago by

Yes you are right. So yes, we have to put more info in the `IfaceConDecl`

.

Putting in the wrapper type doesn't feel right -- duplicates too much.

Better to put in the original tyvars.

And in that case we may as well record them in the `DataCon`

too, as you suggested earlier. Then we wouldn't need to pass them to `mkDataConRep`

after all, because they'll be in the `DataCon`

.

One fiddly thing I'm not sure about: in the declaration of `IfaceConDecl`

we see

data IfaceConDecl = IfCon { ifConName :: IfaceTopBndr, -- Constructor name ifConWrapper :: Bool, -- True <=> has a wrapper ifConInfix :: Bool, -- True <=> declared infix -- The universal type variables are precisely those -- of the type constructor of this data constructor -- This is *easy* to guarantee when creating the IfCon -- but it's not so easy for the original TyCon/DataCon -- So this guarantee holds for IfaceConDecl, but *not* for DataCon

Reasoning is explained in `MkIface`

where we convert a `DataCon`

to a `ConDecl`

-- Tidy the univ_tvs of the data constructor to be identical -- to the tyConTyVars of the type constructor. This means -- (a) we don't need to redundantly put them into the interface file -- (b) when pretty-printing an Iface data declaration in H98-style syntax, -- we know that the type variables will line up -- The latter (b) is important because we pretty-print type constructors -- by converting to IfaceSyn and pretty-printing that con_env1 = (fst tc_env1, mkVarEnv (zipEqual "ifaceConDecl" univ_tvs tc_tyvars)) -- A bit grimy, perhaps, but it's simple!

Meddling with `IfaceConDecl`

is not a big deal... it's just a serialisation format, and changes are very localised. So feel free to suggest what to do.

### comment:8 Changed 3 months ago by

Related Tickets: | → #11721 |
---|---|

Resolution: | → duplicate |

Status: | new → closed |

Oh dear, I just realized this is a duplicate of #11721. Closing in favor of that ticket.

**Note:**See TracTickets for help on using tickets.

Above, I implied that GHC was inferring the topological order for

`a`

and`n`

in the type signature for`VCons`

even with an explicit`forall`

. I just realized there's more to it after reading`Note [mkGADTVars]`

. The relevant excerpt:So what's

reallygoing on is that GHC is putting all of the universally quantified variables (`a`

) before the existentially quantified ones (`n`

).Still, I still think that GHC ought to adhere to what the user wrote if they bother to write an explicit

`forall`

, because otherwise you have to trace out the variable order that GHC happens to infer.