Opened 5 months ago

Closed 2 weeks ago

#15379 closed bug (fixed)

Don't reject user-written instances of KnownNat and friends in hsig files

Reported by: ezyang Owned by:
Priority: normal Milestone: 8.10.1
Component: Compiler Version: 8.4.3
Keywords: backpack Cc: ppk
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D4988
Wiki Page:

Description

Piyush has graciously volunteered to fix this.

Relevant commit c5609577fab8a214c50561bea861c70d4bfd47c7

Attachments (3)

test.bkp (1.1 KB) - added by ppk 4 months ago.
out-desugar (4.4 KB) - added by ppk 4 months ago.
out (7.6 KB) - added by ppk 4 months ago.
Output after simplification -ddump-simpl

Download all attachments as: .zip

Change History (23)

comment:1 Changed 5 months ago by ezyang

Keywords: backpack added

comment:2 Changed 5 months ago by ppk

I have started working on this bug at <https://github.com/piyush-kurur/ghc/tree/literal-instac>. Not yet on phab because I am waiting for the fix of #15138 to get merged. Will rebase and send the patches then. But here is a status report of what I have done

  1. Added a testcase
  1. Made the minimal modification that we discussed over IRC

Unfortunately just 2 does not help as there is a overlapping instance error that is being thrown. I will investigate this and let you known.

comment:3 Changed 5 months ago by ppk

Just reporting the things that I learned here (thanks to discussions with @mpickering on IRC)

  1. With the minor change, the overlapping instance error is thrown only for inbuilt type classes like KnownNat and KnownSymbol but not for Show
  1. Overlapping instance is not a problem for Typeable, the other builtin type class that is affected by this change.

This means that there is something in the witnessing of KnownNat and KnownSymbol that is making it see multiple instances. I have updated the test case with all these special classes so that we keep track of these differences.

In addition @ezyang just mentioned that it is not clear whether the uniform code will work with both hsboot and hsig files. Since we are not confident about the hsboot part, I will only handle the hsig files in the patches to come.

comment:4 Changed 5 months ago by ppk

Resolving the overlapping instance problem is more tricky than I thought. I am adding additional notes here so that some one can better guide the process. This is my current understanding on the issues

The KnownNat and KnownSymbol instance witness and dictionary generating is done by ghc itself in the file compiler/typecheck/ClsInst.hs. However this is problematic in the backpack setting because of the following

Let us consider the following signature fragment

signature Abstract where
  data Foo :: Nat
  instance KnownNat Foo

Suppose this abstract signature is included in a module Util

module Util where
  import Abstract
  printFoo :: IO ()
  printFoo = do print $ natVal (Proxy :: Proxy Foo)

Notice that the natVal function will need a dictionary of KnownNat Foo at this point and will need to get it from a concrete implementation of the kind

module Concrete where
   type Foo = 42

It is not clear to me how this can be generated "inside the compiler" like what happens in ClsInst.hs.

comment:5 Changed 5 months ago by simonpj

I'm not sure I'm following all the details here, but can you take inspiration from the way that instances in hs-boot files work? If you have

Foo.hs-boot
  data T
  instance Eq T

then in the hi-boot file (which we get from compiling Foo.hs-boot) we will have something like

$d77 : Eq T

That is, the arbtrarily named $d77 is a witness for Eq T. Modules importing Foo.hi-boot use $d77 when they need Eq T.

Then when compiling Foo.hs we generate the real instance

$d98 = <the code for equality on T> :: Eq T

and (having read the Foo.hi-boot file, we also generate a compatibility stub

$d77= $d98

so that references to $d77 will end up in $d98.

Maybe the same approach will work for hsig files. Edward will know.

comment:6 Changed 5 months ago by ppk

I think I have fixed it. It turned out that the patch is simpler than I though. Essentially I just made the matchGlobalInstance function to first actually search for an instance and failing which try out the KnownNat and KnownSymbol cases.

The commit is https://github.com/piyush-kurur/ghc/commit/364de810f315faba50a4fbe4a68fba3876eb0555 I have also written up an explanation. See if it makes sense.

comment:7 Changed 5 months ago by simonpj

The current implementation only ever has instances for KnownNat 1, KnownNat 2 etc. This is described in Note [KnownNat & KnownSymbol and EvLit] in TcInteract.

But now you want to allow instance KnownNat T for some abstract type T; you explain this well in comment:4.

Very well; but I really dislike messing with matchInstEnv. Better: make mathKnownNat have an otherwise case that, instead of returning NoInstance uses matchInstEnv. With a careful Note to explain.

Interestingly, matchHasField already does this, but without explanation :-(.

I also worry about the other classes treated specially in matchGlobalInst. For some (matchCTuple , matchLiftedEquality, matchLiftedCoercible) they can't fail, so that's fine. But Typeable can fail; I wonder if you might end up wanting a Typeable instance in a hsig file. I think that's all... but the big point is that GHC's treatment of built-in classes may need attention because of Backpack; KnownNat is just an example.

comment:8 in reply to:  7 Changed 5 months ago by ppk

Replying to simonpj:

The current implementation only ever has instances for KnownNat 1, KnownNat 2 etc. This is described in Note [KnownNat & KnownSymbol and EvLit] in TcInteract.

Yes that note was quite useful in fact.

But now you want to allow instance KnownNat T for some abstract type T; you explain this well in comment:4.

Very well; but I really dislike messing with matchInstEnv. Better: make mathKnownNat have an otherwise case that, instead of returning NoInstance uses matchInstEnv. With a careful Note to explain.

Okey I will think about it and try it out. The only thing puzzling for me is that the old code is returning Overlapping instances, which to me appears to be finding too many instances rather than finding too few. So there is some problem with my comment:4 which I cannot really pin-point.

[snip]

But Typeable can fail; I wonder if you might end up wanting a Typeable instance in a hsig file. I think that's all... but the big point is that GHC's treatment of built-in classes may need attention because of Backpack; KnownNat is just an example.

Again the Typeable instances in backpack works fine even with the old version of the code (unlike KnownNat or KnownSymbol). In fact, I realised and has tweaked the test case to have all the three problematic classes (KnownNat, KnownSymbol and Typeable) but only the first two was giving the overlapping instance error.

comment:9 Changed 5 months ago by ppk

Okey I think I see where the Overlapping instances come from in the case of KnownNat and KnownSymbol. Let me take the following code sample

signature Abstract where
   data Foo :: Nat
   instance KnownNat Foo

module Util where
   import Abstract
   foo = natVal (Proxy :: Proxy Foo)

module Concrete where
   type Foo = Int

module Main where
 {- mixin Concrete and get the ConcreteUtil -}

In the main module there are two dictionaries for KnownNat Foo, one which comes via Util (which incidently is actually coming from Concrete and the other directly coming from Concrete. But these two are "different" in the sense that the old code generates KnownNat on the fly and these two generations manifest as two different instances.

Is there some way this hypothesis can be tested out say be tracing the compilation for the example ?

comment:10 Changed 5 months ago by ppk

Differential Rev(s): Phab:D4988

comment:11 Changed 5 months ago by ppk

Status: newpatch

comment:12 Changed 5 months ago by ezyang

Piyush, I don't think the behavior of your example is consistent with the hypothesis. Consider the following program:

{-# LANGUAGE DataKinds, KindSignatures #-}
unit p where
  signature Abstract where
    import GHC.TypeLits
    data Foo :: Nat
    instance KnownNat Foo

  module Util where
    import Data.Proxy
    import Abstract
    import GHC.TypeLits
    foo = natVal (Proxy :: Proxy Foo)

I get:

ezyang@autobox:~/Dev/ghc-known-nat$ inplace/bin/ghc-stage2 --backpack test.bkp -fforce-recomp
[1 of 1] Processing p
  [1 of 2] Compiling Abstract[sig]    ( p/Abstract.hsig, nothing )
  [2 of 2] Compiling Util             ( p/Util.hs, nothing )

test.bkp:12:11: error:
    • Overlapping instances for KnownNat Foo
        arising from a use of ‘natVal’
      Matching instances:
        instance [safe] KnownNat Foo -- Defined at test.bkp:6:14
      There exists a (perhaps superclass) match:
      (The choice depends on the instantiation of ‘’
       To pick the first instance above, use IncoherentInstances
       when compiling the other instance declarations)
    • In the expression: natVal (Proxy :: Proxy Foo)
      In an equation for ‘foo’: foo = natVal (Proxy :: Proxy Foo)
   |
12 |     foo = natVal (Proxy :: Proxy Foo)
   |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^

The overlap occurs before Concrete is considered at all.

comment:13 in reply to:  12 Changed 5 months ago by ppk

Replying to ezyang:

Piyush, I don't think the behavior of your example is consistent with the hypothesis. Consider the following program:

{-# LANGUAGE DataKinds, KindSignatures #-}
unit p where
  signature Abstract where
    import GHC.TypeLits
    data Foo :: Nat
    instance KnownNat Foo

  module Util where
    import Data.Proxy
    import Abstract
    import GHC.TypeLits
    foo = natVal (Proxy :: Proxy Foo)

Yes I realised it yesterday when I went back and tested with a similar example. In the backpack case, the comment:4 is the correct analysis: we know that the type T in question has a KnownNat instance but cannot find its dictionary. Unfortunately the error it triggers is not a cannot find instance but the following.

<https://github.com/piyush-kurur/ghc/blame/55a3f8552c9dc9b84e204ec6623c698912795347/compiler/typecheck/TcErrors.hs#L2484-L2501>

(Notice the match on null match_given in that code).

This also explains why the cleanup suggested by @simonpj works because matchKnownNat now does not try generating instances on the fly and fails but actually looks up the environment. The real explanation therefore looks like an unfortunate error message and the explanation in comment:4 is the correct one.

Please let me know if this real explanation makes sense ;-)

comment:14 Changed 5 months ago by ppk

Some of the test cases in backpack/should_fail/ are failing. Of which I think the bkpfail46.bkp is actually not giving a compilation failure (and hence the test is failing). Can you have a look at this test case ? Was this a previous limitation of backpack that was now fixed or is it the case that some thing that should not have been acceptable is now being accepted (thereby seriously questioning the type safety).

comment:15 Changed 5 months ago by ppk

Also check bkpfail25. Where I am now getting the error that is explicitly mentioned should not be got.

Changed 4 months ago by ppk

Attachment: test.bkp added

Changed 4 months ago by ppk

Attachment: out-desugar added

Changed 4 months ago by ppk

Attachment: out added

Output after simplification -ddump-simpl

comment:16 Changed 4 months ago by ppk

As mentioned by @simonpj, it would be desirable to actually pinpoint how the dictionaries of KnownNat and friends get resolved in the context of backpack. This comment is more to document what I have learned by simply dumping the desugared version.

I have attached a test backpack program (attachment:test.bkp) with the output after compiling with -ddump-ds and -ddump-simpl. The desugared version is avaliable in attachment:out-desugar

One can clearly see that an the KnownNat instance is substituted for and the appropriate versions of Util is created Line 56 and Line 92. Although there is no trace of the Util module without the substitution, if we had no mixing-in there will also be a variant of Util with the abstract dictionary for KnownNat NatType.

So my understanding would be that backpack is going ahead and "inlining" the dictionary when mixing a module that uses a abstract signature with a concrete implementation of that signature. I hope that makes sense.

comment:17 in reply to:  14 Changed 4 months ago by ppk

Replying to ppk:

Some of the test cases in backpack/should_fail/ are failing. Of which I think the bkpfail46.bkp is actually not giving a compilation failure (and hence the test is failing). Can you have a look at this test case ? Was this a previous limitation of backpack that was now fixed or is it the case that some thing that should not have been acceptable is now being accepted (thereby seriously questioning the type safety).

The test cases were failing not due to changes here but due to problems in the dependent patch for #15138 (see comment:5:ticket:15138 and comment:6:ticket:15138). This comment is therefore irrelevant for deciding on this patch.

Last edited 4 months ago by ppk (previous) (diff)

comment:18 Changed 4 months ago by bgamari

Milestone: 8.6.18.8.1

These won't be fixed for in GHC 8.6.

comment:19 Changed 2 months ago by Ben Gamari <ben@…>

In ce7a1c4/ghc:

Support builtin classes like KnownNat in backpack

This commit allows backpack signatures to enforce constraints like
KnownNat
on data types.  Thus it fixes #15379.  There were two important
differences in the
way GHC used to handle classes like KnowNat

1.  Hand crafted instances of `KnownNat` were  forbidden, and

2. The dictionary for an  instance `KnownNat T` was generated on the
fly.

For supporting backpack both these points have to be revisited.

Disallowing instances of KnownNat
--------------------------------------------

Users were disallowed to declare instances of certain builtin classes
like KnownNat for obvious safety reasons --- when we use the
constraint like `KnownNat T`, we want T to be associated to a natural
number. However, due to the reuse of this code while processing backpack
signatures, `instance KnownNat T` were being disallowed even in module
signature files.

There is an important difference when it comes to instance declarations
in a signature file. Consider the signature `Abstract` given below

```
signature Abstract where
  data T :: Nat
  instance KnownNat T

```

Inside a signature like `Abstract`, the `instance Known T` is not really
creating an instance but rather demanding any module that implements
this signature to enforce the constraint `KnownNat` on its type
T.  While hand crafted KnownNat instances continued to be prohibited in
modules,
this commit ensures that it is not forbidden while handling signatures.

Resolving Dictionaries
----------------------------

Normally GHC expects any instance `T` of class `KnownNat` to eventually
resolve
to an integer and hence used to generate the evidence/dictionary for
such instances
on the fly as in when it is required. However, when backpack module and
signatures are involved
It is not always possible to resolve the type to a concrete integer
utill the mixin stage. To illustrate
consider again the  signature `Abstract`

> signature Abstract where
>   data T :: Nat
>   instance KnownNat T

and a module `Util` that depends on it:

> module Util where
>     import Abstract
>     printT :: IO ()
>     printT = do print $ natVal (Proxy :: Proxy T)

Clearly, we need to "use" the dictionary associated with `KnownNat T`
in the module `Util`, but it is too early for the compiler to produce
a real dictionary as we still have not fixed what `T` is. Only when we
mixin a concrete module

> module Concrete where
>   type T = 42

do we really get hold of the underlying integer.

In this commit, we make the following changes in the resolution of
instance dictionary
for constraints like `KnownNat T`

1. If T is indeed available as a type alias for an integer constant,
   generate the dictionary on the fly as before, failing which

2. Do not give up as before but look up the type class environment for
the evidence.

This was enough to make the resolution of `KnownNat` dictionaries work
in the setting of Backpack as
when actual code is generated, the signature Abstract (due to the
`import Abstract` ) in `Util` gets
replaced by an actual module like Concrete, and resolution happens as
before.

Everything that we said for `KnownNat` is applicable for `KnownSymbol`
as well.

Reviewers: bgamari, ezyang, goldfire, simonpj

Reviewed By: simonpj

Subscribers: simonpj, ezyang, rwbarton, thomie, carter

GHC Trac Issues: #15379

Differential Revision: https://phabricator.haskell.org/D4988

comment:20 Changed 2 weeks ago by bgamari

Milestone: 8.8.18.10.1
Resolution: fixed
Status: patchclosed
Note: See TracTickets for help on using tickets.