Opened 2 years ago

Closed 2 months ago

#7672 closed bug (fixed)

boot file entities are sometimes invisible and are not (semantically) unified with corresponding entities in implementing module

Reported by: skilpat Owned by: ezyang
Priority: normal Milestone: 7.12.1
Component: Compiler (Type checker) Version: 7.4.2
Keywords: backpack Cc: pho@…
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: GHC rejects valid program Test Case: rename/should_compile/T7672
Blocked By: Blocking: #10336
Related Tickets: Differential Revisions:

Description

In a recursive module (i.e. a module that transitively imports itself), the unique "Name" of an entity E declared in this module's boot file should be precisely the same as that of the corresponding E defined in the module. Right now GHC appears to treat them as separate entities. (In the module systems literature, this problem has been identified as the "double vision problem" [1, Ch 5] and in general has caused problems with implementations of recursive modules. Derek Dreyer and his coauthors have proposed a number of solutions [2], and so have Im et al. more recently in the context of OCaml [3].)

With that being said, the immediate problem here seems to be that GHC does not actually allow, in the implementing module, the import of its boot file's entities.

There are a couple related errors I can identify with, huzzah!, very small example programs. The crux of the example is that the module A defines a data type T which is essentially the typical Nat data type -- except that the recursive type reference in the successor constructor refers to the "forward declaration's" view of the type (in the boot file) rather than the local view of that data type T.

This first example shows that the boot file import is not actually making available the entities it declares:

module A where
  data T
module A where
  import {-# SOURCE #-} qualified A as Decl(T)
  data T = Z | S Decl.T

The Decl.T reference should have the exact same identity as the locally defined T reference; after tying the module knot, this data type should be the same as if we had defined it with a T instead of Decl.T. However, the entity name T does not even appear to be gotten from the import of the boot file:

A.hs:3:18: Not in scope: type constructor or class `Decl.T'

In an earlier version of GHC I tested, 6.12.1, the error message lies on the import statement:

A.hs:2:44: Module `A' (hi-boot interface) does not export `T'

In the next example, with the same boot file, we see that the mechanism that checks whether the implementation matches the boot file fails to see the two "views" of T as the same. (Note that I changed the definition of T here to make the previous error go away.)

module A(Decl.T(..)) where
  import {-# SOURCE #-} qualified A as Decl(T)
  data T = Z | S T

Since Decl.T should point to the same entity as T, the export statement should have the same effect as if it were instead "(T(..))". However, GHC again cannot make sense of the reference "Decl.T" and then complains that the boot file's T is not provided in the implementation:

A.hs:1:10: Not in scope: type constructor or class `Decl.T'

<no location info>:
    T is exported by the hs-boot file, but not exported by the module

(Making the export list empty shows this second error message only.)

Altering this second example by omitting the alias on the import, and by changing the T reference in the type's definition to A.T, results in a well-typed module:

module A(A.T(..)) where
  import {-# SOURCE #-} qualified A(T)
  data T = Z | S A.T

A final example shows that, in a module that is not the implementing module, entities defined in the boot file are imported as one would expect! In the following example, we insert a module B, in between A's boot file and A's implementation, which merely passes along the boot file's view of T.

module A where
  data T
module B(Decl.T(..)) where
  import {-# SOURCE #-} qualified A as Decl(T)
  data U = U Decl.T
module A(T(..)) where
  import qualified B(T)
  data T = Z | S B.T

The error message here, again, lies in the reference B.T in A's implementation:

A.hs:3:18:
    Not in scope: type constructor or class `B.T'
    Perhaps you meant `A.T' (line 3)

Notice, however, that the reference to Decl.T in the B module is perfectly well-formed.

I suspect that the general problem lies with double vision, and that the more immediate problem--whereby imports of boot file entities from their implementing modules fail--is merely the manifestation of that.

In the above, wherever I have suggested an intended semantics, I refer primarily to the state of the art in recursive modules systems. A perhaps more pressing justification, however, is that both the Haskell language report and Diatchki et al.'s specification of the module system [4] (seem to) corroborate that intended semantics.

Your friend in the recursive module swamp,
Scott Kilpatrick


References

[1] Derek Dreyer. Understanding and Evolving the ML Module System, PhD thesis, 2005.
[2] Derek Dreyer. A Type System for Recursive Modules, ICFP 2007.
[3] Hyeonseung Im, Keiko Nakata, Jacques Garrigue, and Sungwoo Park. A syntactic type system for recursive modules, OOPSLA 2011.
[4] Iavor S. Diatchki, Mark P. Jones, and Thomas Hallgren. A formal specification of the Haskell 98 module system, Haskell 2002.

Change History (19)

comment:1 Changed 2 years ago by PHO

  • Cc pho@… added

comment:2 Changed 2 years ago by skilpat

  • Operating System changed from Unknown/Multiple to Linux
  • Type of failure changed from None/Unknown to GHC rejects valid program

comment:3 Changed 2 years ago by skilpat

I must scale back my assessment of the problem somewhat: double vision is indeed handled, to some extent, by GHC. In particular, a view of a type from a boot file is treated as an equivalent type to a view of that type from the implementation. That's good! But you still cannot syntactically refer to both views in the same program. That's less good.

As a slight variation to my example with three modules, we make B expose a function that mentions (the boot file's view of) T. Then in the A implementation, we check that this function may be successfully applied to a value of (the local view of) type T. Indeed, GHC accepts all this!

module A where
  data T
module B(Decl.T, f) where
  import {-# SOURCE #-} qualified A as Decl
  f :: Decl.T -> Bool
  f _ = True
module A where
  import qualified B
  data T = Z | S T
  x = B.f Z

In the A implementation I'm not ever mentioning the boot file view of T, B.T. If you try to do so, then GHC rejects the program with the same error messages as mentioned in the original report. That is, the following variation on A fails

module A where
  import qualified B
  data T = Z | S B.T
  x = B.f Z :: B.T

with two independent errors, one for each reference to B.T:

A.hs:3:18:
    Not in scope: type constructor or class `B.T'
    Perhaps you meant `A.T' (line 3)

A.hs:4:16:
    Not in scope: type constructor or class `B.T'
    Perhaps you meant `A.T' (line 3)

In light of this example, I retract my statement that GHC fails to solve the double vision problem! It's unclear, though, to what extent it does so, as there's still some amount of wackiness.

comment:4 Changed 2 years ago by igloo

  • difficulty set to Unknown
  • Milestone set to 7.8.1

Thanks for the report.

comment:5 Changed 15 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.10.1

Moving to 7.10.1

comment:6 Changed 13 months ago by ezyang

  • Keywords backpack added; recursive modules boot files double vision double vision problem removed

comment:7 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:8 Changed 3 months ago by ezyang

The reason this occurs is when we process imports in rnImportDecl, we filter out any identifiers which come from a self-boot:

    let gbl_env = mkGlobalRdrEnv (filterOut from_this_mod gres)

So these identifiers never get added to the GlobalRdrEnv, and thus aren't brought into scope.

I don't know what happens if you remove that filterOut. It comes from some very old code.

comment:9 Changed 3 months ago by simonpj

  • Owner set to ezyang

I agree. Just remove it and see. It looks outright wrong to me.

Simon

comment:10 Changed 3 months ago by ezyang

  • Blocking 10336 added

comment:11 Changed 3 months ago by ezyang

Removing the filterOut does not work. (insert more info here)

comment:12 Changed 2 months ago by Edward Z. Yang <ezyang@…>

In a27fb46ff1ea46a45e0084c3db92a24475e3bab5/ghc:

Add (failing) test case for #7672.

Signed-off-by: Edward Z. Yang <[email protected]>

comment:13 Changed 2 months ago by ezyang

  • Test Case set to T7672

comment:14 Changed 2 months ago by simonpj

Edward: I have a fix for this in-flight.

Simon

comment:15 Changed 2 months ago by Simon Peyton Jones <simonpj@…>

In 9b73cb16485f331d9dc1f37826c6d503e24a5b0b/ghc:

Refactor the GlobalRdrEnv, fixing #7672

This patch started innocently enough, by deleting a single
call from rnImportDecl, namely

    let gbl_env = mkGlobalRdrEnv (filterOut from_this_mod gres)

The 'filterOut' makes no sense, and was the cause of #7672.

But that little loose end led to into a twisty maze of little
passages, all alike, which has taken me an unreasonably long
time to straighten out. Happily, I think the result is really
much better.

In particular:

 * INVARIANT 1 of the GlobalRdrEnv type was simply not true:
   we had multiple GlobalRdrElts in a list with the same
   gre_name field. This kludgily implmented one form of
   shadowing.

 * Meanwhile, extendGlobalRdrEnvRn implemented a second form of
   shadowing, by deleting stuff from the GlobalRdrEnv.

 * In turn, much of this shadowing stuff depended on the Names of
   the Ids bound in the GHCi InteractiveContext being Internal
   names, even though the TyCons and suchlike all had External
   Names. Very confusing.

So I have made the following changes

 * I re-established INVARIANT 1 of GlobalRdrEnv.  As a result
   some strange code in RdrName.pickGREs goes away.

 * RnNames.extendGlobalRdrEnvRn now makes one call to deal with
   shadowing, where necessary, and another to extend the
   environment.  It deals separately with duplicate bindings.

   The very complicated RdrName.extendGlobalRdrEnv becomes much
   simpler; we need to export the shadowing function, now called
   RdrName.shadowNames; and we can nuke
   RdrName.findLocalDupsRdrEnv altogether.

   RdrName Note [GlobalRdrEnv shadowing] summarises the shadowing
   story

 * The Names of the Ids bound in the GHCi interactive context are
   now all External.  See Note [Interactively-bound Ids in GHCi]
   in HscTypes.

 * Names for Ids created by the debugger are now made by
   IfaceEnv.newInteractiveBinder.  This fixes a lurking bug which
   was that the debugger was using mkNewUniqueSupply 'I' to make
   uniques, which does NOT guarantee a fresh supply of uniques on
   successive calls.

 * Note [Template Haskell ambiguity] in RnEnv shows that one TH-related
   error is reported lazily (on occurrences) when it might be better
   reported when extending the environment.  In some (but not all) cases
   this was done before; but now it's uniformly at occurrences.  In
   some ways it'd be better to report when extending the environment,
   but it's a tiresome test and the error is rare, so I'm leaving it
   at the lookup site for now, with the above Note.

 * A small thing: RnNames.greAvail becomes RdrName.availFromGRE, where
   it joins the dual RdrName.gresFromAvail.

comment:16 Changed 2 months ago by simonpj

  • Test Case changed from T7672 to rename/should_compile/T7672

Despite all this clean up, the test for #7672 still does not work. Reason: in module A (comment:3), the data type T is in scope with two different provenances:

  • As LocalDef "T"
  • As Imported "B.T"

But annoyingly, Provenance can't represent both; currently it's an either/or.

Fixing this really means changing the Provenance data type.

comment:17 Changed 2 months ago by Simon Peyton Jones <simonpj@…>

In 7ea156ae3e1c66e59935f0eb877ea1a3f3bfd5b9/ghc:

Refactor RdrName.Provenance, to fix #7672

Trac #7672 has a data type T in module A that is in scope
*both* locally-bound *and* imported (with a qualified) name.
The Provenance of a GlobalRdrElt simply couldn't express that
before. Now you can.

In doing so, I flattened out Provenance into GlobalRdrElt,
so quite a lot of modules are touched in a not-very-interesting
way.

comment:18 Changed 2 months ago by simonpj

I've finally nailed this one. I thought it was going to be easy and just got sucked in. But I did lots of good clean-up along the way. Simon

comment:19 Changed 2 months ago by ezyang

  • Resolution set to fixed
  • Status changed from new to closed

Thank you Simon! Marking as fixed.

Note: See TracTickets for help on using tickets.