Opened 3 years ago

Closed 21 months ago

#10680 closed task (fixed)

Make Backpack order-independent (again)

Reported by: ezyang Owned by: ezyang
Priority: normal Milestone:
Component: Package system Version: 7.11
Keywords: Cc: skilpat
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by ezyang)

When we moved to the new bkp file format, we also went back to the a format which is order-dependent: that is to say, the order in which you put the declarations matters. So if you write:

unit p where
  module A where
    import B
  module B where
    ...

this fails to type-check, GHC complaining that B is not in scope. I did this, in part because it's what the Backpack paper described, and because it was "simpler" to implement.

I think we should move back to an order-independent scheme, for the following reasons:

  1. Haskell users are used to not needing pay particularly close attention to the ordering of their modules, and forcing people to linearize their module descriptions would be spectacularly disruptive with large amounts of modules. So un-ordered modules are "more natural for a traditional Haskell user.
  1. Order-independence imposes some constraints on how expressive programs are (with order-dependent Backpack, you can do some pretty tricky things by ordering things certain ways); this could simplify some aspects of compiler implementation and make Backpack easier to explain.
  1. A particular case of (2): it seems a lot simpler UX-wise to let a user assume that if you import a module M in a unit, it doesn't matter where you import it: you always get the same set of identifiers brought into scope. Thus, the incremental results of signatures should not be visible, c.f. #10679

The main idea is that only the surface-syntax is un-ordered: the internal representation of units is a DAG which we work out in an elaboration phase, not altogether unsimilar from what GhcMake computes. An important auxiliary idea is that import A where A is backed by some signatures depends on EVERY signature in scope.

Here are the details:

The intermediate representation. We translate into an intermediate representation which consists of a directed graph of:

  • Each source-level module, signature and include, and
  • Each unfilled requirement (called a “signature merge” node).

The edges of the directed graph signify a “depends on” relation, and are defined as follows:

  • An include p depends on include q if, for some module name m, p requires m and q provides m.
  • An include p depends on a module m if p requires a module named m.
  • A module/signature m depends on include p if m imports a module provided by p.
  • A module/signature m depends on a module n if m imports n.
  • A module/signature m depends on a signature merge n if m imports n.
  • A module/signature m depends on a signature n if m {-# SOURCE #-} imports n.
  • A signature merge m depends on a local signature m (if it exists).
  • A signature merge m depends on a include p, if the (renamed) include requires m.

Elaboration. Take a Backpack file, construct this graph, and topsort it into a DAG of SCCs. SCCs with a single node are compileable as before. SCCs with multiple nodes will have to be managed with some mutual recursion mechanism; see refinements for more thoughts on this.

Refinements:

  1. Can a signature depend on a (home) module? Imports of this kind require a retypecheck loop. Consider this situation:
    unit p where
      signature H where
        data T
      module M where
        import H
        data S = S T
    unit q where
      include p
      module Q where
        import M
      signature H where
        import Q
        data T = T S
    
    Here, signature H in q depends on Q. When we typecheck Q, we bring M.S into the type environment with a TyThing that describes the constructor as accepting an abstract type T. However, when we subsequently typecheck the local signature H, we must refine all TyThings of T with the true description (e.g. constructor information). So you'll need to retypecheck Q (and M) in order to make sure the TyThing is correct.
  1. Can an include depend on a (home) module? If the module has no (transitive) dependency on signatures, this is fine. However, it's easy to have a circular dependency. Consider:
    unit p where
      signature A -- imports nothing
      signature B -- imports nothing
      module M
    unit q where
      include p
      module B where
        import A
        ...
    
    B depends on p for p/A.hsig; however, p depends on B because this module is filling a requirement. However, if we were to include the internal graph of p into q, the resulting graph would not have an cycles; so this is one possibility of how to untangle this situation. However, if there's still a cycle (e.g. A imports B), then you will need at least a retypecheck loop, and maybe hs-boot style compilation. We're not going to implement this for now.
  1. Can we deal with include-include dependency cycles? Yes! Just use the Backpack paper's strategy for creating a recursive unit key and compile the two packages hs-boot style. But I'm not planning on implementing this yet.
  1. Can we deal with signature-signature dependency cycles? Ordered Backpack would have supported this:
    unit a-sig where
      signature A where
        data T
    unit ab-sig where
      include a-sig
      signature B where
        import A
        data S = S T
      signature A where
        import B
        data T = T S
    
    In our model, ab-sig has a cycle. However, I believe any such cycle can be broken by creating sufficiently many units:
    unit a-sig where
      signature B where
        data T
      signature A where
        data S = S T
    unit b-sig where
      signature A where
        data S
      signature B where
        data T = T S
    unit ab-sig where
      include a-sig
      include b-sig
    
    In principle, GHC could automatically break import cycles by replacing an import with an import of a reduced signature that simply has abstract type definitions. See #10681. (I'm not sure this is possible for all language features.) This technique would also work for normal modules, assuming that every function is explicitly annotated with a type.

Change History (4)

comment:1 Changed 3 years ago by skilpat

Cc: skilpat added

comment:2 Changed 3 years ago by ezyang

Description: modified (diff)

comment:3 Changed 3 years ago by ezyang

Description: modified (diff)

comment:4 Changed 21 months ago by ezyang

Resolution: fixed
Status: newclosed

We did this, it works, hooray.

Note: See TracTickets for help on using tickets.