Changes between Version 2 and Version 3 of Backpack

Jun 6, 2014 12:56:52 AM (16 months ago)



  • Backpack

    v2 v3  
    1 [ Backpack] is a proposal for retrofitting Haskell with an applicative, mix-in module system. The theory of Backpack is developed in the paper and its accompanying technical appendix; the purpose of this wikipage is to record some of the more practical implementation considerations.
    3 == Backpack versus Cabal ==
    5 A Backpack package is roughly equivalent to a Cabal package.  However, there are some differences.  In Backpack, “module definitions” are unnamed; they are only given a name by module assignment within a package. So, strictly speaking, old style module definitions like `module A where ...` are undefined under Backpack.  Instead, the module layout of a Cabal package constitutes an implicit definition of a Backpack package, with `other-modules` thinned out. So we roughly translate a Cabal file specification (modulo versions):
     1[ Backpack] is a proposal for
     2retrofitting Haskell with an applicative, mix-in module system. The
     3theory of Backpack is developed in the paper and its accompanying
     4technical appendix; the purpose of this wikipage is to record some of
     5the more practical implementation considerations.
     7== Version ranges and signatures ==
     9In the current Haskell package ecosystem, version ranges on
     10build-depends are a way of indicating what libraries a package may build
     11with.  While the [ PVP] is
     12intended to provide a way for library authors to communicate when API
     13changes occur.  However, in practice, downstream authors find it difficult
     14to correctly choose appropriate version ranges for their software (often
     15providing a bound that is far too low, or optimistically asserting that
     16they will be compatible against all future versions of a library).
     18In Backpack, the use of package signatures subsumes this application of
     19version numbers. (Version numbers are still useful, in case an
     20application needs to explicitly exclude a buggy version of a package).
     21In the Backpack papers, these signatures are explicitly recorded.
     23An important caveat is that for backwards-compatibility, there will
     24often be many packages which sport version ranges, but do not have
     25explicit signatures for their holes.  Furthermore, a conversion from
     26version ranges to signatures may be useful for bootstrapping explicit
     27signatures going forwards.  Assuming that, given a package, it is
     28possible to compute its signature, there are a few possible ways to go
     29about computing signatures based on version ranges:
     31- We could simply take the latest acceptable version of a package
     32  and use its signature.
     34- We could compute the “greatest common signature” for the specified
     35  version range in build-depends. This signature is the widest signature
     36  for which all of the versions are compatible with.  Depending on whether
     37  or not the build-depends range is correct, the signature could be buggy.
     38  (ezyang: When I merge two type signatures together, does Backpack require
     39  them to be identical? I think so, but it's not obvious from the paper.
     40  If this is the case, it could be annoying if someone generalized a
     41  function--but I don't think there is anything we can do here.)
     42  Note that while a signature built this way is guaranteed to be
     43  backwards-compatible, it may not be forwards-compatible, in the sense
     44  that a future module might not typecheck against the signature,
     45  but still be linkable against our library.
     47- For maximum backwards and forwards compatibility, the greatest common
     48  signature could be refined by finding the thinnest signature with
     49  which our package type checks with. This is the “ground truth” with
     50  regards to what our package relies on.  If this could be completely
     51  automatically calculated, version ranges in build-depends would be
     52  completely unnecessary; however, it is not possible to infer this
     53  information just from usage sites--thus it may be useful to have
     54  some "known to build against" versions to start the process (however,
     55  a full version range may not be necessary).
     57== Cabal versus Backpack ==
     59At a first glance, a Cabal package looks roughly equivalent to a
     60Backpack package.  However, the precise details of this correspondence
     61are tricky: there a few choices to be made that affect the design space.
     63As a running example, here is a simple Cabal file specifying a
    14 into a Backpack package:
     73The package is associated with a directory of named modules
     74corresponding to other-modules and exposed-modules.
     76=== Cabal packages as a fully linked Backpack package ===
     78A simple model is to treat a Cabal package as a fully linked Backpack
     79package.  Build dependencies are unambiguous, because to Backpack, there
     80is only one implementation of any given package (selected by the
     81dependency solver).  This is roughly how Cabal packages work today, and
     82this encoding shares many of the same downsides (two versions of the
     83same package cannot coexist in the same unit, build-dependency
     84resolution has little relationship to whether or not a package can
     85actually build against a given version, etc).
     87The basic story is that named modules in a Cabal package implicitly
     88defines the module binding (where the module implementation is unnamed)
     89in the Backpack package, with non-exported modules thinned out.  Holes
     90are not used: the bindings of a Backpack package are conceptually part
     91of a global namespace, and any build dependencies are simply dumped into
     92this namespace (using include).  Here is the translation of our example:
    25 (Here the quoted strings indicate file inclusion, ignoring the listed module name.) A number of Backpack features are exercised: mix-in inclusion is used to represent build-dependencies and hidden modules (other-modules) are enforced using thinning (Section 2.4 of the paper). An important detail is that the module lists must be topologically sorted
    27 === Cabal-specific features ===
    29 Cabal is not just a packaging mechanism, but it also handles a number of concerns that Backpack does not talk about:
    31 1. Cabal packages can include C code and other foreign language code, whereas Backpack is only Haskell code,
    33 2. Cabal packages can specify a custom build process, which Backpack says nothing about, and
    35 3. Cabal packages have a conditional flags mechanism, by which various "options" can be tweaked at compile time. (Needless to say, this can cause very bad problems for modularity!) In Backpack, physical module identity should be different over conditional flags, so that modules which are compiled differently are considered differently as well.
    37 === Backpack's improvements over Cabal ===
    39 Without using any of Backpack's support for separate modular development, Backpack already delivers some improvements over Cabal.
    41 Backpack is more permissive/expressive with accidental module name clashes than Cabal is, precisely because of the logical-physical distinction in names. If packages P and Q both have a module named A, then in any third package R that depends on both P and Q, you can simply rename, e.g., P's A to PA in order to avoid that clash.  (On the other hand, you can also use renaming on inclusion to get two differently-named modules to link together.)  Renaming is briefly described in Section 2.4 of the paper.
     103Where the quoted strings indicate the file that a physical moudle
     104inclusion lives in. (Strictly speaking, we should ignore the module
     105name that lives in that file.)
     107An executable is just a package that contains a special Main module,
     108which will be used as the entry point for the executable (executables
     109are, naturally, mutually exclusive of one another.) But see
     112There are a few subtleties here:
     114- The list of includes does not need to be topologically sorted, since
     115  every depended upon package manages its own dependencies.
     117- The applicative semantics of Backpack mean that Backpack can identify
     118  all includes of the same module as physically identical, so there aren't
     119  any unification problems.
     121- Because the logical name always coincides with the physical name,
     122  we can define the physical identity of modules as just their logical name.
     124=== Cabal packages as Backpack packages with holes ===
     126While the previous model essentially faithfully preserves the previous
     127semantics of Cabal packages, we might be interested in a more flexible
     128semantics, which provides some of the advantages of Backpack but doesn't
     129require Cabal packages to be rewritten.  For example, we may want to
     130give users more flexibility in how the build-dependencies of packages
     131are fulfilled: instead of treating Cabal packages as fully linked
     132Backpack packages, we want to treat them as Backpack packages with
     135This requires two main changes:
     137- The logical namespace of modules should no longer be considered a
     138  global namespace.  This makes it possible to deal with name clashes,
     139  since if two packages have a module named A, a third package that
     140  depends on both of them can rename one instance to avoid a clash.
     141  (Unfortunatey, logical and physical names no longer coincide, so
     142  things need to be renamed.)  A killer use-case for this functionality
     143  is seamless backwards compatibility packages for old versions of
     144  libraries.
     146- Dependency resolution should not be considered as some magical process
     147  by which fully-linked Backpack packages are created, but rather, a
     148  mechanism for automatically linking packages with holes (the Cabal
     149  packages) together, whose mechanism could be overridden by
     150  the user.
     152In this universe, our Backpack translation looks more like this:
     155package base-sig where
     156  Prelude :: ???
     157  Data.Bool :: ???
     158  ...
     160package abc (A, B, C) where
     161  include base-sig
     162  Internal = "Internal.hs"
     163  A = "A.hs"
     164  B = "B.hs"
     165  C = "C.hs"
     168The crux of the matter is providing the "signature package" associated
     169with base, which contains signatures for all of its modules.  In a
     170new-world order, we might expect packages to actually provide signature
     171packages (more on this later), but for the vast majority of old
     172packages, we'll have to compute this automatically, as described in
     173"Version ranges and signatures".
     175At this point in time, the design for the extended linking mechanism
     176is not well specified, however, it might include the following:
     178- Full Backpack packages can explicitly include an upstream Cabal
     179  package with holes and explicitly fill in dependencies. This
     180  is "full manual".
     182=== The next-generation of Cabal-Backpack packages ===
     184To take full advantage of Backpack features,
     186------ >8 -------
    43188More importantly, Backpack offers a tantalizing story for managing different versions of packages, alluded to in the paper, but not elaborated on. In Cabal, the version number range of a build-depends implicitly defines a "signature" which we depend on. There are a few ways this signature could be computed:
    51196By the way, this means that naively implementing the suggestion in the Backpack paper where modules provide signature files is quite dodgy, because the signature file is likely to contain too much “stuff”, and in any case, needs to be referred to in a way that is stable across versions. On the other hand, one could easily manage partitioning signatures into “latest and greatest” versus “well, this is pretty stable for most people”; differently, one could pin to a signature for a version, and as long as the package doesn’t break backwards compatibility that signature will work for a while.
     198=== Interlude: Cabal-specific features ===
     200Cabal is not just a packaging mechanism, but it also handles a number of
     201concerns that Backpack does not talk about:
     2031. Cabal packages specify some out-of-band information, such as package
     204metadata (copyright, author, etc), preprocessor dependencies, data
     205files, documentation, test suites, how to build the package (if not
     206standard), external library dependencies, Haskell executables which
     207specify programs.
     2092. Cabal packages can include C code and other foreign language code,
     210whereas Backpack is only Haskell code, and
     2123. Cabal packages have a conditional flags mechanism, by which various
     213"options" can be tweaked at compile time. (Needless to say, this can
     214cause very bad problems for modularity!)
     216In the fully-linked setting, none of these features need to be treated
     217specially.  However, to improve over Cabal, Backpack needs to accomodate
     218(2) and (3).
    53220== More features ==
    57224Scott thinks there is a way to automatically infer the necessary hs-boot signatures for recursive module bindings. His proposal is here: