Changes between Version 2 and Version 3 of Backpack


Ignore:
Timestamp:
Jun 6, 2014 12:56:52 AM (12 months ago)
Author:
ezyang
Comment:

updates

Legend:

Unmodified
Added
Removed
Modified
  • Backpack

    v2 v3  
    1 [http://plv.mpi-sws.org/backpack/ Backpack] is a proposal for retrofitting Haskell with an applicative, mix-in module system. The theory of Backpack is developed in the paper and its accompanying technical appendix; the purpose of this wikipage is to record some of the more practical implementation considerations. 
    2  
    3 == Backpack versus Cabal == 
    4  
    5 A Backpack package is roughly equivalent to a Cabal package.  However, there are some differences.  In Backpack, “module definitions” are unnamed; they are only given a name by module assignment within a package. So, strictly speaking, old style module definitions like `module A where ...` are undefined under Backpack.  Instead, the module layout of a Cabal package constitutes an implicit definition of a Backpack package, with `other-modules` thinned out. So we roughly translate a Cabal file specification (modulo versions): 
     1[http://plv.mpi-sws.org/backpack/ Backpack] is a proposal for 
     2retrofitting Haskell with an applicative, mix-in module system. The 
     3theory of Backpack is developed in the paper and its accompanying 
     4technical appendix; the purpose of this wikipage is to record some of 
     5the more practical implementation considerations. 
     6 
     7== Version ranges and signatures == 
     8 
     9In the current Haskell package ecosystem, version ranges on 
     10build-depends are a way of indicating what libraries a package may build 
     11with.  While the [http://www.haskell.org/haskellwiki/Package_versioning_policy PVP] is 
     12intended to provide a way for library authors to communicate when API 
     13changes occur.  However, in practice, downstream authors find it difficult 
     14to correctly choose appropriate version ranges for their software (often 
     15providing a bound that is far too low, or optimistically asserting that 
     16they will be compatible against all future versions of a library). 
     17 
     18In Backpack, the use of package signatures subsumes this application of 
     19version numbers. (Version numbers are still useful, in case an 
     20application needs to explicitly exclude a buggy version of a package). 
     21In the Backpack papers, these signatures are explicitly recorded. 
     22 
     23An important caveat is that for backwards-compatibility, there will 
     24often be many packages which sport version ranges, but do not have 
     25explicit signatures for their holes.  Furthermore, a conversion from 
     26version ranges to signatures may be useful for bootstrapping explicit 
     27signatures going forwards.  Assuming that, given a package, it is 
     28possible to compute its signature, there are a few possible ways to go 
     29about computing signatures based on version ranges: 
     30 
     31- We could simply take the latest acceptable version of a package 
     32  and use its signature. 
     33 
     34- We could compute the “greatest common signature” for the specified 
     35  version range in build-depends. This signature is the widest signature 
     36  for which all of the versions are compatible with.  Depending on whether 
     37  or not the build-depends range is correct, the signature could be buggy. 
     38  (ezyang: When I merge two type signatures together, does Backpack require 
     39  them to be identical? I think so, but it's not obvious from the paper. 
     40  If this is the case, it could be annoying if someone generalized a 
     41  function--but I don't think there is anything we can do here.) 
     42  Note that while a signature built this way is guaranteed to be 
     43  backwards-compatible, it may not be forwards-compatible, in the sense 
     44  that a future module might not typecheck against the signature, 
     45  but still be linkable against our library. 
     46 
     47- For maximum backwards and forwards compatibility, the greatest common 
     48  signature could be refined by finding the thinnest signature with 
     49  which our package type checks with. This is the “ground truth” with 
     50  regards to what our package relies on.  If this could be completely 
     51  automatically calculated, version ranges in build-depends would be 
     52  completely unnecessary; however, it is not possible to infer this 
     53  information just from usage sites--thus it may be useful to have 
     54  some "known to build against" versions to start the process (however, 
     55  a full version range may not be necessary). 
     56 
     57== Cabal versus Backpack == 
     58 
     59At a first glance, a Cabal package looks roughly equivalent to a 
     60Backpack package.  However, the precise details of this correspondence 
     61are tricky: there a few choices to be made that affect the design space. 
     62 
     63As a running example, here is a simple Cabal file specifying a 
     64package: 
    665 
    766{{{ 
     
    1271}}} 
    1372 
    14 into a Backpack package: 
     73The package is associated with a directory of named modules 
     74corresponding to other-modules and exposed-modules. 
     75 
     76=== Cabal packages as a fully linked Backpack package === 
     77 
     78A simple model is to treat a Cabal package as a fully linked Backpack 
     79package.  Build dependencies are unambiguous, because to Backpack, there 
     80is only one implementation of any given package (selected by the 
     81dependency solver).  This is roughly how Cabal packages work today, and 
     82this encoding shares many of the same downsides (two versions of the 
     83same package cannot coexist in the same unit, build-dependency 
     84resolution has little relationship to whether or not a package can 
     85actually build against a given version, etc). 
     86 
     87The basic story is that named modules in a Cabal package implicitly 
     88defines the module binding (where the module implementation is unnamed) 
     89in the Backpack package, with non-exported modules thinned out.  Holes 
     90are not used: the bindings of a Backpack package are conceptually part 
     91of a global namespace, and any build dependencies are simply dumped into 
     92this namespace (using include).  Here is the translation of our example: 
    1593 
    1694{{{ 
     
    23101}}} 
    24102 
    25 (Here the quoted strings indicate file inclusion, ignoring the listed module name.) A number of Backpack features are exercised: mix-in inclusion is used to represent build-dependencies and hidden modules (other-modules) are enforced using thinning (Section 2.4 of the paper). An important detail is that the module lists must be topologically sorted 
    26  
    27 === Cabal-specific features === 
    28  
    29 Cabal is not just a packaging mechanism, but it also handles a number of concerns that Backpack does not talk about: 
    30  
    31 1. Cabal packages can include C code and other foreign language code, whereas Backpack is only Haskell code, 
    32  
    33 2. Cabal packages can specify a custom build process, which Backpack says nothing about, and 
    34  
    35 3. Cabal packages have a conditional flags mechanism, by which various "options" can be tweaked at compile time. (Needless to say, this can cause very bad problems for modularity!) In Backpack, physical module identity should be different over conditional flags, so that modules which are compiled differently are considered differently as well. 
    36  
    37 === Backpack's improvements over Cabal === 
    38  
    39 Without using any of Backpack's support for separate modular development, Backpack already delivers some improvements over Cabal. 
    40  
    41 Backpack is more permissive/expressive with accidental module name clashes than Cabal is, precisely because of the logical-physical distinction in names. If packages P and Q both have a module named A, then in any third package R that depends on both P and Q, you can simply rename, e.g., P's A to PA in order to avoid that clash.  (On the other hand, you can also use renaming on inclusion to get two differently-named modules to link together.)  Renaming is briefly described in Section 2.4 of the paper. 
     103Where the quoted strings indicate the file that a physical moudle 
     104inclusion lives in. (Strictly speaking, we should ignore the module 
     105name that lives in that file.) 
     106 
     107An executable is just a package that contains a special Main module, 
     108which will be used as the entry point for the executable (executables 
     109are, naturally, mutually exclusive of one another.) But see 
     110https://github.com/haskell/cabal/issues/1847 
     111 
     112There are a few subtleties here: 
     113 
     114- The list of includes does not need to be topologically sorted, since 
     115  every depended upon package manages its own dependencies. 
     116 
     117- The applicative semantics of Backpack mean that Backpack can identify 
     118  all includes of the same module as physically identical, so there aren't 
     119  any unification problems. 
     120 
     121- Because the logical name always coincides with the physical name, 
     122  we can define the physical identity of modules as just their logical name. 
     123 
     124=== Cabal packages as Backpack packages with holes === 
     125 
     126While the previous model essentially faithfully preserves the previous 
     127semantics of Cabal packages, we might be interested in a more flexible 
     128semantics, which provides some of the advantages of Backpack but doesn't 
     129require Cabal packages to be rewritten.  For example, we may want to 
     130give users more flexibility in how the build-dependencies of packages 
     131are fulfilled: instead of treating Cabal packages as fully linked 
     132Backpack packages, we want to treat them as Backpack packages with 
     133holes. 
     134 
     135This requires two main changes: 
     136 
     137- The logical namespace of modules should no longer be considered a 
     138  global namespace.  This makes it possible to deal with name clashes, 
     139  since if two packages have a module named A, a third package that 
     140  depends on both of them can rename one instance to avoid a clash. 
     141  (Unfortunatey, logical and physical names no longer coincide, so 
     142  things need to be renamed.)  A killer use-case for this functionality 
     143  is seamless backwards compatibility packages for old versions of 
     144  libraries. 
     145 
     146- Dependency resolution should not be considered as some magical process 
     147  by which fully-linked Backpack packages are created, but rather, a 
     148  mechanism for automatically linking packages with holes (the Cabal 
     149  packages) together, whose mechanism could be overridden by 
     150  the user. 
     151 
     152In this universe, our Backpack translation looks more like this: 
     153 
     154{{{ 
     155package base-sig where 
     156  Prelude :: ??? 
     157  Data.Bool :: ??? 
     158  ... 
     159 
     160package abc (A, B, C) where 
     161  include base-sig 
     162  Internal = "Internal.hs" 
     163  A = "A.hs" 
     164  B = "B.hs" 
     165  C = "C.hs" 
     166}}} 
     167 
     168The crux of the matter is providing the "signature package" associated 
     169with base, which contains signatures for all of its modules.  In a 
     170new-world order, we might expect packages to actually provide signature 
     171packages (more on this later), but for the vast majority of old 
     172packages, we'll have to compute this automatically, as described in 
     173"Version ranges and signatures". 
     174 
     175At this point in time, the design for the extended linking mechanism 
     176is not well specified, however, it might include the following: 
     177 
     178- Full Backpack packages can explicitly include an upstream Cabal 
     179  package with holes and explicitly fill in dependencies. This 
     180  is "full manual". 
     181 
     182=== The next-generation of Cabal-Backpack packages === 
     183 
     184To take full advantage of Backpack features, 
     185 
     186------ >8 ------- 
    42187 
    43188More importantly, Backpack offers a tantalizing story for managing different versions of packages, alluded to in the paper, but not elaborated on. In Cabal, the version number range of a build-depends implicitly defines a "signature" which we depend on. There are a few ways this signature could be computed: 
     
    51196By the way, this means that naively implementing the suggestion in the Backpack paper where modules provide signature files is quite dodgy, because the signature file is likely to contain too much “stuff”, and in any case, needs to be referred to in a way that is stable across versions. On the other hand, one could easily manage partitioning signatures into “latest and greatest” versus “well, this is pretty stable for most people”; differently, one could pin to a signature for a version, and as long as the package doesn’t break backwards compatibility that signature will work for a while. 
    52197 
     198=== Interlude: Cabal-specific features === 
     199 
     200Cabal is not just a packaging mechanism, but it also handles a number of 
     201concerns that Backpack does not talk about: 
     202 
     2031. Cabal packages specify some out-of-band information, such as package 
     204metadata (copyright, author, etc), preprocessor dependencies, data 
     205files, documentation, test suites, how to build the package (if not 
     206standard), external library dependencies, Haskell executables which 
     207specify programs. 
     208 
     2092. Cabal packages can include C code and other foreign language code, 
     210whereas Backpack is only Haskell code, and 
     211 
     2123. Cabal packages have a conditional flags mechanism, by which various 
     213"options" can be tweaked at compile time. (Needless to say, this can 
     214cause very bad problems for modularity!) 
     215 
     216In the fully-linked setting, none of these features need to be treated 
     217specially.  However, to improve over Cabal, Backpack needs to accomodate 
     218(2) and (3). 
     219 
    53220== More features == 
    54221 
     
    56223 
    57224Scott thinks there is a way to automatically infer the necessary hs-boot signatures for recursive module bindings. His proposal is here: http://www.reddit.com/r/haskell/comments/1id0p7/backpack_retrofitting_haskell_with_interfaces/cb42m8l 
     225 
     226 
     227---- 
     228 
     229https://ghc.haskell.org/trac/ghc/wiki/Commentary/GSoCMultipleInstances