Changes between Version 2 and Version 3 of Commentary/Packages

Jul 10, 2009 10:49:46 AM (5 years ago)



  • Commentary/Packages

    v2 v3  
    44 44 `PackageId`:: 
    45     Inside GHC, we use the type `PackageId`, which is a `FastString` representation of `InstalledPackageId`. 
    46     The (Z-encoding of) `PackageId` prefixes each external symbol in the generated code, so that the modules of one package do 
    47     not clash with those of another package, even when the module names overlap. 
    49 The tools do not currently support having multiple packages with the same name and version.  When re-installing an existing package, the new package should have a different `InstalledPackageId` from the previous version, even if the `PackageIdentifiers` are the same.  In this way, we can detect when a package is broken because one of its dependencies has been recompiled and re-installed. 
     45    Inside GHC, we use the type `PackageId`, which is a `FastString`.  The (Z-encoding of) `PackageId` prefixes each 
     46    external symbol in the generated code, so that the modules of one package do not clash with those of another package, 
     47    even when the module names overlap. 
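
A minimal sketch of how a Z-encoded `PackageId` prefix keeps symbols from different packages apart. The encoding table is a simplified subset of GHC's Z-encoding covering only characters common in package identifiers. The `<package>_<module>_<entity>` layout, the helper names and the sample packages are assumptions made for illustration, not GHC's actual code.

```python
# Simplified subset of GHC's Z-encoding (assumption: only these characters
# need escaping in the identifiers used below; real GHC handles many more,
# including a special case for a leading digit).
Z_ENCODING = {
    "z": "zz",
    "Z": "ZZ",
    ".": "zi",
    "-": "zm",
    "_": "zu",
}


def z_encode(text: str) -> str:
    """Z-encode `text`, leaving other ASCII letters and digits unchanged."""
    out = []
    for ch in text:
        if ch in Z_ENCODING:
            out.append(Z_ENCODING[ch])
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            raise ValueError(f"character {ch!r} is outside this simplified table")
    return "".join(out)


def external_symbol(package_id: str, module: str, entity: str) -> str:
    """Hypothetical symbol layout: each part Z-encoded, joined with '_'."""
    return "_".join(z_encode(part) for part in (package_id, module, entity))


# Two hypothetical packages that both export a module named Data.Map.
print(external_symbol("foo-1.0", "Data.Map", "insert"))
print(external_symbol("bar-2.3", "Data.Map", "insert"))
```

Because `_` itself is encoded as `zu`, the separator between the package, module and entity parts stays unambiguous.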
    51 49 == Design constraints == 
    53  1. We want RecompilationAvoidance to work.  So that means symbol names should not contain any information that varies too often, such as the ABI hash of the module, or the package. 
     51 1. We want [wiki:Commentary/Compiler/RecompilationAvoidance] to work.  So that means symbol names should not contain any information that varies too often, such as the ABI hash of the module or package.  The ABI of an entity should depend only on its definition, the definitions of the things it depends on, and compiler settings. 
    55  2. We want to be able to compile a package that is compatible with another package; i.e. exports the same ABI.  Right now it isn't possible to do this, but we hope to be able to do it in the future, and we should design the system with that in mind. 
     53 2. We want to be able to detect ABI incompatibility.  If a package is recompiled and installed over the top of the old one, and the new version is ABI-incompatible with the old one, then packages that depended on the old version should be detectably broken using the tools. 
    57  3. When a package is recompiled and installed, packages that depended on the old version should now be detectably broken (unless the newly compiled version is really compatible with the old one). 
     55 3. ABI compatibility: 
     56    * We want repeatable compilations.  Compiling a package with the same inputs should yield the same outputs. 
     57    * Furthermore, we want to be able to make compiled packages that expose an ABI that is compatible (e.g. a superset) 
     58      of an existing compiled package. 
     59    * Modular upgrades: we want to be able to upgrade an existing package without recompiling everything that depends 
     60      on it, by ensuring that the replacement is ABI-compatible. 
     61    * Shared library upgrades.  We want to be able to substitute a new ABI-compatible shared library for an old one, and all the existing binaries linked against the old version continue to work. 
     62    * ABI compatibility is dependent on GHC too; changes to the compiler and RTS can introduce ABI incompatibilities.  We 
     63      guarantee to only make ABI incompatible changes in a major release of GHC.  Between major releases, ABI compatibility
     64      is ensured; so for example it should be possible to use GHC 6.12.2 with the packages that came with GHC 6.12.1. 
    60 (3) means that dependencies in the package database should mention something unique about a package installation that changes when the package is installed.  However, (1) means that we don't want to put such unique things in symbol names. 
     67 Right now, we do not have repeatable compilations, so while we cannot do (3), we keep it in mind. 
     69 == The Plan == 
     71 We need to talk about some more package Ids: 
     73  * `InstalledPackageId`: the identifier of a package in the package database.  The `InstalledPackageId` is just a string, 
     74    but it may contain the package name and API version for documentation. 
     75  * `PackageSymbolId`: the symbol prefix used in compiled code. 
     76  * `PackageLibId`: the package Id placed in library files (static and shared). 
     78 === Detecting ABI incompatibility === 
     80  * in the package database, dependencies specify the `InstalledPackageId`. 
     82  * The package database will contain at most one instance of a given package/version combination.  The tools 
     83    are not currently able to cope with multiple instances (e.g. GHC's -package flag selects by name/version). 
     85  * If, say, package P-1.0 is recompiled and re-installed, the new instance of the package will almost 
     86    certainly have an incompatible ABI from the previous version.  We give the new package a distinct 
     87    `InstalledPackageId`, so that packages that depend on the old P-1.0 will now be broken. 
     89  * `PackageSymbolId`: We do not use the `InstalledPackageId` as the symbol prefix in the compiled code, because  
     90    that interacts badly with [wiki:Commentary/Compiler/RecompilationAvoidance].  Every time we pick a 
     91    new unique `InstalledPackageId` (e.g. when reconfiguring the package), we would have to recompile 
     92    the entire package.  Hence, the `PackageSymbolId` is picked deterministically for the package, e.g. 
     93    it can be just the package name/version. 
     95  * `PackageLibId`: we do want to put the `InstalledPackageId` in the name of a library file, however.  This allows
     96    ABI incompatibility to be detected by the linker.  This is important for shared libraries too: we
     97    want an ABI-incompatible shared library upgrade to be detected by the dynamic linker.  Hence, 
     98    `PackageLibId` == `InstalledPackageId`. 
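
The detection scheme in this list can be sketched as a small model of the package database. This is an illustration only: the `InstalledPackage` record, the `broken_packages` helper and the sample identifiers are hypothetical, and treating a package as broken when one of its dependencies is itself broken is an assumption that goes slightly beyond what the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstalledPackage:
    installed_id: str   # InstalledPackageId: fresh for every installation
    symbol_id: str      # PackageSymbolId: deterministic, e.g. name/version
    depends: tuple = () # InstalledPackageIds of the dependencies

    @property
    def lib_id(self) -> str:
        # PackageLibId == InstalledPackageId, as described above.
        return self.installed_id


def broken_packages(database):
    """Return the sorted ids of packages whose dependencies cannot be met.

    A package is broken if it depends on an InstalledPackageId that is no
    longer in the database, or (assumption) on a package that is broken.
    """
    present = {pkg.installed_id for pkg in database}
    broken = set()
    changed = True
    while changed:
        changed = False
        for pkg in database:
            if pkg.installed_id in broken:
                continue
            if any(dep not in present or dep in broken for dep in pkg.depends):
                broken.add(pkg.installed_id)
                changed = True
    return sorted(broken)


# Hypothetical database: Q depends on P-1.0, and R depends on Q.
p_old = InstalledPackage("P-1.0-aaa", "P-1.0")
q = InstalledPackage("Q-2.0-ccc", "Q-2.0", ("P-1.0-aaa",))
r = InstalledPackage("R-0.1-ddd", "R-0.1", ("Q-2.0-ccc",))
before = [p_old, q, r]

# Re-installing P-1.0 replaces the old instance and picks a new
# InstalledPackageId, while the PackageSymbolId stays the same.
p_new = InstalledPackage("P-1.0-bbb", "P-1.0")
after = [p_new, q, r]

print(broken_packages(before))
print(broken_packages(after))
```

The symbol prefix is unchanged across the re-install, so recompilation avoidance is unaffected; only the database entries and library names reveal the mismatch.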
     100 === Allowing ABI compatibility === 
     102 * The simplest scheme is to have an identifier for each distinct ABI, e.g. a pair of the package name and an integer 
     103   that is incremented each time an ABI change of any kind is made to the package.  The ABI identifier (`PackageAbiId`)
     104   is specified by the package, and is used as the `PackageSymbolId`.  Since packages with the same `PackageAbiId`
     105   are ABI-compatible, the `PackageLibId` can be the same as the `PackageSymbolId`. 
     107 * The previous scheme does not allow ABI-compatible changes (e.g. ABI extension) to be made.  Hence, we could 
     108   generalise it to a major/minor versioning scheme. 
     109   * the ABI major version is as before, the package name + an integer.  This is also the `PackageSymbolId`. 
     110   * the ABI minor version is an integer that is incremented each time the ABI is extended in a compatible way. 
     111   * package dependencies in the database specify the major+minor version they require.  They may be satisfied by  
     112     a greater minor version. 
     113   * `PackageLibId` is the major version.  In the case of shared libraries, we may name the library using the 
     114     major + minor versions, with a symbolic link from the major version to major+minor. 
     115   * the shared library `SONAME` is the major version. 
     117 * The previous scheme only allows ABI-compatible changes to be made in a linear sequence.  If we want a tree-shaped 
     118   compatibility structure, then something more complex is needed (ToDo). 
     120 * The previous schemes only allow compatible ABI changes to be made.  If we want to allow incompatible changes to be 
     121   made, then we need something like ELF's symbol versioning.  This is probably overkill, since we will be making 
     122   incompatible ABI changes in the compiler and RTS at regular intervals anyway.  ABI compatibility is more important 
     123   between major releases of the compiler.
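
The major/minor ABI versioning scheme described in this section can be sketched as follows. The `AbiVersion` type, the helper names and the `lib<name>-<major>.so` file naming are assumptions chosen for illustration; the text only fixes that the major version serves as `PackageSymbolId`, `PackageLibId` and `SONAME`, and that a dependency may be satisfied by a greater minor version.

```python
from typing import NamedTuple


class AbiVersion(NamedTuple):
    package: str  # package name
    major: int    # incremented on every incompatible ABI change
    minor: int    # incremented on every compatible ABI extension


def symbol_id(abi: AbiVersion) -> str:
    """The ABI major version: package name plus an integer."""
    return f"{abi.package}-{abi.major}"


def satisfies(installed: AbiVersion, required: AbiVersion) -> bool:
    """A dependency is met by the same major version and any minor >= required."""
    return (
        installed.package == required.package
        and installed.major == required.major
        and installed.minor >= required.minor
    )


def shared_library_names(abi: AbiVersion) -> dict:
    """Hypothetical file layout: the major-version name links to major+minor."""
    soname = f"lib{symbol_id(abi)}.so"
    real_file = f"lib{symbol_id(abi)}.{abi.minor}.so"
    return {"soname": soname, "file": real_file, "symlink": (soname, real_file)}


required = AbiVersion("P", 3, 1)
print(satisfies(AbiVersion("P", 3, 2), required))  # compatible extension
print(satisfies(AbiVersion("P", 3, 0), required))  # minor version too old
print(satisfies(AbiVersion("P", 4, 5), required))  # different major version
print(shared_library_names(AbiVersion("P", 3, 2)))
```

Since the comparison is a single integer per major version, this sketch covers only the linear sequence of compatible changes; a tree-shaped compatibility structure would need a different check.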