In this page we collect proposals and design discussion for reorganising the packages that come with compilers, and the contents of those packages.
None of the ideas herein are claimed to belong to any particular person; many of them have been extracted from mailing list discussions.
Some of the points are GHC-specific. Please feel free to insert points specific to other compilers.
- It would be good to have a set of 'core' packages that is installed with every Haskell implementation. More on this at the PackageReorg/Rationale page.
- Forwards compatibility. Users would like their programs written against the 'core' packages to continue to work, without modification to source text or build system, after upgrading the compiler, or its packages, or switching to a different compiler.
- Backwards compatibility. Users would like to be able to take a program written against some version of the 'core' packages, and build it with an older compiler, accepting that they may have to install newer versions of the 'core' packages in order to do so.
It may not be possible to fully achieve these goals (in particular, backwards compatibility), but that does not mean we should not aim for them.
Here's a straw-man proposal
- There is a set of packages that come with every conforming Haskell implementation. Let's call these the Core Packages to avoid confusion (Bulat called these the "base packages", but that's an over-used term given that there is a package called base). The good thing about the Core Packages is that users know that they will be there, and they are consistent with each other.
- Any particular implementation may install more packages by default; for example GHC will install the template-haskell and stm packages. Let's call these the GHC Install Packages, Hugs Install Packages etc; the Install Packages are a superset of the Core Packages.
What is in the Core Packages?
The Core Packages are installed with every conforming Haskell implementation. What should be in the Core? There is a tension:
- As much as possible; which means in practice widely-used and reasonably stable packages. It is convenient for programmers to have as much as possible in a consistent bundle that is (a) known to work together, and (b) known to work on all implementations.
- As little as possible; which in practice means enough to run Cabal so that you can run the Setup files that come when downloading new packages. As Ian puts it: the less we force the implementations to come with, the quicker compilation will be when developing, the smaller Debian packages (for example) can be, the lower the disk space requirements to build GHC, the lower the time wasted when a Debian package (for example) build fails and the fewer packages we are tangling up with compiler release schedules.
There's a real choice here: Bulat wants the first option (1) and Ian wants the second (2).
Initial stab at (1):
- Some regex packages (precisely which?)
- unix or Win32. Questionable, partly because it means the Core interface becomes platform-dependent; and partly because Win32 would double the size of the Hugs distribution.
- QuickCheck (questionable)
- HUnit (questionable)
Initial stab at (2):
- filepath (?)
Bulat: I think that all regex packages should be included, and of course libs that help testing. Overall, it should be any general-purpose lib that porters accept (enlarging this set makes users' lives easier, and porters' lives harder).
About unix/Win32: these libs provide access to OS internals, not some everywhere-portable API. Moreover, other world-interfacing libs (I/O, networking) should use the APIs provided by these libs, with conditional compilation (CPP-ery) tricks, in order to provide portable APIs! The current situation, where such libs use the FFI directly, isn't ideal. The WinHugs size problem is rather technical: it includes a lot of DLLs which contain almost the same code.
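A minimal sketch of the conditional-compilation approach described above, assuming GHC's `mingw32_HOST_OS` macro is used to tell Windows from other platforms. The names `platformBackend`, `pathSeparator` and `joinPath2` are hypothetical and only illustrate the shape; a real library would import from Win32 or unix inside the two branches instead of hard-coding values.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- Name of the platform-specific package a real library would import from.
-- mingw32_HOST_OS is the macro GHC defines when compiling for Windows.
platformBackend :: String
#if defined(mingw32_HOST_OS)
platformBackend = "Win32"
#else
platformBackend = "unix"
#endif

-- One portable name; the platform-specific value is chosen at compile time.
pathSeparator :: Char
#if defined(mingw32_HOST_OS)
pathSeparator = '\\'
#else
pathSeparator = '/'
#endif

-- Hypothetical helper: join two path components with the separator above.
joinPath2 :: FilePath -> FilePath -> FilePath
joinPath2 dir file = dir ++ [pathSeparator] ++ file

main :: IO ()
main = do
  putStrLn ("backend: " ++ platformBackend)
  putStrLn (joinPath2 "src" "Main.hs")
```

Callers see only `pathSeparator` and `joinPath2`; the platform split stays inside the library.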
I agree to start with a minimal stub, and then proceed with discussing the inclusion of each library. What we need now are requirements for including a library in this set, and a lifetime support procedure. So:
Requirements for libraries to be included in the core set
- BSD-licensed, and perhaps even belonging to the Haskell community?
- portable (in the sense of compiler and OS); maybe just Haskell'-compatible?
- already widely used
- shouldn't duplicate existing core libs' functionality (?)
Exact inclusion, support and exclusion processes?
The base package
The base package is a bit special:
- Package base is rather big at the moment.
- From a user's point of view it would be nicer to give it a compiler-independent API. (A module like GHC.Exts would move to a new package ghc-base.)
Thinking of GHC alone for a moment, we could have a package ghc-base (which is pretty much the current base) and a thin wrapper package base that re-exposes some, but not all, of what ghc-base exposes. To support this re-exposing, we need a small fix to both GHC and Cabal, but one that is independently desirable.
Similarly, Hugs could build hugs-base from the same source code, by using CPP-ery, exactly as now. The thin base wrapper package would not change.
To make base smaller, we could remove stuff, and put it into separate packages. But be careful: packages cannot be cyclic, so anything that is moved out can't be used in base. Some chunks that would currently be easy to split off are:
- Data.ByteString.* (plus future packed Char strings)
- Control.Applicative (?), Data.Foldable, Data.Monoid (?), Data.Traversable, Data.Graph, Data.IntMap, Data.IntSet, Data.Map, Data.Sequence, Data.Set, Data.Tree
Some other things, such as arrays and concurrency, have nothing else depending on them, but are so closely coupled with GHC's internals that extracting them would require exposing these internals in the interface of base.
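The "packages cannot be cyclic" constraint can be checked mechanically. The sketch below is one way to do it, using `stronglyConnComp` from `Data.Graph` (in the containers package). The package names and dependency lists are invented for illustration and do not describe any actual split.

```haskell
module Main (main) where

import Data.Graph (SCC (AcyclicSCC), flattenSCC, stronglyConnComp)

-- A package name paired with the packages it depends on.
type Deps = [(String, [String])]

-- Return every group of packages that depend on each other cyclically.
cycles :: Deps -> [[String]]
cycles deps = [ flattenSCC scc | scc <- sccs, isCyclic scc ]
  where
    sccs = stronglyConnComp [ (pkg, pkg, ds) | (pkg, ds) <- deps ]
    isCyclic (AcyclicSCC _) = False
    isCyclic _              = True

-- Hypothetical layout: two chunks moved out of base, base uses neither.
goodSplit :: Deps
goodSplit =
  [ ("base", [])
  , ("containers", ["base"])
  , ("bytestring", ["base"])
  ]

-- Same layout, except base still uses something that was moved out.
badSplit :: Deps
badSplit =
  [ ("base", ["containers"])
  , ("containers", ["base"])
  , ("bytestring", ["base"])
  ]

main :: IO ()
main = do
  print (cycles goodSplit)  -- no cyclic groups expected
  print (cycles badSplit)   -- one group containing base and containers
```

A split is only viable if `cycles` returns an empty list for the resulting dependency graph.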
Bulat: my ArrayRef library contains a portable implementation of arrays. There is only a thin ghc/hugs-specific layer, which should be provided by the ghcbase/hugsbase libs. Except for the MPTC problem (the IArray/MArray classes have multiple parameters), this library should be easily portable to any other Haskell compiler.
See also BaseSplit.
Other non-core packages would probably have their own existence. That is, they don't come with an implementation; instead you use cabal-get, or some other mechanism, such as your OS's package manager. Some of these currently come with GHC, and would no longer do so.
Bulat: I propose to unbundle only the graphics/sound libs, because these solve particular problems and tend to be large, non-portable (?), and some are just legacy ones, like ObjectIO. We should keep everything small and general-purpose, including HUnit, arrows, fgl, html and xhtml, and include even more: ByteString, regex-*, Edison, Filepath, MissingH, NewBinary, QuickCheck, monads.
We should separate out package-specific tests, which should be part of the repository for each package. Currently they are all squashed together into the testsuite repository.
Notes about GHC
The GHC Boot Packages are exactly the libraries required to build GHC. That shouldn't be the criterion for the core packages.
One reason we do this is that it means every GHC installation can build GHC. Less configure-script hacking. (NB: even today if you upgrade any of these packages, and then build GHC, the build might fail because the CPP-ery in GHC's sources uses only the version number of GHC, not the version number of the package.)
Still, for convenience we'd probably arrange that the GHC Install Packages included all the GHC Boot Packages.
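To illustrate the caveat about version checks: the sketch below contrasts a test keyed on the compiler version (`__GLASGOW_HASKELL__`) with one keyed on the package version (`MIN_VERSION_base`, a macro that Cabal generates). The thresholds and the names `byCompiler` and `byPackage` are made up for illustration; they are not taken from GHC's sources.

```haskell
{-# LANGUAGE CPP #-}
module Main (main) where

-- If no MIN_VERSION_base macro has been supplied, define a fallback that
-- always answers "false" so this file still compiles on its own.
#ifndef MIN_VERSION_base
#define MIN_VERSION_base(a,b,c) 0
#endif

-- Decision keyed on the compiler version: this is the fragile style,
-- because it silently assumes which base ships with which compiler.
byCompiler :: String
#if __GLASGOW_HASKELL__ >= 800
byCompiler = "new code path"
#else
byCompiler = "old code path"
#endif

-- Decision keyed on the version of the package actually being used.
byPackage :: String
#if MIN_VERSION_base(4,9,0)
byPackage = "new code path"
#else
byPackage = "old code path"
#endif

main :: IO ()
main = do
  putStrLn ("chosen by compiler version: " ++ byCompiler)
  putStrLn ("chosen by package version:  " ++ byPackage)
```

If a package is upgraded independently of the compiler, the two checks can disagree, and only the second one tracks what is actually installed.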
Every GHC installation must include the packages base, ghc-prim, integer and template-haskell, else GHC itself will not work. (In fact haskell98 is also required, but only because it is linked by default.)
So GHC's Install Packages would be the Core Packages plus
You can upgrade any package, including base, after installing GHC. However, you need to take care. You must not change a number of things that GHC "knows about". In particular, these things must not change:
- Defining module
GHC knows even more about some things, where you must not change:
- Type signature
- For data types, the names, types, and order of the constructors
The latter group is confined to packages base and template-haskell.
(Note: a few other packages are used by tests in GHC's test suite, currently: mtl, QuickCheck. We should probably eliminate the mtl dependency; but QuickCheck is used as part of the test infrastructure itself, so we'll make it a GHC Boot Package.)
Notes about Hugs
Recent distributions of Hugs come in two sizes, jumbo and minimal. Minimal distributions include only the packages base, haskell98 and Cabal. (Hugs includes another package hugsbase containing interfaces to Hugs primitives.) The requirements for this set are to:
- run Haskell 98 programs
- allow packages to be added and upgraded using Cabal
(Currently cpphs is a Haskell 98 program, so the latter implies the former.)
It should be possible to upgrade even the core packages using Cabal.