wiki:SplitBase

Version 18 (modified by nomeata, 14 months ago) (diff)

--

Splitting base

In a thread on glasglow-haskell-users in February some ideas about splitting base in smaller components were floating around. This wiki page tries to assemble ideas on how to re-group the modules.

This has been discussed before, e.g. in [2008 http://www.haskell.org/pipermail/libraries/2008-August/010543.html].

Goals

Structural changes to the base package can be attempted towards the following goals:

To allow changes to internals without forcing a version-bump on ‘base’, on which every package depends

SPJ: But that goal needs a bit of unpacking. Suppose we divided base into six, base1, base2, base3, etc, but each was a vertical silo and every other package depended on all six. Then nothing would be gained; bumping any of them would cause a ripple of bumps down the line.

To allow packages to be explictly about what they need

A library that does not use the IO monad could communicate that just by not depending on some base-io package. Similar with the Foreign Function Interface or unsafe operations.

To allow alternative implementations/targets

A Haskell-to-Javascript compiler will not support File IO, or maybe not even IO at all. It would be desirable such an implementation has a chance to at least provide a complete and API compatible base-pure package, and that one can hence reasonably assume that packages and libraries depending only on base-pure will indeed work without modification. This might be subsumed by fulfilling the previous goal.

More appropriate string types in IO

Johan would like to have text Handles use the Text type and binary Handles use the ByteString? type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer.

Avoid code copies

Johan says: The I/O manager currently has a copy of IntMap? inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication. 

Split base into as FEW packages as possible, consistent with meeting the other goals

In contrast to the non-goal of splitting base as much as possible. Johan points out, a split now could paint us into a corner later, so we should not gratuitously split things up.

Non-Obvious interdependencies

This is a list of interdependencies between seemingly unrelated parts that need to be taken into consideration:

  • class Monad mentions String, hence pulling Char
  • class Monad mentions error and Data.Int requires throw DivideByZero, hence pulling in exceptions
  • Exceptions pull in Typeable
  • Typeable pulls in GHC.Fingerprint
  • GHC.Fingerprint pulls in Foreign and IO (but could be replaced by a pure implementation)
  • The Monad instance of IO calls failIO, which creates an IOException, which has fields for handles and devices, and hence pulls in some Foreign stuff and some file-related IO, preventing the creation of a clean base-io package. There exists a somewhat backwards compatible work-around.

Other issues

  • Some names of base are hardcoded in GHC and hence cannot be moved to a different package name without changes in GHC. This includes:
    • The Num constraint on polymorphic literals. Can be avoided by writing fromIntegral 0 instead of 0.
    • Similar, the [x..y] syntax generates a base:GHC.Enum.Enum constraint, RebindableSyntax does not help (GHC bug?)
    • StablePtr, as used in GHC.Stable
    • Typeable, Show when used in deriving. Can probably be avoided by hand-writing instances. Read can probably move completely out.
    • error has its type wired in GHC when in package base; This is used in a hack in GHC/Err.hs-boot. Work-around: Import GHC.Types in GHC/Err.lhs-boot
    • The Monad constraint on do-notation expects the definition to live in base. RebindableSyntax helps, but requires to define a local ifThenElse function.
  • The ST Monad can (and should) be provided independently of IO, but currently functions like unsafeIOToST are provided in the Control.Monad.ST namespace.

First attempt

Joachim has started a first attempt to pull stuff out of the bottom of base. See https://github.com/nomeata/packages-base/blob/base-split/README.md for an overview of progress and a description of changes. Use git clone git://github.com/nomeata/packages-base.git; git checkout base-split to experiment. This *does* try to split out as many packages as possible, just to see what is possible.