Version 20 (modified by nomeata, 2 years ago)
In a thread on glasgow-haskell-users in February, some ideas about splitting base into smaller components were floated. This wiki page tries to collect ideas on how to regroup the modules.
This has been discussed before, e.g. in 2008.
Structural changes to the base package can be attempted towards the following goals:
To allow changes to internals without forcing a version-bump on ‘base’, on which every package depends
SPJ: But that goal needs a bit of unpacking. Suppose we divided base into six, base1, base2, base3, etc, but each was a vertical silo and every other package depended on all six. Then nothing would be gained; bumping any of them would cause a ripple of bumps down the line.
To allow packages to be explicit about what they need
A library that does not use the IO monad could communicate that just by not depending on some base-io package. The same goes for the Foreign Function Interface or unsafe operations.
To allow alternative implementations/targets
More appropriate string types in IO
Johan would like to have text Handles use the Text type and binary Handles use the ByteString type. Right now we have this somewhat awkward setup where the I/O APIs are spread out and bundled with pure types. Splitting base would let us fix this and write a better I/O layer.
Avoid code copies
Johan says: The I/O manager currently has a copy of IntMap inside its implementation because base cannot use containers. Splitting base would let us get rid of this code duplication.
Right now, if a package depends on a specific version of base, there is no way to compile it with a GHC that provides a different version of base.
After the split, hopefully, many subpackages of base will lose their «magic» status and become installable via cabal.
Split base into as FEW packages as possible, consistent with meeting the other goals
This is in contrast to the non-goal of splitting base as much as possible. As Johan points out, a split now could paint us into a corner later, so we should not split things up gratuitously.
Large base, re-exporting API packages
- Little to no change to the actual code in base
- Easier to define the APIs as desired, i.e. focused and stable, without worrying about implementation-imposed cycles
- No need to include internal modules in the API packages
- Alternative compilers/targets can provide these APIs with totally independent implementations
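To make the re-exporting approach concrete, here is a minimal sketch of what a module in such an API package could look like. The module name Data.List.Pure and the choice of exported functions are made up for illustration; nothing like it exists in base. The point is only that the API module contains no code of its own and simply selects names from the existing implementation.

```haskell
-- Hypothetical API module: the name Data.List.Pure and the export list
-- are assumptions for illustration, not part of any real package.
module Data.List.Pure
  ( map
  , filter
  , foldr
  , length
  ) where

-- The implementation stays in the large base package; this module only
-- re-exports a focused subset of it.
import Data.List (map, filter, foldr, length)
```

An alternative compiler could ship a module with the same name and export list backed by an entirely different implementation, and client code importing Data.List.Pure would not need to change.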
Actual base split
- Forces disentanglement of the implementation (e.g. an error function that does not depend on IOError)
- Hence further development may be easier (according to Ian)
- Some base-foo package can use other libraries like containers (IntMap issue)
- Alternative compilers/targets may only have to reimplement some of the base-* packages.
- Possibly fewer modules in “magic” packages that cannot be installed via cabal.
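As a rough illustration of the IntMap point, the sketch below uses Data.IntMap.Strict from containers for the kind of table an I/O manager might keep, keyed by file descriptor. The Fd and Callbacks types and the register function are hypothetical stand-ins and are not taken from the actual I/O manager code; after a real split, a base-* package could import containers like this instead of carrying its own copy.

```haskell
import qualified Data.IntMap.Strict as IM

-- Hypothetical stand-ins: a file descriptor is modelled as a plain Int,
-- and a "callback" as a descriptive String.
type Fd = Int
type Callbacks = IM.IntMap String

-- Register a callback description for a file descriptor.
register :: Fd -> String -> Callbacks -> Callbacks
register = IM.insert

main :: IO ()
main = do
  let table = register 4 "socket reader" (register 1 "stdout writer" IM.empty)
  print (IM.lookup 4 table)  -- Just "socket reader"
  print (IM.lookup 7 table)  -- Nothing
```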
This is a list of interdependencies between seemingly unrelated parts that need to be taken into consideration:
- class Monad mentions String, hence pulling in Char
- class Monad mentions error and Data.Int requires throw DivideByZero, hence pulling in exceptions
- Exceptions pull in Typeable
- Typeable pulls in GHC.Fingerprint
- GHC.Fingerprint pulls in Foreign and IO (but could be replaced by a pure implementation)
- The Monad instance of IO calls failIO, which creates an IOException, which has fields for handles and devices, and hence pulls in some Foreign stuff and some file-related IO, preventing the creation of a clean base-io package. There exists a somewhat backwards compatible work-around.
- Some names of base are hardcoded in GHC and hence cannot be moved to a different package name without changes in GHC. This includes:
- The Num constraint on polymorphic literals. Can be avoided by writing fromIntegral 0 instead of 0.
- Similarly, the [x..y] syntax generates a base:GHC.Enum.Enum constraint, and RebindableSyntax does not help (GHC bug?)
- StablePtr, as used in GHC.Stable
- Typeable and Show, when used in deriving clauses. Can probably be avoided by hand-writing instances. Read can probably move completely out.
- error has its type wired into GHC when it lives in package base; this is used in a hack in GHC/Err.hs-boot. Work-around: Import GHC.Types in GHC/Err.lhs-boot
- The Monad constraint on do-notation expects the definition to live in base. RebindableSyntax helps, but requires defining a local ifThenElse function.
- The ST Monad can (and should) be provided independently of IO, but currently functions like unsafeIOToST are provided in the Control.Monad.ST namespace.
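Two of the entanglements listed above can be observed from ordinary user code. The sketch below assumes a current GHC and its base package. It shows that fail in the IO monad surfaces as an IOException, a type that carries file-related fields, and that integer division by zero surfaces as the ArithException DivideByZero. Both go through the exception machinery in Control.Exception.

```haskell
import Control.Exception (ArithException, IOException, evaluate, try)
import System.IO.Error (ioeGetFileName, isUserError)

main :: IO ()
main = do
  -- fail in the IO monad raises an IOException.
  r1 <- try (fail "boom") :: IO (Either IOException ())
  case r1 of
    Left e -> do
      putStrLn ("fail raised: " ++ show e)
      print (isUserError e)      -- True
      -- The exception type has a file name field even though no file is
      -- involved here, which is part of what ties it to file-related IO.
      print (ioeGetFileName e)   -- Nothing
    Right () -> putStrLn "no exception"

  -- Integer division by zero is reported by throwing DivideByZero.
  r2 <- try (evaluate (1 `div` (0 :: Int))) :: IO (Either ArithException Int)
  case r2 of
    Left e  -> putStrLn ("div raised: " ++ show e)
    Right n -> print n
```

Neither computation touches a file or a handle, yet both depend on the exception types, which is why these pieces are hard to place in separate packages.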
Joachim has started a first attempt to pull stuff out of the bottom of base. See https://github.com/nomeata/packages-base/blob/base-split/README.md for an overview of progress and a description of changes. Use git clone git://github.com/nomeata/packages-base.git; cd packages-base; git checkout base-split to experiment. This *does* try to split out as many packages as possible, just to see what is possible.