|Version 19 (modified by 3 years ago) (diff),|
- Newtype wrappers
Email thread here.
Suppose we have
newtype Age = MkAge Int
n :: Int, we can convert
n to an
MkAge n :: Age.
Moreover, this conversion is a type conversion only, and involves no runtime
instructions whatsoever. This cost model -- that newtypes are free -- is important
to Haskell programmers, and encourages them to use newtypes freely to express
type distinctions without introducing runtime overhead.
Alas, the newtype cost model breaks down when we involve other data structures. Suppose we have these declarations
data T a = TLeaf a | TNode (Tree a) (Tree a) data S m a = SLeaf (m a) | SNode (S m a) (S m a)
and we have these variables in scope
x1 :: [Int] x2 :: Char -> Int x3 :: T Int x4 :: S IO Int
Can we convert these into the corresponding forms where the
Int is replaced by
Alas, not easily, and certainly not without overhead.
x1we can write
map MkAge x1 :: [Age]. But this does not follow the newtype cost model: there will be runtime overhead from executing the
mapat runtime, and sharing will be lost too. Could GHC optimise the
mapsomehow? This is hard; apart from anything else, how would GHC know that
mapwas special? And it it gets worse.
x2we'd have to eta-expand:
(\y -> MkAge (x2 y)) :: Char -> Age. But this isn't good either, because eta exapansion isn't semantically valid (if
seqcould distinguish the two). See #7542 for a real life example.
x3, we'd have to map over
mapT MkAge x3. But what if
mapTdidn't exist? We'd have to make it. And not all data types have maps.
Sis a harder one: you could only map over S-values if
mwas a functor. There's a lot of discussion about this on #2110.
In summary: The programmer expects zero-cost conversions between a newtype N and the type T it is based on. We want to allow the programmer to have zero-cost conversions between C N and C T. Requirements:
- It should be sound, i.e. have an implementation in Core with existing coercions, without new coercions or
- It should respect module boundaries, i.e. not allow the user to create a function which he could not write using plain Haskell (and non-zero cost).
- (desirable:) It should be exportable and composable, i.e. the module of N should be able to specify that a conversion between N and T is allowed (or not), and the module of C should be able to specify if a coercion can be lifted (or not), and if both allow it, the user should be able to combine that even if the modules of N and T are independent.
- (desirable:) It should be possible to use such a coercion in one module in internal code, but not export it.
- (desirable:) It should be possible to add REWRITE rules that turn, for example,
map Ninto the zero-cost conversion from C T to C N.
To clarify these requirements, here some benchmarks; feel free to expand if you have further needs.
- Allow the user to define a function of type
C N -> C Twith a zero-cost model.
- Allow the author of
Data.Setprevent a coercion
Set N -> Set T(as it would break internal invariants).
- Allow the author of a escaping-ensuring
newtype HTML = HTML Stringto use the coercion
[String] -> [HTML]internally, but prevent the user of the library from doing that.
Core already had provided all the necessary feature; the question was just how to offer it on the Haskell level. The implementation comes in form of a
coerce :: Coercible a b -> a -> b and a type class
Coercible that relates two types if they have the same representation, i.e. can be related by a coercion of role Representational (see Roles). See the haddock documentation for
coercible for user-level documentation and Note [Coercible Instances] for information on the implementation.
The implementation fulfills the first goal, the second partly (
C N -> C T is allowed even without
C's data constructors in scope; if
C should be abstract the role of its argument needs to be
Nominal). Due to the ad-hoc nature of the
Coercible instances, the second and third goal are not achieve. No work towards the fifths goal has been done.
Discussion of alternatives
Top level declarations of coercions as functions (e.g.
newtype wrap w1 :: [Int] -> [Age] and
newtype wrap w2 :: (Char -> Int) -> (Char -> Age)).
This is close to the actual implementation, but requires giving names to the coercion and introduces new syntax.
(This is basically the current implementation, advocated by Roman Cheplyaka.)
Given a module providing
data NT a b -- abstract coerce :: NT a b -> a -> b refl :: NT a a sym :: NT a b -> NT b a trans :: NT a b -> NT b c -> NT a c
and additional code to derive stuff like
deriving nt :: NT N T deriving listNT :: NT a b -> NT [a] [b]
One problem with this approach is that if the user can use arbitrary Haskell to mange the
NT values, he can create bottoms. Also, additional syntax is required. It was argued that the benefits over the type-class apporoach (Approach 2) do not warrant the extra syntactical complexity.