|Version 3 (modified by 4 years ago) (diff),|
Email thread here.
Suppose we have
newtype Age = MkAge Int
n :: Int, we can convert
n to an
MkAge n :: Age.
Moreover, this conversion is a type conversion only, and involves no runtime
instructions whatsoever. This cost model -- that newtypes are free -- is important
to Haskell programmers, and encourages them to use newtypes freely to express
type distinctions without introducing runtime overhead.
Alas, the newtype cost model breaks down when we involve other data structures. Suppose we have these declarations
data T a = TLeaf a | TNode (Tree a) (Tree a) data S m a = SLeaf (m a) | SNode (S m a) (S m a)
and we have these variables in scope
x1 :: [Int] x2 :: Char -> Int x3 :: T Int x4 :: S IO Int
Can we convert these into the corresponding forms where the
Int is replaced by
Alas, not easily, and certainly not without overhead.
x1we can write
map MkAge x1 :: [Age]. But this does not follow the newtype cost model: there will be runtime overhead from executing the
mapat runtime, and sharing will be lost too. Could GHC optimise the
mapsomehow? This is hard; apart from anything else, how would GHC know that
mapwas special? And it it gets worse.
x2we'd have to eta-expand:
(\y -> MkAge (x2 y)) :: Char -> Age. But this isn't good either, because eta exapansion isn't semantically valid (if
seqcould distinguish the two). See #7542 for a real life example.
x3, we'd have to map over
mapT MkAge x3. But what if
mapTdidn't exist? We'd have to make it. And not all data types have maps.
Sis a harder one: you could only map over S-values if
mwas a functor. There's a lot of discussion abou this on #2110.
Clearly what we want is a way to "lift" newtype constructors (and dually deconstructors)
over arbitrary types, so that whenever we have some type
blah Int blah we can convert it
to the type
blah Age blah, and vice versa.
Tantalisingly, System FC, GHC's internal Core language, already has exactly this!
- A newtype constructor turns into an FC cast:
MkAge x turns into x |> AgeNTCo where AgeNTCo :: Int ~ Age
|> is a cast, and the
AgeNTCo is a coercion axiom witnessng the equality of
- Coercions can be lifted, so that
[AgeNTCo] :: [Int] -> [Age] Char -> AgeNTCo :: (Char -> Int) ~ (Char -> Age) T AgeNTCo :: T Int ~ T Age S IO AgeNTCo :: S IO Int ~ S IO Age
So all we need is concrete syntax to allow you to ask for these lifed coercions in Haskell. I couldn't think of a good way to do this, but now I have a proposal:
newtype wrap w1 :: [Int] -> [Age]) newtype wrap w2 :: (Char -> Int) -> (Char -> Age) newtype wrap w3 :: T Int -> T Age ..etc..
newtype unwrap u1 :: [Age] -> [Int]) newtype unwrap u2 :: (Char -> Age) -> (Char -> Int) ..etc...
This brings into scope the variables
w2, etc, with the declared types. Applications of these wrappers
and unwrappers have the same cost model as newtype constructors themselves: they are free.
- The type specified in a newtype wrap/unwrap declaration must be of the form
type1 -> type2.
unwrapare keywords, but only when they occur right after the keyword
- Wrappers/unwrappers can be polymorphic
newtype wrap foo :: [(Int,b)] -> [(Age,b)]
- Multiple "layers" of newtype can be wrapped at once (just as in
foreigndeclarations). For example:
newtype Fun a = MkFun (a -> a) newtype Age = MkAge Int newtype wrap foo :: (Int -> Int) -> Fun Age
Suppose we have
module Map( ... ) where data Map a b = ...blah blah... module Age( ... ) where newtype Age = MkAge Int
Now suppose we want a newtype wrapper like this
import Map import Age newtype wrap foo :: Map Int Bool -> Map Age Bool
Could we write
foo by hand? (This is a good criterion, I think.) Only if we could see the data constructors of both
- If we can't see the data constructor of
Agewe might miss an invariant that Ages are supposed to have. For example, they might be guaranteed positive.
- If we can't see the data constructors of
Map, we might miss an invariant of Maps. For example, maybe
Mapis represented as a list of pairs, ordered by the keys. Then, if
Ageorders in the reverse way to
Int, it would obviously be bad to substitute.
Invariants like these are difficult to encode in the type system, so we use "exporting the constructors" as a proxy for "I trust the importer to maintain invariants". The "Internals" module name convention is a signal that you must be particularly careful when importing this module; runtime errors may result if you screw up.
One possible conclusion: if we have them at all, newtype wrappers should only work if you can see the constructors of both the newtype, and the type you are lifting over.
But that's not very satisfactory either.
- There are some times (like
IO) where it *does* make perfect sense to lift newtypes, but where we really don't want to expose the representation.
Mapis also a good example: while
Map Age Boolshould not be converted to
Map Int Bool, it'd be perfectly fine to convert
Map Int Ageto
Map Int Int.
- The criterion must be recursive. For example if we had
data Map a b = MkMap (InternalMap a b)it's no good being able to see the data constructor
MkMap; you need to see the constructors of
The right thing is probably to use kinds, and all this is tantalisingly close to the system suggested in Generative type abstraction and type-level computation. Maybe we should be able to declare Map to be indexed (rather than parametric) in its first parameter.
More thought required.
type family F a type instance F Int = Int type instance F Age = Char data T a = MkT (F a) newtype wrap bad :: T Int -> T Age bogus :: Int -> Char bogus n = case (bad (MkT n)) of MkT c -> c
The problem is, as usual, the type function hiding inside T's definition. The solution is described in [http://research.microsoft.com/en-us/um/people/simonpj/papers/ext-f/ Generative type abstraction and type-level computation]. It is still not implemented, alas, but adding the newtype wrappers introduces no problems that we do not already have.