Changes between Version 16 and Version 17 of NewtypeWrappers

Sep 13, 2013 9:10:54 PM (2 years ago)

Clean up the page a bit to keep up the signal-to-noise-level on the wiki.


  • NewtypeWrappers

    v16 v17  
    6060 * Allow the author of a escaping-ensuring `newtype HTML = HTML String` to use the coercion `[String] -> [HTML]` internally, but prevent the user of the library from doing that.
     62== The implementation ==
    63 == The opportunity ==
     64Core already had provided all the necessary feature; the question was just how to offer it on the Haskell level. The implementation comes in form of a `coerce :: Coercible a b -> a -> b` and a type class `Coercible` that relates two types if they have the same representation, i.e. can be related by a coercion of role Representational (see [wiki:Roles]). See the haddock documentation for `coercible` for user-level documentation and Note [Coercible Instances] for information on the implementation.
    65 Clearly what we want is a way to "lift" newtype constructors (and dually deconstructors)
    66 over arbitrary types, so that whenever we have some type `blah Int blah` we can convert it
    67 to the type `blah Age blah`, and vice versa.
     66The implementation fulfills the first goal, the second partly (`C N -> C T` is allowed even without `C`'s data constructors in scope; if `C` should be abstract the role of its argument needs to be `Nominal`. Due to the ad-hoc nature of the `Coercible` instances, the second and third goal are not achieve. No work towards the fifths goal has been done.
    69 Tantalisingly, [ System FC], GHC's internal Core language, already has exactly this!
    70  * A newtype constructor turns into an FC cast:
    71 {{{
    72   MkAge x    turns into       x |> AgeNTCo
    73 where
    74   AgeNTCo :: Int ~ Age
    75 }}}
    76 The `|>` is a cast, and the `AgeNTCo` is a coercion axiom witnessng the equality of `Int` and `Age`.
     68== Discussion of alternatives ==
    78  * Coercions can be lifted, so that
    79 {{{
    80   [AgeNTCo]       :: [Int] -> [Age]
    81   Char -> AgeNTCo :: (Char -> Int) ~ (Char -> Age)
    82   T AgeNTCo       :: T Int ~ T Age
    83   S IO AgeNTCo    :: S IO Int ~ S IO Age
    84 }}}
     70=== Proposal 1 ===
    86 So all we need is concrete syntax to allow you to ask for these lifed coercions in Haskell. 
     72Top level declarations of coercions as functions (e.g. `newtype wrap w1 :: [Int] -> [Age]` and `newtype wrap w2 :: (Char -> Int) -> (Char -> Age)`).
    88 == Proposal 1 ==
    90 The first possiblity involves a new top-level declaration:
    91 {{{
    92   newtype wrap w1 :: [Int] -> [Age])
    93   newtype wrap w2 :: (Char -> Int) -> (Char -> Age)
    94   newtype wrap w3 :: T Int -> T Age
    95   ..etc..
    96 }}}
    97 and dually
    98 {{{
    99   newtype unwrap u1 :: [Age] -> [Int])
    100   newtype unwrap u2 :: (Char -> Age) -> (Char -> Int)
    101   ..etc...
    102 }}}
    103 This brings into scope the variables `w1`, `w2`, etc, with the declared types. Applications of these wrappers
    104 and unwrappers have the same cost model as newtype constructors themselves: they are free.
    106 More precisely
    108  * The type specified in a newtype wrap/unwrap declaration must be of the form `type1 -> type2`.
    110  * `wrap` and `unwrap` are keywords, but only when they occur right after the keyword `newtype`.
    112  * Wrappers/unwrappers can be polymorphic
    113 {{{
    114   newtype wrap foo :: [(Int,b)] -> [(Age,b)]
    115 }}}
    117  * Multiple "layers" of newtype can be wrapped at once (just as in `foreign` declarations). For example:
    118 {{{
    119   newtype Fun a = MkFun (a -> a)
    120   newtype Age = MkAge Int
    121   newtype wrap foo :: (Int -> Int) -> Fun Age
    122 }}}
     74This is close to the actual implementation, but requires giving names to the coercion and introduces new syntax.
    12476== Proposal 2 ==
    126 The second possibility is superficially simpler: just provided a new built-in constant with type
    127 {{{
    128   newtypeCast :: NTC a b => a -> b
    129 }}}
    130 Here `NTC` is a built-in type class that witnesses the (free) conversion between `a` and `b`.  Although it would not quite be implemented like this, we would have a built-in instance for each data type (but see Type Soundness below):
    131 {{{
    132   instance NTC a b => NTC [a] [b]
    133 }}}
    134 and two built-in instances for each newtype:
    135 {{{
    136   instance NTC Int b => NTC Age b
    137   instance NTC a Int => NTC a Age
    138 }}}
    139 So to solve a `NTC` constraint you unwrap all those newtypes (being careful about abstraction; see next section). 
    141 This plan requires a bit more paddling under the water on GHC's part, especially during type inference, but it looks a lot more straightforward than I first thought.  Thanks to Roman Cheplyka for advocating this solution.
    143 One issue with type classes is that they cannot be used only internally, so the third use case above would not be possible.
    145 === Status ===
    147 Joachim started to implement this approach. TODO:
    149 * Class `IsNT concr abstr` defined in ghc-prim.GHC.NT ✓
    150 * Deriving code added. ✓
    151 * Check for constructor visibility. ✓
    152 * Check for data constructor argument convertibility. ✓ (but not very elegant yet)
    153 * Make `IncoherentInstances` [ even more incoherent].
    154 * Think about higher-order type arguments.
    155 * Polish error messages.
    156 * Port to the new role-based coercion.
    157 * Add instances to base etc.
    159 Warts and issues:
    161 * The deriving code works in the typechecker and has to generate `HsExpr`, but the implementation is only possible in Core. Currently, a dummy value is inserted by the deriving code and later implemented by a core-to-core pass. Better solution: Add a `HsCore` data constructor to `HsExpr`, similar to `HsCoreTy` in `HsType`?
    163 Code not yet pushed anywhere.
     78(This is basically the current implementation, advocated by Roman Cheplyka.)
    16580== Proposal 3 ==
    167 This is a write-down of a discussion between me (nomeata) and SPJ at TLCA 2013, but my memory might have tricked me and I might have filled in some bits myself, so beware.
     82Given a module providing
    169 Assume there is a module provided by GHC with these definitions
    17185data NT a b -- abstract
    17589trans  :: NT a b -> NT b c -> NT a c
    177 and the intention that `NT a b` is a witness that `a` and `b` have the same representation and that `coerce n` has zero runtime cost.
    179 Besides the trivial `refl` values, values of type `NT` cannot be created by the user, but only by a special `deriving` statement giving the type of a function that creates `NT` values, but without a definition, e.g.
     92and additional code to derive stuff like
    18194deriving nt :: NT N T
    185 The compiler either rejects the declaration or creates an implementation automatically. It should only create an implementation if a non-zero-cost-implementation could be written by the user instead. (I expect that this means that the constructors of the return type are in scope.)
    187 The “could you write it by hand” criteria implies that the expected arguments for the nt value depends on how they type parameters are used in the data constructors, e.g:
    189 {{{
    190 data Foo a = Foo (Bar a)
    191 deriving fooNT' :: NT a b -> NT (Foo a) (Foo b) -- not ok, especially if `Bar` is abstract
    192 deriving fooNT :: NT (Bar a) (Bar b) -> NT (Foo a) (Foo b) -- ok
    193 }}}
    195 Question: `fooNT'` can be constructed from `fooNT` and a possible `barNT :: NT a b -> NT (Bar a) (Bar b)`. Should the compiler do this automatically, if `barNT` is in scope, or is that too much magic?
    197 This solves the abstraction problem for `Data.Map`: The library author only exports `NT a b -> NT (Map k a) (Map k b)`, but not NT a b -> NT (Map a v) (Map b v)`.
    199 All this amounts to exposing explicit System-FC coercions to the programmer; and that is precisely what we want to do.
    200  * Explicit, because `Age` and `Int` really are different types, and so we can't silently convert between them.
    201  * Explicit, so that we can control precisely which coercions are available (via export lists), and thus control abstraction.
    203 Of course with an `NT` data type, it is possible to define this type class, e.g.
    204 {{{
    205 class NT' a b where
    206   theNT :: NT a b
    207 coerce' :: NT' a b => a -> b
    208 coerce' = coerce theNT
    209 }}}
    210 and use it where more convenient than the function-based variant.
    212 As to the implementation, I quite naively expect that the definition of `NT` and related function in terms of Core to be straight-forward (`newtype NT a b = a ~_R b`). The code that does the `deriving` is maybe non-trivial, as it has to build the term from available newtype axioms (where the constructor is in scope), coercions given as parameters and the application of type constructors to coercions (again, where the data constructors are in scope).
    214 Implementation started at [].
    216 It was [ later argued] that the benefits over the type-class apporoach (Approach 2) do not warrant the extra syntactical complexity.
    218 == Type soundness ==
    220 This proposal rests on very similar foundations to the "newtype deriving" feature,
    221 and suffers from ''precisely'' the same issue with type soundness; see #1496, #5498, #4846, #7148.  For example,
    222 consider
    223 {{{
    224   type family F a
    225   type instance F Int = Int
    226   type instance F Age = Char
    228   data T a = MkT (F a)
    230   newtype wrap bad :: T Int -> T Age
    232   bogus :: Int -> Char
    233  bogus n = case (bad (MkT n)) of
    234               MkT c -> c
    235 }}}
    236 The problem is, as usual, the type function hiding inside T's definition.
    237 The solution is described in [ Generative type abstraction and type-level computation].  It is still being implemented, alas,
    238 but adding the newtype wrappers introduces no problems that we do not already have. See [wiki:Roles] for more info.
     98One problem with this approach is that if the user can use arbitrary Haskell to mange the `NT` values, he can create bottoms. Also, additional syntax is required. It was [ argued] that the benefits over the type-class apporoach (Approach 2) do not warrant the extra syntactical complexity.