Changes between Version 39 and Version 40 of Records/OverloadedRecordFields/Plan


Ignore:
Timestamp:
Oct 16, 2013 7:52:03 AM (6 months ago)
Author:
adamgundry
Comment:

update to latest design

Legend:

Unmodified
Added
Removed
Modified
  • Records/OverloadedRecordFields/Plan

    v39 v40  
    3737=== Record field constraints ===  
    3838 
    39 A record field constraint is introduced when a field is used in an expression. If every `x` in scope is a record field, then an occurrence of `x` has type `a { x :: b } => a -> b` instead of generating an ambiguity error. The overloaded `x` is translated using a typeclass, described below. If there are any normal identifiers `x` in scope (as well as fields) then a use of `x` leads to an ambiguity error.  
    40  
    41 Record field constraints `r { x :: t }` are syntactic sugar for typeclass constraints `Has r "x" t`, where 
    42  
    43 {{{ 
    44 type family GetResult (r :: *) (f :: Symbol) :: * 
    45  
    46 class t ~ GetResult r f => Has r (f :: Symbol) t where 
    47   getField :: proxy f -> r -> t 
    48 }}} 
    49  
    50 Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getField (Proxy :: Proxy "x")`. (Actually a slightly more general translation is used, as [#Lensintegration discussed below].) 
    51  
    52 The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x" tx, Has r "y" ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x" [Maybe v])`. 
    53  
    54 Instances for the `Has` typeclass and `GetResult` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type 
     39A record field constraint is introduced when a field is used in an expression. If every `x` in scope is a record field, then an occurrence of `x` has type `a { x :: b } => a -> b` (roughly) instead of generating an ambiguity error. The overloaded `x` is translated using a typeclass, described below. If there are any normal identifiers `x` in scope (as well as fields) then a use of `x` leads to an ambiguity error.  
     40 
     41A record field constraint `r { x :: t }` is syntactic sugar for the pair of constraints `(Has r "x", FldTy r "x" ~ t)`, where 
     42 
     43{{{ 
     44type family FldTy (r :: *) (n :: Symbol) :: * 
     45 
     46class Has r (n :: Symbol) where 
     47  getField :: Proxy# n -> r -> FldTy r n 
     48}}} 
     49 
     50Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getField (proxy# :: Proxy# "x")`. (Actually a slightly more general translation is used, as [#Lensintegration discussed below].) The type `Proxy#` is zero-width, so it will be erased at runtime, and is used to pass in the type-level string argument, since we don't have explicit type application (yet). 
     51 
     52The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x", FldTy r "x" ~ tx, Has r "y", FldTy r "y" ~ ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x", FldTy (T (Maybe b)) "x" ~ [Maybe v])`. 
     53 
     54Instances for the `Has` typeclass and `FldTy` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type 
    5555 
    5656{{{ 
     
    6161 
    6262{{{ 
    63 type instance GetResult (T a) "x" = [a] 
    64  
    65 instance (b ~ [a]) => Has (T a) "x" b where  
     63type instance FldTy (T a) "x" = [a] 
     64 
     65instance Has (T a) "x" where  
    6666  getField _ (MkT x) = x 
    6767}}} 
    6868 
    69 The bare type variable `b` in the instance head is important, so that we get an instance match from the first two parameters only, then the equality constraint `(b ~ [a])` improves `b`. For example, if the constraint `Has (T c) "x" d` is encountered during type inference, the instance will match and generate the constraints `(a ~ c, b ~ d, b ~ [a])`. Moreover, the `GetResult` type family ensures that the third parameter is functionally dependent on the first two, which is needed to [#Troubleinparadise avoid ambiguity errors when composing overloaded fields]. 
    70  
    71 The reason for using a three-parameter class, rather than just two parameters and a type family, is to support the syntactic sugar. With a two-parameter class we could easily end up inferring types like the following, and it would be hard to reapply the sugar: 
    72  
    73 {{{ 
    74 f :: (Has r "x", Has r "y", GetResult r "x" ~ Int, GetResult r "y" ~ Int) => r -> Int 
    75 f r = x r + y r :: Int 
    76 }}} 
     69We have considered using a three-parameter version of the `Has` class, with the third parameter (the field type) being functionally dependent on the first two, but this version seems more straightforward (at the cost of exposing the encoding underlying the syntactic sugar more often). 
    7770 
    7871 
     
    9083}}} 
    9184 
    92 Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x" Int` will be available but the instance `Has S "x" Bool` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
     85Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x"` will be available but the instance `Has S "x"` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
    9386 
    9487 
     
    168161=== Limited type-changing update === 
    169162 
    170 As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class, which permits type-changing update of single fields: 
    171  
    172 {{{ 
    173 type family SetResult (r :: *) (f :: Symbol) (a :: *) :: * 
    174  
    175 class (Has r f (GetResult r f), r ~ SetResult r f (GetResult r f)) => 
    176           Upd (r :: *) (f :: Symbol) (a :: *) where 
    177   setField :: proxy f -> r -> a -> SetResult r f a 
     163As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class and type family, which permit type-changing update of single fields: 
     164 
     165{{{ 
     166type family UpdTy (r :: *) (n:: Symbol) (a :: *) :: * 
     167 
     168class (Has r n, r ~ UpdTy r n (FldTy r n)) => 
     169          Upd (r :: *) (n :: Symbol) (a :: *) where 
     170  setField :: Proxy# n -> r -> a -> UpdTy r n a 
    178171}}} 
    179172 
     
    183176data T a = MkT { x :: [a] } 
    184177 
    185 type instance SetResult (T a) "x" [c] = T c 
     178type instance UpdTy (T a) "x" [c] = T c 
    186179 
    187180instance (b ~ [c]) => Upd (T a) "x" b where 
    188   setField _ r e = r { x = e } 
    189 }}} 
    190  
    191 The third parameter of the `Upd` class represents the new type being assigned to the field, unlike the `Has` class, where it represents the current type. Thus it is not functionally dependent on the first two. Consequently, we must use a bare type variable `b` in the instance declaration, with an equality constraint `b ~ [c]` postponed until after the instance matches. 
    192  
    193 If a type variable is shared by multiple fields, it cannot be changed using `setField`. Moreover, the use of the `SetResult` type family means that phantom type variables cannot be changed. For example, in 
     181  setField _ (MkT _) e = MkT e 
     182}}} 
     183 
     184The third parameter of the `Upd` class represents the new type being assigned to the field. Thus it is not functionally dependent on the first two. Consequently, we must use a bare type variable `b` in the instance declaration, with an equality constraint `b ~ [c]` postponed until after the instance matches. 
     185 
     186If a type variable is shared by multiple fields, it cannot be changed using `setField`. Moreover, the use of the `UpdTy` type family means that phantom type variables cannot be changed. For example, in 
    194187 
    195188{{{ 
     
    201194 
    202195{{{ 
    203 type instance SetResult (V a b c) "foo" (a, b') = V a b' c 
     196type instance UpdTy (V a b c) "foo" (a, b') = V a b' c 
    204197 
    205198instance t ~ (a, b') => Upd (V a b c) "foo" t where 
    206   setField _ r e = r { foo = e } 
     199  setField _ (MkV _ bar) e = MkV e bar 
    207200}}} 
    208201 
     
    210203=== Lens integration === 
    211204 
    212 It was implied above that a field like `foo` translates into `getField (Proxy :: Proxy "foo") :: Has r "foo" t => r -> t`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using 
    213  
    214 {{{ 
    215 field :: (Has r f t, Accessor p f) => proxy f -> p r t 
    216 field z = accessor z (getField z) (setField z) 
    217 }}} 
    218  
    219 to translate `foo` to `field (Proxy :: Proxy "foo") :: (Has r "foo" t, Accessor p "foo") => p r t`. The `Accessor` class is defined thus: 
    220  
    221 {{{ 
    222 class Accessor (p :: * -> * -> *) (f :: Symbol) where 
    223   accessor :: proxy f -> (r -> GetResult r f) -> 
    224               (forall a . Upd r f a => r -> a -> SetResult r f a) -> 
    225               p r (GetResult r f) 
    226 }}} 
    227  
    228 An instance of `Accessor p f` means that `p` may contain a getter and setter for the field `f`. In particular, we can give an instance for functions that ignores `f` and the setter completely: 
    229  
    230 {{{ 
    231 instance Accessor (->) f where 
     205It was implied above that a field like `foo` translates into `getField (proxy# :: Proxy# "foo") :: Has r "foo" => r -> FldTy r "foo"`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using 
     206 
     207{{{ 
     208field :: Accessor p r n => Proxy# n -> p r (FldTy r n) 
     209field z = accessField z (getField z) (setField z) 
     210}}} 
     211 
     212to translate `foo` to `field (proxy# :: Proxy# "foo") :: Accessor p r "foo" => p r (FldTy r "foo")`. The `Accessor` class is defined thus: 
     213 
     214{{{ 
     215class Accessor (p :: * -> * -> *) (r :: *) (n :: Symbol) where 
     216  accessField :: Proxy# n -> 
     217                 (Has r n => r -> FldTy r n) -> 
     218                 (forall a . Upd r n a => r -> a -> UpdTy r n a) -> 
     219                 p r (FldTy r n) 
     220}}} 
     221 
     222An instance of `Accessor p r n` means that `p` may contain a getter and setter for the field `n` in record type `r`. In particular, we can give an instance for functions that ignores `r`, `n` and the setter completely: 
     223 
     224{{{ 
     225instance Has r n => Accessor (->) r n where 
    232226  accessor _ getter setter = getter 
    233227}}} 
    234228 
    235 Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. If `z` is a proxy of type `Proxy "foo"`, then `foo` translates to `field z`, which computes to `accessor z (getField z) (setField z)`, and hence to `getField z` by the `Accessor` instance for functions. 
     229Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. That is, `foo` translates to `field proxy#`, which computes to `accessor proxy# (getField proxy#) (setField proxy#)`, and hence to `getField proxy#` by the `Accessor` instance for functions. 
    236230 
    237231However, `p` does not have to be the function arrow. Suppose the `lens` library defined the following newtype wrapper: 
    238232 
    239233{{{ 
    240 newtype WrapLens f r a 
    241   = MkWrapLens (forall b . Upd r f b => Lens r (SetResult r f b) a b) 
    242  
    243 instance f ~ g => Accessor (WrapLens f) g where 
     234newtype WrapLens n r a 
     235  = MkWrapLens (forall b . Upd r n b => Lens r (SetResult r n b) a b) 
     236 
     237instance m ~ n => Accessor (WrapLens m) r n where 
    244238  accessor _ getter setter = MkWrapLens (\ w s -> setter s <$> w (getter s)) 
    245239 
    246 fieldLens :: Upd r f b => WrapLens f r a -> Lens r (SetResult r f b) a b 
     240fieldLens :: Upd r n b => WrapLens n r a -> Lens r (SetResult r n b) a b 
    247241fieldLens (MkWrapLens l) = l 
    248242}}} 
     
    250244Now `fieldLens foo` is a lens whenever `foo` is an overloaded record field. 
    251245 
    252 Other lens libraries can define their own instances of `Accessor`, even if they do not support type-changing update, and the same machinery enables fields to be used with them. 
     246Other lens libraries can define their own instances of `Accessor`, even if they do not support type-changing update, and the same machinery enables fields to be used with them. For example, here is another possible encoding of lenses: 
     247 
     248{{{ 
     249data DataLens r a = DataLens 
     250   { getDL :: r -> a 
     251   , setDL :: r -> a -> r } 
     252 
     253instance Upd r n (FldTy r n) => Accessor DataLens r n where 
     254  accessField _ g s = DataLens g s 
     255}}} 
     256 
     257Now an overloaded record field `foo` can be used as if it had type `DataLens r a`, and it will just work: we do not even need to use a combinator. 
    253258 
    254259 
     
    264269 
    265270{{{ 
    266 type instance SetResult (T a) "foo" Int = T b 
     271type instance UpdTy (T a) "foo" Int = T b 
    267272}}} 
    268273 
     
    284289 
    285290{{{ 
    286 type instance SetResult (T a) "foo" (Goo b) = T b 
     291type instance UpdTy (T a) "foo" (Goo b) = T b 
    287292}}} 
    288293 
     
    290295 
    291296{{{ 
    292 type instance SetResult (T a) "foo" x = T b 
     297type instance UpdTy (T a) "foo" x = T b 
    293298}}} 
    294299 
     
    302307 
    303308{{{ 
    304 type instance SetResult (U a) "bar" (b -> x) = U b 
     309type instance UpdTy (U a) "bar" (b -> x) = U b 
    305310}}} 
    306311 
     
    315320 
    316321{{{ 
    317 f :: (Has r "g" Int) => r -> Int 
     322f :: r { g :: Int } => r -> Int 
    318323f x = g x + 1 
    319324}}} 
     
    352357 
    353358{{{ 
    354 field (Proxy :: Proxy "foo") 
     359field (proxy# :: Proxy# "foo") 
    355360}}} 
    356361 
     
    408413=== Syntactic sugar for `Upd` constraints === 
    409414 
    410 Should we have a special syntax for `Upd` constraints, just as `r { x :: t }` sugars `Has r "x" t`? What should it look like? Perhaps something like `r { x ::= t }`? 
     415Should we have a special syntax for `Upd` constraints, just as `r { x :: t }` sugars `(Has r "x", FldTy r "x" ~ t)`? What should it look like? Perhaps something like `r { x ::= t }`? 
    411416 
    412417 
     
    416421=== Trouble in paradise === 
    417422 
    418 [http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the third parameter of `Has` was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose 
     423[http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the `Has` class took three parameters and the third was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose 
    419424 
    420425{{{ 
     
    429434}}} 
    430435 
    431 and `b` is an ambiguous type variable. This shows the need for the `GetResult` type family. 
     436and `b` is an ambiguous type variable. This shows the need for the `FldTy` type family. 
    432437 
    433438 
    434439=== Virtual record fields === 
    435440 
    436 We could imagine supporting virtual record fields by allowing the user to declare their own instances of `Has` and `GetResult` (and possibly `Upd` and `SetResult`). For example, the user could write the following: 
     441We could imagine supporting virtual record fields by allowing the user to declare their own instances of `Has` and `FldTy` (and possibly `Upd` and `UpdTy`). For example, the user could write the following: 
    437442 
    438443{{{ 
    439444data Person = MkPerson { firstName :: String, lastName :: String } 
    440445 
    441 type instance GetResult Person "fullName" = String 
    442 instance Has Person "fullName" String where 
     446type instance FldTy Person "fullName" = String 
     447instance Has Person "fullName" where 
    443448  getField _ p = firstName p ++ " " ++ lastName p 
    444449}}} 
    445450 
    446 This means that the `Person` type can be used where a type with a field `fullName` is expected. Since no `SetResult` and `Upd` instances are provided, the field cannot be updated. 
     451This means that the `Person` type can be used where a type with a field `fullName` is expected. Since no `Upd` and `UpdTy` instances are provided, the field cannot be updated. 
    447452 
    448453However, this does not bring `fullName` into scope as a field, [#Scopeissuesorwhywemissdot as previously observed]. Moreover, it is difficult to check the type family instances for consistency. For example, given the following declaration 
    449454 
    450455{{{ 
    451 type instance GetResult a "foo" = Int 
     456type instance FldTy a "foo" = Int 
    452457}}} 
    453458 
    454459we would need to check that any datatype with a field `foo` in scope gave it the type `Int`. For these reasons, user-defined instances of the classes are not currently permitted, so virtual fields are not available. 
    455  
    456  
    457 == Example of constraint solving == 
    458  
    459 Consider the example 
    460  
    461 {{{ 
    462 module M ( R(R, x), S(S, y), T(T, x) ) where 
    463  
    464   data R = R { x :: Int } 
    465   data S = S { x :: Bool, y :: Bool } 
    466   data T = T { x :: forall a . a } 
    467  
    468 module N where 
    469   import M 
    470  
    471   foo e = x e 
    472   
    473   bar :: Bool 
    474   bar = foo T 
    475  
    476   baz = foo S 
    477  
    478   quux = y 
    479 }}} 
    480  
    481 When checking `foo`, `e` is a variable of unknown type `alpha`, and the projection generates the constraint `alpha { x :: beta }` where `beta` is fresh. This constraint cannot be solved immediately, so generalisation yields the type `a { x :: b } => a -> b`. 
    482  
    483 When checking `bar`, the application of `foo` gives rise to the constraint `T { x :: Bool }`, which is solved since `Bool` is an instance of `forall a . a` (the type `T` gives to `x`). 
    484  
    485 When checking `baz`, the constraint `S { x :: gamma }` is generated and rejected, since the `x` from `S` is not in scope. 
    486  
    487 When checking `quux`, the only `y` field in scope is of type `S -> Bool` so that is its type.