Changes between Version 39 and Version 40 of Records/OverloadedRecordFields/Plan


Ignore:
Timestamp:
Oct 16, 2013 7:52:03 AM (22 months ago)
Author:
adamgundry
Comment:

update to latest design

Legend:

Unmodified
Added
Removed
Modified
  • Records/OverloadedRecordFields/Plan

    v39 v40  
    3737=== Record field constraints ===
    3838
    39 A record field constraint is introduced when a field is used in an expression. If every `x` in scope is a record field, then an occurrence of `x` has type `a { x :: b } => a -> b` instead of generating an ambiguity error. The overloaded `x` is translated using a typeclass, described below. If there are any normal identifiers `x` in scope (as well as fields) then a use of `x` leads to an ambiguity error.
    40 
    41 Record field constraints `r { x :: t }` are syntactic sugar for typeclass constraints `Has r "x" t`, where
    42 
    43 {{{
    44 type family GetResult (r :: *) (f :: Symbol) :: *
    45 
    46 class t ~ GetResult r f => Has r (f :: Symbol) t where
    47   getField :: proxy f -> r -> t
    48 }}}
    49 
    50 Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getField (Proxy :: Proxy "x")`. (Actually a slightly more general translation is used, as [#Lensintegration discussed below].)
    51 
    52 The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x" tx, Has r "y" ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x" [Maybe v])`.
    53 
    54 Instances for the `Has` typeclass and `GetResult` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type
     39A record field constraint is introduced when a field is used in an expression. If every `x` in scope is a record field, then an occurrence of `x` has type `a { x :: b } => a -> b` (roughly) instead of generating an ambiguity error. The overloaded `x` is translated using a typeclass, described below. If there are any normal identifiers `x` in scope (as well as fields) then a use of `x` leads to an ambiguity error.
     40
     41A record field constraint `r { x :: t }` is syntactic sugar for the pair of constraints `(Has r "x", FldTy r "x" ~ t)`, where
     42
     43{{{
     44type family FldTy (r :: *) (n :: Symbol) :: *
     45
     46class Has r (n :: Symbol) where
     47  getField :: Proxy# n -> r -> FldTy r n
     48}}}
     49
     50Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getField (proxy# :: Proxy# "x")`. (Actually a slightly more general translation is used, as [#Lensintegration discussed below].) The type `Proxy#` is zero-width, so it will be erased at runtime, and is used to pass in the type-level string argument, since we don't have explicit type application (yet).
     51
     52The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x", FldTy r "x" ~ tx, Has r "y", FldTy r "y" ~ ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x", FldTy (T (Maybe b)) "x" ~ [Maybe v])`.
     53
     54Instances for the `Has` typeclass and `FldTy` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type
    5555
    5656{{{
     
    6161
    6262{{{
    63 type instance GetResult (T a) "x" = [a]
    64 
    65 instance (b ~ [a]) => Has (T a) "x" b where
     63type instance FldTy (T a) "x" = [a]
     64
     65instance Has (T a) "x" where
    6666  getField _ (MkT x) = x
    6767}}}
    6868
    69 The bare type variable `b` in the instance head is important, so that we get an instance match from the first two parameters only, then the equality constraint `(b ~ [a])` improves `b`. For example, if the constraint `Has (T c) "x" d` is encountered during type inference, the instance will match and generate the constraints `(a ~ c, b ~ d, b ~ [a])`. Moreover, the `GetResult` type family ensures that the third parameter is functionally dependent on the first two, which is needed to [#Troubleinparadise avoid ambiguity errors when composing overloaded fields].
    70 
    71 The reason for using a three-parameter class, rather than just two parameters and a type family, is to support the syntactic sugar. With a two-parameter class we could easily end up inferring types like the following, and it would be hard to reapply the sugar:
    72 
    73 {{{
    74 f :: (Has r "x", Has r "y", GetResult r "x" ~ Int, GetResult r "y" ~ Int) => r -> Int
    75 f r = x r + y r :: Int
    76 }}}
     69We have considered using a three-parameter version of the `Has` class, with the third parameter (the field type) being functionally dependent on the first two, but this version seems more straightforward (at the cost of exposing the encoding underlying the syntactic sugar more often).
    7770
    7871
     
    9083}}}
    9184
    92 Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x" Int` will be available but the instance `Has S "x" Bool` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not.
     85Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x"` will be available but the instance `Has S "x"` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not.
    9386
    9487
     
    168161=== Limited type-changing update ===
    169162
    170 As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class, which permits type-changing update of single fields:
    171 
    172 {{{
    173 type family SetResult (r :: *) (f :: Symbol) (a :: *) :: *
    174 
    175 class (Has r f (GetResult r f), r ~ SetResult r f (GetResult r f)) =>
    176           Upd (r :: *) (f :: Symbol) (a :: *) where
    177   setField :: proxy f -> r -> a -> SetResult r f a
     163As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class and type family, which permit type-changing update of single fields:
     164
     165{{{
     166type family UpdTy (r :: *) (n:: Symbol) (a :: *) :: *
     167
     168class (Has r n, r ~ UpdTy r n (FldTy r n)) =>
     169          Upd (r :: *) (n :: Symbol) (a :: *) where
     170  setField :: Proxy# n -> r -> a -> UpdTy r n a
    178171}}}
    179172
     
    183176data T a = MkT { x :: [a] }
    184177
    185 type instance SetResult (T a) "x" [c] = T c
     178type instance UpdTy (T a) "x" [c] = T c
    186179
    187180instance (b ~ [c]) => Upd (T a) "x" b where
    188   setField _ r e = r { x = e }
    189 }}}
    190 
    191 The third parameter of the `Upd` class represents the new type being assigned to the field, unlike the `Has` class, where it represents the current type. Thus it is not functionally dependent on the first two. Consequently, we must use a bare type variable `b` in the instance declaration, with an equality constraint `b ~ [c]` postponed until after the instance matches.
    192 
    193 If a type variable is shared by multiple fields, it cannot be changed using `setField`. Moreover, the use of the `SetResult` type family means that phantom type variables cannot be changed. For example, in
     181  setField _ (MkT _) e = MkT e
     182}}}
     183
     184The third parameter of the `Upd` class represents the new type being assigned to the field. Thus it is not functionally dependent on the first two. Consequently, we must use a bare type variable `b` in the instance declaration, with an equality constraint `b ~ [c]` postponed until after the instance matches.
     185
     186If a type variable is shared by multiple fields, it cannot be changed using `setField`. Moreover, the use of the `UpdTy` type family means that phantom type variables cannot be changed. For example, in
    194187
    195188{{{
     
    201194
    202195{{{
    203 type instance SetResult (V a b c) "foo" (a, b') = V a b' c
     196type instance UpdTy (V a b c) "foo" (a, b') = V a b' c
    204197
    205198instance t ~ (a, b') => Upd (V a b c) "foo" t where
    206   setField _ r e = r { foo = e }
     199  setField _ (MkV _ bar) e = MkV e bar
    207200}}}
    208201
     
    210203=== Lens integration ===
    211204
    212 It was implied above that a field like `foo` translates into `getField (Proxy :: Proxy "foo") :: Has r "foo" t => r -> t`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using
    213 
    214 {{{
    215 field :: (Has r f t, Accessor p f) => proxy f -> p r t
    216 field z = accessor z (getField z) (setField z)
    217 }}}
    218 
    219 to translate `foo` to `field (Proxy :: Proxy "foo") :: (Has r "foo" t, Accessor p "foo") => p r t`. The `Accessor` class is defined thus:
    220 
    221 {{{
    222 class Accessor (p :: * -> * -> *) (f :: Symbol) where
    223   accessor :: proxy f -> (r -> GetResult r f) ->
    224               (forall a . Upd r f a => r -> a -> SetResult r f a) ->
    225               p r (GetResult r f)
    226 }}}
    227 
    228 An instance of `Accessor p f` means that `p` may contain a getter and setter for the field `f`. In particular, we can give an instance for functions that ignores `f` and the setter completely:
    229 
    230 {{{
    231 instance Accessor (->) f where
     205It was implied above that a field like `foo` translates into `getField (proxy# :: Proxy# "foo") :: Has r "foo" => r -> FldTy r "foo"`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using
     206
     207{{{
     208field :: Accessor p r n => Proxy# n -> p r (FldTy r n)
     209field z = accessField z (getField z) (setField z)
     210}}}
     211
     212to translate `foo` to `field (proxy# :: Proxy# "foo") :: Accessor p r "foo" => p r (FldTy r "foo")`. The `Accessor` class is defined thus:
     213
     214{{{
     215class Accessor (p :: * -> * -> *) (r :: *) (n :: Symbol) where
     216  accessField :: Proxy# n ->
     217                 (Has r n => r -> FldTy r n) ->
     218                 (forall a . Upd r n a => r -> a -> UpdTy r n a) ->
     219                 p r (FldTy r n)
     220}}}
     221
     222An instance of `Accessor p r n` means that `p` may contain a getter and setter for the field `n` in record type `r`. In particular, we can give an instance for functions that ignores `r`, `n` and the setter completely:
     223
     224{{{
     225instance Has r n => Accessor (->) r n where
    232226  accessor _ getter setter = getter
    233227}}}
    234228
    235 Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. If `z` is a proxy of type `Proxy "foo"`, then `foo` translates to `field z`, which computes to `accessor z (getField z) (setField z)`, and hence to `getField z` by the `Accessor` instance for functions.
     229Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. That is, `foo` translates to `field proxy#`, which computes to `accessor proxy# (getField proxy#) (setField proxy#)`, and hence to `getField proxy#` by the `Accessor` instance for functions.
    236230
    237231However, `p` does not have to be the function arrow. Suppose the `lens` library defined the following newtype wrapper:
    238232
    239233{{{
    240 newtype WrapLens f r a
    241   = MkWrapLens (forall b . Upd r f b => Lens r (SetResult r f b) a b)
    242 
    243 instance f ~ g => Accessor (WrapLens f) g where
     234newtype WrapLens n r a
     235  = MkWrapLens (forall b . Upd r n b => Lens r (SetResult r n b) a b)
     236
     237instance m ~ n => Accessor (WrapLens m) r n where
    244238  accessor _ getter setter = MkWrapLens (\ w s -> setter s <$> w (getter s))
    245239
    246 fieldLens :: Upd r f b => WrapLens f r a -> Lens r (SetResult r f b) a b
     240fieldLens :: Upd r n b => WrapLens n r a -> Lens r (SetResult r n b) a b
    247241fieldLens (MkWrapLens l) = l
    248242}}}
     
    250244Now `fieldLens foo` is a lens whenever `foo` is an overloaded record field.
    251245
    252 Other lens libraries can define their own instances of `Accessor`, even if they do not support type-changing update, and the same machinery enables fields to be used with them.
     246Other lens libraries can define their own instances of `Accessor`, even if they do not support type-changing update, and the same machinery enables fields to be used with them. For example, here is another possible encoding of lenses:
     247
     248{{{
     249data DataLens r a = DataLens
     250   { getDL :: r -> a
     251   , setDL :: r -> a -> r }
     252
     253instance Upd r n (FldTy r n) => Accessor DataLens r n where
     254  accessField _ g s = DataLens g s
     255}}}
     256
     257Now an overloaded record field `foo` can be used as if it had type `DataLens r a`, and it will just work: we do not even need to use a combinator.
    253258
    254259
     
    264269
    265270{{{
    266 type instance SetResult (T a) "foo" Int = T b
     271type instance UpdTy (T a) "foo" Int = T b
    267272}}}
    268273
     
    284289
    285290{{{
    286 type instance SetResult (T a) "foo" (Goo b) = T b
     291type instance UpdTy (T a) "foo" (Goo b) = T b
    287292}}}
    288293
     
    290295
    291296{{{
    292 type instance SetResult (T a) "foo" x = T b
     297type instance UpdTy (T a) "foo" x = T b
    293298}}}
    294299
     
    302307
    303308{{{
    304 type instance SetResult (U a) "bar" (b -> x) = U b
     309type instance UpdTy (U a) "bar" (b -> x) = U b
    305310}}}
    306311
     
    315320
    316321{{{
    317 f :: (Has r "g" Int) => r -> Int
     322f :: r { g :: Int } => r -> Int
    318323f x = g x + 1
    319324}}}
     
    352357
    353358{{{
    354 field (Proxy :: Proxy "foo")
     359field (proxy# :: Proxy# "foo")
    355360}}}
    356361
     
    408413=== Syntactic sugar for `Upd` constraints ===
    409414
    410 Should we have a special syntax for `Upd` constraints, just as `r { x :: t }` sugars `Has r "x" t`? What should it look like? Perhaps something like `r { x ::= t }`?
     415Should we have a special syntax for `Upd` constraints, just as `r { x :: t }` sugars `(Has r "x", FldTy r "x" ~ t)`? What should it look like? Perhaps something like `r { x ::= t }`?
    411416
    412417
     
    416421=== Trouble in paradise ===
    417422
    418 [http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the third parameter of `Has` was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose
     423[http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the `Has` class took three parameters and the third was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose
    419424
    420425{{{
     
    429434}}}
    430435
    431 and `b` is an ambiguous type variable. This shows the need for the `GetResult` type family.
     436and `b` is an ambiguous type variable. This shows the need for the `FldTy` type family.
    432437
    433438
    434439=== Virtual record fields ===
    435440
    436 We could imagine supporting virtual record fields by allowing the user to declare their own instances of `Has` and `GetResult` (and possibly `Upd` and `SetResult`). For example, the user could write the following:
     441We could imagine supporting virtual record fields by allowing the user to declare their own instances of `Has` and `FldTy` (and possibly `Upd` and `UpdTy`). For example, the user could write the following:
    437442
    438443{{{
    439444data Person = MkPerson { firstName :: String, lastName :: String }
    440445
    441 type instance GetResult Person "fullName" = String
    442 instance Has Person "fullName" String where
     446type instance FldTy Person "fullName" = String
     447instance Has Person "fullName" where
    443448  getField _ p = firstName p ++ " " ++ lastName p
    444449}}}
    445450
    446 This means that the `Person` type can be used where a type with a field `fullName` is expected. Since no `SetResult` and `Upd` instances are provided, the field cannot be updated.
     451This means that the `Person` type can be used where a type with a field `fullName` is expected. Since no `Upd` and `UpdTy` instances are provided, the field cannot be updated.
    447452
    448453However, this does not bring `fullName` into scope as a field, [#Scopeissuesorwhywemissdot as previously observed]. Moreover, it is difficult to check the type family instances for consistency. For example, given the following declaration
    449454
    450455{{{
    451 type instance GetResult a "foo" = Int
     456type instance FldTy a "foo" = Int
    452457}}}
    453458
    454459we would need to check that any datatype with a field `foo` in scope gave it the type `Int`. For these reasons, user-defined instances of the classes are not currently permitted, so virtual fields are not available.
    455 
    456 
    457 == Example of constraint solving ==
    458 
    459 Consider the example
    460 
    461 {{{
    462 module M ( R(R, x), S(S, y), T(T, x) ) where
    463 
    464   data R = R { x :: Int }
    465   data S = S { x :: Bool, y :: Bool }
    466   data T = T { x :: forall a . a }
    467 
    468 module N where
    469   import M
    470 
    471   foo e = x e
    472  
    473   bar :: Bool
    474   bar = foo T
    475 
    476   baz = foo S
    477 
    478   quux = y
    479 }}}
    480 
    481 When checking `foo`, `e` is a variable of unknown type `alpha`, and the projection generates the constraint `alpha { x :: beta }` where `beta` is fresh. This constraint cannot be solved immediately, so generalisation yields the type `a { x :: b } => a -> b`.
    482 
    483 When checking `bar`, the application of `foo` gives rise to the constraint `T { x :: Bool }`, which is solved since `Bool` is an instance of `forall a . a` (the type `T` gives to `x`).
    484 
    485 When checking `baz`, the constraint `S { x :: gamma }` is generated and rejected, since the `x` from `S` is not in scope.
    486 
    487 When checking `quux`, the only `y` field in scope is of type `S -> Bool` so that is its type.