Changes between Version 27 and Version 28 of Records/OverloadedRecordFields/Plan


Ignore:
Timestamp:
Aug 6, 2013 7:53:28 AM (19 months ago)
Author:
adamgundry
Comment:

telling the latest story

Legend:

Unmodified
Added
Removed
Modified
  • Records/OverloadedRecordFields/Plan

    v27 v28  
    3939A record field constraint is introduced when a field is used in an expression. If every `x` in scope is a record fields, then an occurrence of `x` has type `a { x :: b } => a -> b` instead of generating an ambiguity error. The overloaded `x` is translated using a typeclass, described below. If there are any normal identifiers `x` in scope (as well as fields) then a use of `x` leads to an ambiguity error.  
    4040 
    41 Record field constraints `r { x :: t }` are syntactic sugar for typeclass constraints `Get r "x" t`, where 
     41Record field constraints `r { x :: t }` are syntactic sugar for typeclass constraints `Has r "x" t`, where 
    4242 
    4343{{{ 
    4444type family GetResult (r :: *) (f :: Symbol) :: * 
    4545 
    46 class t ~ GetResult r f => Get r (f :: Symbol) t where 
    47   getFld :: proxy f -> r -> t 
    48 }}} 
    49  
    50 Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getFld (Proxy :: Proxy "x")`. (Actually a slightly more general translation may be used, as [#Lensintegration discussed below].) 
    51  
    52 The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Get r "x" tx, Get r "y" ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Get (T (Maybe b)) "x" [Maybe v])`. 
    53  
    54 Instances for the `Get` typeclass and `GetResult` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type 
     46class t ~ GetResult r f => Has r (f :: Symbol) t where 
     47  getField :: proxy f -> r -> t 
     48}}} 
     49 
     50Recall that `Symbol` is the kind of type-level strings. Roughly speaking, an occurrence of a field name `x` is translated into `getField (Proxy :: Proxy "x")`. (Actually a slightly more general translation is used, as [#Lensintegration discussed below].) 
     51 
     52The syntactic sugar extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x" tx, Has r "y" ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x" [Maybe v])`. 
     53 
     54Instances for the `Has` typeclass and `GetResult` type family are automatically generated (for modules with `-XOverloadedRecordFields` enabled) using the record fields that are in scope. For example, the data type 
    5555 
    5656{{{ 
     
    6363type instance GetResult (T a) "x" = [a] 
    6464 
    65 instance (b ~ [a]) => Get (T a) "x" b where 
    66   getFld _ (MkT x) = x 
    67 }}} 
    68  
    69 The bare type variable `b` in the instance head is important, so that we get an instance match from the first two parameters only, then the equality constraint `(b ~ [a])` improves `b`. For example, if the constraint `Get (T c) "x" d` is encountered during type inference, the instance will match and generate the constraints `(a ~ c, b ~ d, b ~ [a])`. Moreover, the `GetResult` type family ensures that the third parameter is functionally dependent on the first two, which is needed to [#Troubleinparadise avoid ambiguity errors when composing overloaded fields]. 
     65instance (b ~ [a]) => Has (T a) "x" b where 
     66  getField _ (MkT x) = x 
     67}}} 
     68 
     69The bare type variable `b` in the instance head is important, so that we get an instance match from the first two parameters only, then the equality constraint `(b ~ [a])` improves `b`. For example, if the constraint `Has (T c) "x" d` is encountered during type inference, the instance will match and generate the constraints `(a ~ c, b ~ d, b ~ [a])`. Moreover, the `GetResult` type family ensures that the third parameter is functionally dependent on the first two, which is needed to [#Troubleinparadise avoid ambiguity errors when composing overloaded fields]. 
    7070 
    7171The reason for using a three-parameter class, rather than just two parameters and a type family, is to support the syntactic sugar. With a two-parameter class we could easily end up inferring types like the following, and it would be hard to reapply the sugar: 
    7272 
    7373{{{ 
    74 f :: (Get r "x", Get "y", GetResult r "x" ~ Int, GetResult r "y" ~ Int) => r -> Int 
     74f :: (Has r "x", Has r "y", GetResult r "x" ~ Int, GetResult r "y" ~ Int) => r -> Int 
    7575f r = x r + y r :: Int 
    7676}}} 
     
    7979=== Representation hiding === 
    8080 
    81 At present, a datatype in one module can declare a field, but if the selector function is not exported, then the field is hidden from clients of the module. It is important to support this. Typeclasses in general have no controls over their scope, but for implicitly generated `Get` instances, the instance is available for a module if `-XOverloadedRecordFields` is enabled for that module and the record field selector function is in scope. Instances are generated on the client side, rather than being exported from the defining module. 
     81At present, a datatype in one module can declare a field, but if the selector function is not exported, then the field is hidden from clients of the module. It is important to support this. Typeclasses in general have no controls over their scope, but for implicitly generated `Has` instances, the instance is available for a module if `-XOverloadedRecordFields` is enabled for that module and the record field selector function is in scope. Instances are generated on the client side, rather than being exported from the defining module. 
    8282 
    8383This enables representation hiding: just like at present, exporting the field selector permits access to the field. For example, consider the following module: 
     
    9090}}} 
    9191 
    92 Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Get R "x" Int` will be available but the instance `Get S "x" Bool` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
     92Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x" Int` will be available but the instance `Has S "x" Bool` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
    9393 
    9494 
    9595=== Multiple modules and automatic instance generation === 
    9696 
    97 Note that `Get` instances are generated on a per-module basis, using the fields that are in scope for that module, and automatically generated instances are never exported. Thus it doesn't matter whether `-XOverloadedRecordFields` was on in the module that defined the datatype. The availability of the instances in a particular module depends only on whether the flag is enabled for that module. 
     97Note that `Has` instances are generated on a per-module basis, using the fields that are in scope for that module, and automatically generated instances are never exported. Thus it doesn't matter whether `-XOverloadedRecordFields` was on in the module that defined the datatype. The availability of the instances in a particular module depends only on whether the flag is enabled for that module. 
    9898 
    9999Suppose module `M` imports module `N`, `N` imports module `O`, and only `N` has the extension enabled. Now `N` can project any field in scope (including those defined in `O`), but `M` cannot access any `Get` instances.  
    100100 
    101101This means that 
    102  * the extension is required whenever a `Get` constraint must be solved; 
     102 * the extension is required whenever a `Has` constraint must be solved; 
    103103 * no new mechanism for hiding instances is required; and 
    104104 * records defined in existing modules (or other packages) without the extension can still be overloaded. 
     
    117117Bidirectional type inference for higher-rank types relies on inferring the type of functions, so that types can be pushed in to the arguments. However, the type of an overloaded field cannot immediately be inferred (as some constraint solving is required). This is why higher-rank and overloaded fields are incompatible. 
    118118 
    119 Some previous variants of the design supported rank-1 universally quantified fields (but not rank-2 and above). However, these prevent the third parameter of the `Get` class from being a function of the first two, and hence obstruct type inference for compositions of selectors. 
     119Some previous variants of the design supported rank-1 universally quantified fields (but not rank-2 and above). However, these prevent the third parameter of the `Has` class from being a function of the first two, and hence obstruct type inference for compositions of selectors. 
    120120 
    121121 
     
    142142 
    143143 
     144=== Limited type-changing update === 
     145 
     146As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class, which permits type-changing update of single fields: 
     147 
     148{{{ 
     149type family SetResult (r :: *) (f :: Symbol) (a :: *) :: * 
     150 
     151class Has r f (GetResult r f) => 
     152          Upd (r :: *) (f :: Symbol) (a :: *) where 
     153  setField :: proxy f -> r -> a -> SetResult r f a 
     154}}} 
     155 
     156For example, the datatype `T` above would give rise to these instances: 
     157 
     158{{{ 
     159type instance SetResult (T a) "x" [c] = T c 
     160 
     161instance (b ~ [c]) => Upd (T a) "x" b where 
     162  setField _ r e = r { x = e } 
     163}}} 
     164 
     165If a type variable is shared by multiple fields, it cannot be changed using `setField`. Moreover, the use of the `SetResult` type family means that phantom type variables cannot be changed. For example, 
     166 
     167{{{ 
     168data U a b c = MkU { foo :: (a, b), bar :: a } 
     169}}} 
     170 
     171an update to `foo` must keep `a` and `c` the same, since `a` occurs in the 
     172type of `bar`, and `c` does not occur in the type of `foo`, but the update may change `b`.  Thus we generate: 
     173 
     174{{{ 
     175type instance SetResult (T a b c) "foo" (a, b') = T a b' c 
     176 
     177instance t ~ (a, b') => Upd (T a b c) "foo" t where 
     178  setField _ r e = r { foo = e } 
     179}}} 
     180 
     181 
     182=== Lens integration === 
     183 
     184It was implied above that a field like `foo` translates into `getField (Proxy :: Proxy "foo") :: Has r "foo" t => r -> t`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using 
     185 
     186{{{ 
     187field :: (Has r f t, Accessor p f) => proxy f -> p r t 
     188field z = accessor z (getField z) (setField z) 
     189}}} 
     190 
     191to translate `foo` to `field (Proxy :: Proxy "foo") :: (Has r "foo" t, Accessor p "foo") => p r t`. The `Accessor` class is defined thus: 
     192 
     193{{{ 
     194class Accessor (p :: * -> * -> *) (f :: Symbol) where 
     195  accessor :: proxy f -> (r -> GetResult r f) -> 
     196              (forall a . Upd r f a => r -> a -> SetResult r f a) -> 
     197              p r (GetResult r f) 
     198}}} 
     199 
     200An instance of `Accessor p f` means that `p` may contain a getter and setter for the field `f`. In particular, we can give an instance for functions that ignores `f` and the setter completely: 
     201 
     202{{{ 
     203instance Accessor (->) f where 
     204  accessor _ getter setter = getter 
     205}}} 
     206 
     207Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. If `z` is a proxy of type `Proxy "foo"`, then `foo` translates to `field z`, which computes to `accessor z (getField z) (setField z)`, and hence to `getField z` by the `Accessor` instance for functions. 
     208 
     209However, `p` does not have to be the function arrow. Suppose the `lens` library defined the following newtype wrapper: 
     210 
     211{{{ 
     212newtype WrapLens f r a 
     213  = MkWrapLens (forall b . Upd r f b => Lens r (SetResult r f b) a b) 
     214 
     215instance f ~ g => Accessor (WrapLens f) g where 
     216  accessor _ getter setter = MkWrapLens (\ w s -> setter s <$> w (getter s)) 
     217 
     218fieldLens :: Upd r f b => WrapLens f r a -> Lens r (SetResult r f b) a b 
     219fieldLens (MkWrapLens l) = l 
     220}}} 
     221 
     222Now `fieldLens foo` is a lens whenever `foo` is an overloaded record field. 
     223 
     224Other lens libraries can define their own instances of `Accessor`, even if they do not support type-changing update, and the same machinery enables fields to be used with them. 
     225 
     226 
    144227== Design choices == 
    145228 
     
    165248 
    166249 
    167 === Lens integration === 
    168  
    169 As noted above, supporting a polymorphic version of the existing record update syntax (in its full generality) is difficult. However, we can generate instances of the following class, which permits type-changing update of single fields: 
    170  
    171 {{{ 
    172 type family SetResult (r :: *) (f :: Symbol) (a :: *) :: * 
    173  
    174 class Set (r :: *) (f :: Symbol) (a :: *) where 
    175   setFld :: proxy f -> r -> a -> SetResult r f a 
    176 }}} 
    177  
    178 For example, the datatype `T` above would give rise to these instances: 
    179  
    180 {{{ 
    181 type instance SetResult (T a) "x" [c] = T c 
    182  
    183 instance (b ~ [c]) => Set (T a) "x" b where 
    184   setFld _ (MkT _) y = MkT y 
    185 }}} 
    186  
    187 It was implied above that a field like `foo` translates into `getFld (Proxy :: Proxy "foo") :: Get r "foo" t => r -> t`, but this is not quite the whole story. We would like fields to be usable as lenses (e.g. using the [http://hackage.haskell.org/package/lens lens] package). This requires a slightly more general translation, using 
    188  
    189 {{{ 
    190 field :: (Get r f t, Accessor p f) => proxy f -> p r t 
    191 field z = accessor z (getFld z) (setFld z) 
    192 }}} 
    193  
    194 to translate `foo` to `field (Proxy :: Proxy "foo") :: (Get r "foo" t, Accessor p "foo") => p r t`. The `Accessor` class is defined thus: 
    195  
    196 {{{ 
    197 class Accessor (p :: * -> * -> *) (f :: Symbol) where 
    198   accessor :: proxy f -> (r -> GetResult r f) -> 
    199               (forall a . Set r f a => r -> a -> SetResult r f a) -> 
    200               p r (GetResult r f) 
    201 }}} 
    202  
    203 An instance of `Accessor p f` means that `p` may contain a getter and setter for the field `f`. In particular, we can give an instance for functions that ignores `f` and the setter completely: 
    204  
    205 {{{ 
    206 instance Accessor (->) f where 
    207   accessor _ getter setter = getter 
    208 }}} 
    209  
    210 Thus, whenever a field `foo` is used at a function type (by applying it or composing it, for example), this instance will be selected. If `z` is a proxy of type `Proxy "foo"`, then `foo` translates to `field z`, which computes to `accessor z (getFld z) (setFld z)`, and hence to `getFld z` by the `Accessor` instance for functions. 
    211  
    212 However, `p` does not have to be the function arrow. Suppose the `lens` library defined the following newtype wrapper: 
    213  
    214 {{{ 
    215 newtype WrapLens f r a 
    216   = MkWrapLens (forall b . Set r f b => Lens r (SetResult r f b) a b) 
    217  
    218 instance f ~ g => Accessor (WrapLens f) g where 
    219   accessor _ getter setter = MkWrapLens (\ w s -> setter s <$> w (getter s)) 
    220  
    221 fieldLens :: Set r f b => WrapLens f r a -> Lens r (SetResult r f b) a b 
    222 fieldLens (MkWrapLens l) = l 
    223 }}} 
    224  
    225 Now `fieldLens foo` is a lens whenever `foo` is an overloaded record field. 
    226  
    227  
    228 === Trouble in paradise === 
    229  
    230 [http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the third parameter of `Get` was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose 
    231  
    232 {{{ 
    233 foo :: Get b "foo" c => b -> c 
    234 bar :: Get a "bar" b => a -> b 
    235 }}} 
    236  
    237 then 
    238  
    239 {{{ 
    240 foo . bar :: (Get a "bar" b, Get b "foo" c) => a -> c 
    241 }}} 
    242  
    243 and `b` is an ambiguous type variable. This shows the need for the `GetResult` type family. 
    244  
    245  
    246 === User-defined `Get` instances === 
    247  
    248 Should the user be allowed to write explicit `Get` instances? For example: 
    249  
    250 {{{ 
    251 instance ctx => Get r "x" t where 
    252   getFld = blah :: r -> t 
    253 }}} 
    254  
    255 Even with an explicit `Get` instance as above, the name `x` will not be in scope unless a datatype has a field with name `x`. Thus it is not really useful. The previous proposal, where `(.x)` always meant "project out the `x` field", used explicit `Has` instances for virtual fields.  
    256  
    257  
    258250=== Hiding record selectors ===  
    259251 
     
    275267 
    276268but with the property that it will not clash with actual `x` fields. 
     269 
     270 
     271=== Syntactic sugar for `Upd` constraints === 
     272 
     273Should we have a special syntax for `Upd` constraints, just as `r { x :: t }` sugars `Has r "x" t`? What should it look like? Perhaps something like `r { x ::= t }`? 
     274 
     275 
     276== Remarks == 
     277 
     278 
     279=== Trouble in paradise === 
     280 
     281[http://www.haskell.org/pipermail/glasgow-haskell-users/2013-July/022584.html Edward Kmett points out] that a previous version of this proposal, where the third parameter of `Get` was not functionally dependent on the first two, fell short in an important respect: composition of polymorphic record fields would lead to ambiguity errors, as the intermediate type cannot be determined. For example, suppose 
     282 
     283{{{ 
     284foo :: Has b "foo" c => b -> c 
     285bar :: Has a "bar" b => a -> b 
     286}}} 
     287 
     288then 
     289 
     290{{{ 
     291foo . bar :: (Has a "bar" b, Has b "foo" c) => a -> c 
     292}}} 
     293 
     294and `b` is an ambiguous type variable. This shows the need for the `GetResult` type family. 
     295 
     296 
     297=== User-defined `Has` instances === 
     298 
     299The user can write explicit `Has` instances, and they are scoped normally. For example: 
     300 
     301{{{ 
     302instance ctx => Has r "x" t where 
     303  getField = blah :: proxy "x" -> r -> t 
     304}}} 
     305 
     306Even with an explicit `Has` instance as above, the name `x` will not be in scope unless a datatype has a field with name `x`. This is likely to make it less useful. However, it does allow virtual fields to be declared. 
    277307 
    278308