Changes between Version 7 and Version 8 of Records/OverloadedRecordFields/Plan


Ignore:
Timestamp:
Jun 19, 2013 8:18:42 AM (10 months ago)
Author:
adamgundry
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Records/OverloadedRecordFields/Plan

    v7 v8  
    2626}}} 
    2727 
     28 
     29 
    2830=== Record field constraints ===  
    2931 
    30 '''AMG''' the following needs updating. 
    31  
    32 If the flag `-XOverloadedRecordFields` is enabled, a new form of 'record field' constraints `r { x :: t } :: Constraint` are available where `r` and `t` are types, while `x` is a literal string.  These are not typeclass constraints and do not have instances as such (though see the discussion below about virtual record fields). They can appear in contexts in the usual way (that is, they are a new primitive, like equality constraints or implicit parameter constraints). Multiple fields may be written comma-separated in a single constraint as in `r { x :: t, y :: u }`. 
    33  
    34 '''SLPJ note'''.  I think it's more helpful to introduce it as a type-class constraint that happens to have convenient concrete syntax: 
    35 {{{ 
    36 class Has (r::*) (s::Symbol) (t::*) where 
     32Record field constraints `r { x :: t }` are syntactic sugar for typeclass constraints `Has r "x" t`, where 
     33 
     34{{{ 
     35class Has (r :: *) (x :: Symbol) (t :: *) where 
    3736  getFld :: r -> t 
    38  
    39 -- data T a = MkT { x :: [a] } 
    40 instance (b ~ [a]) => Has (T a) "r" b where 
     37}}} 
     38 
     39Recall that `Symbol` is the kind of type-level strings. The notation extends to conjunctions:  `r {x :: tx, y :: ty}` means `(Has r "x" tx, Has r "y" ty)`. Note also that `r` and `t` might be arbitrary types, not just type variables or type constructors.  For example, `T (Maybe v) { x :: [Maybe v] }` means `(Has (T (Maybe b)) "x" [Maybe v])`. 
     40 
     41Instances for the `Has` typeclass are implicitly generated,  corresponding to fields in datatype definitions, when the flag `-XOverloadedRecordFields` is enabled. For example, the data type 
     42 
     43{{{ 
     44data T a = MkT { x :: [a] } 
     45}}} 
     46 
     47has the corresponding instance 
     48 
     49{{{ 
     50instance (b ~ [a]) => Has (T a) "x" b where 
    4151  getFld (MkT { x = x }) = x 
    4252}}} 
    43 The `(b ~ [a])` in the instance is important, so that we get an instance match from the first two fields only. 
    44  
    45 Then we have convenient syntax for `(Has r "x" t)`, namely `r { x::t }`.  Moreover, we get convenient syntax for conjunctions: `(Has r "x" tx, Has r "y" ty)` has shorthand `r { x::tx, y:: ty }`. 
    46  
    47 Don't forget that the `r` might be an arbitrary type not just a type variable or type constructor.  For example, `(Has (T (Maybe b)) "x" [Maybe v])` is a perfectly fine (and soluble) constraint.  I suppose that its shorthand looks like `T (Maybe v) { x :: [Maybe v] }` 
    48 '''End of SLPJ''' 
    49  
    50 === Record field projections ===  
    51  
    52 Record field constraints are introduced by projections, which are written using a new left-associative infix operator `(.>)`.  That is, if `e :: r` then `e.>x :: r { x :: t } => t`.  This operator must always be applied to (at least) its second argument, so `(.>)` is invalid but `(.>x) :: forall a b . a { x :: b } => a -> b`.  
    53  
    54 A constraint `R { x :: t }` is solved if `R` is a datatype that has a field `x` of type `t` in scope. (More precisely, if `R` contains `x :: s` then `t` must be an instance of `s`.) An error is generated if `R` has no field called `x`, it has the wrong type, or the field is not in scope. Otherwise, the new constraints are handled by the solver just like other types of constraint. 
     53 
     54The `(b ~ [a])` in the instance is important, so that we get an instance match from the first two fields only. For example, if the constraint `Has (T c) "x" d` is encountered during type inference, the instance will match and generate the constraints `(a ~ c, b ~ d, b ~ [a])`. 
    5555 
    5656If multiple constructors for a single datatype use the same field name, all occurrences must have exactly the same type, as at present. 
    5757 
    58 === Record selectors ===  
    59  
    60 Optionally, we could [wiki:Records/DeclaredOverloadedRecordFields/NoMonoRecordFields add a flag `-XNoMonoRecordFields`] to disable the generation of the usual monomorphic record field selector functions.  This is not essential, but would free up the namespace for other record systems (e.g. '''data-lens'''). Note that `-XOverloadedRecordFields` will generate monomorphic selectors by default for backwards compatibility reasons, but they will not be usable if multiple selectors with the same name are in scope. 
    61  
    62 When either flag is enabled, the same field label may be declared repeatedly in a single module (or a label may be declared when a function of that name is already in scope). 
     58A constraint `R { x :: t }` is solved if `R` is a datatype that has a field `x` of type `t` in scope. An error is generated if `R` has no field called `x`, it has the wrong type, or the field is not in scope.  
     59 
     60 
     61=== Projections: the dot operator ===  
     62 
     63Record field constraints are introduced by projections, which are written using the dot operator with no space following it.  That is, if `e :: r` then `e.x :: r { x :: t } => t`.  The right section `(.x) :: a { x :: b } => a -> b` is available but the left section `(e.)` is not (what would its type be?). 
     64 
     65The composition operator must be written with spaces on both sides, for consistency. This will break old code, but only when the `-XOverloadedRecordFields` extension is enabled. There is no ambiguity, and dot notation is already space-aware: `M.x` is a qualified name whereas `M . x` is the composition of a data constructor `M` with a function `x`.  Similarly `e.x` can mean record projection, distinct from `e . x`. Note that dot (for qualified names or record projection) binds more tightly than function application, so `f e.x` means the same as `f (e.x)`. Parentheses can be used to write `(f e).x`. 
    6366 
    6467 
    6568=== Representation hiding === 
    6669 
    67 Since the new constraints are '''not''' typeclass constraints, it is reasonable for the constraint solver to consult the fields in scope when deciding whether a solution is valid.  
    68  
    69 '''SLPJ''' As above, I'd like to say that they are just type-class constraints with special syntax.  However, maybe their instances (unlike most type-class instances) can be limited in scope; the instance is in scope iff the record field selector function is.  (Um; this sentence doesn't make so much sense if we suppress the record field selectors.) '''End of SLPJ''' 
     70At present, a datatype in one module can declare a field, but if the selector function is not exported, then the field is hidden from clients of the module. It is important to support this. Typeclasses in general have no controls over their scope, but for implicitly generated `Has` instances, the instance is in scope iff the record field selector function is. 
    7071 
    7172This enables representation hiding: exporting the field selector is a proxy for permitting access to the field. For example, consider the following module: 
     
    7879}}} 
    7980 
    80 Any module that imports `M` will have access to the `x` field from `R` but not from `S`. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
     81Any module that imports `M` will have access to the `x` field from `R` but not from `S`, because the instance `Has R "x" Int` will be in scope but the instance `Has S "x" Bool` will not be. Thus `R { x :: Int }` will be solved but `S { x :: Bool }` will not. 
     82 
     83 
     84=== Record selectors ===  
     85 
     86Optionally, we could [wiki:Records/DeclaredOverloadedRecordFields/NoMonoRecordFields add a flag `-XNoMonoRecordFields`] to disable the generation of the usual monomorphic record field selector functions.  This is not essential, but would free up the namespace for other record systems (e.g. '''lens'''). Note that `-XOverloadedRecordFields` will generate monomorphic selectors by default for backwards compatibility reasons, but they will not be usable if multiple selectors with the same name are in scope. 
     87 
     88When either flag is enabled, the same field label may be declared repeatedly in a single module (or a label may be declared when a function of that name is already in scope). 
     89 
     90Even if the selector functions are suppressed, we still need to be able to mention the fields in import and export lists, to control access to them (as discussed in the previous section). 
     91 
     92'''AMG''' perhaps we should also have a flag to automatically generate the polymorphic record selectors? These are slightly odd: if two independent imported modules declare fields with the same label, only a single polymorphic record selector should be brought into scope. 
    8193 
    8294 
     
    88100 * records may include higher-rank components. 
    89101 
    90 These problems have already been [wiki:Records/OverloadedRecordFields#Recordupdates described in some detail]. In the interests of doing something, even if imperfect, we plan to add a new syntax for monomorphic record update. This allows overloaded fields to be updated, though it requires the data constructor to be specified and cannot be abstracted over. For example, 
    91  
    92 {{{ 
    93 R { e | x = t } 
    94 }}} 
    95  
    96 means the same as `e { x = t }` except that the type is determined from the data constructor `R`, rather than the field name `x`. Thus it can be used where the latter is ambiguous. 
    97  
    98 '''SLPJ. Not the ''data'' constructor, for sure.  Possibly the type constructor.  But in any case, this is mean to be a disambiguation of 
    99 {{{ 
    100   e { x = t } 
    101 }}} 
    102 so surely it should look like 
    103 {{{ 
    104   e {T| x = t } 
    105 }}} 
    106 where `T` is the type constructor.   
    107  
    108 And there's a design choice here too.  Rather than special syntax we could resolve the ambiguity if you put a type signature in one of these two places: 
     102These problems have already been [wiki:Records/OverloadedRecordFields#Recordupdates described in some detail]. In the interests of doing something, even if imperfect, we plan to support only monomorphic record update. For overloaded fields to be updated, a type signature may be required in order to specify the type being updated. For example, 
     103{{{ 
     104e { x = t } 
     105}}} 
     106currently relies on the name `x` to determine the datatype of the record. If this is ambiguous, a type signature can be given either to `e` or to the whole expression. Thus either 
    109107{{{ 
    110108  e :: T Int { x = t } 
     109}}} 
    111110or 
     111{{{ 
    112112  e { x = t } :: T Int 
    113113}}} 
    114 That's less invasive syntactially, and still does the job. 
    115 '''End of SLPJ''' 
     114will be accepted. (Really only the type constructor is needed, whereas this approach requires the whole type to be specified, but it seems simpler than inventing a whole new syntax.) 
     115 
    116116 
    117117== Design choices == 
    118118 
    119119 
    120 === The projection operator === 
    121  
    122 As currently drafted, this proposal advocates using a new operator `.>` rather than changing the meaning of the dot operator, for reasons of backward compatibility and avoidance of a whole host of tricky parsing issues. This could change, if it is felt that the benefits of mimicking other languages outweigh the drawbacks of breaking backwards compatibility. 
    123  
    124 '''SLPJ'''.  I don't agree here.  Dot-notation is so convenient and so universal that I think we should use it.  And there isn't any ambiguity.  Dot notation is already space-aware: `M.x` is a qualified name whereas `M . x` is the composition of a data constructor `M` with a function `x`.  Similarly `r.x` can mean record selection, distinct from `r . x`. 
    125 '''End of SLPJ'''. 
     120=== Record update: avoiding redundant annotations === 
     121 
     122If `e` is a variable, whose type is given explicitly in the context, we could look it up rather than requiring it to be given again. Thus 
     123{{{ 
     124f :: T Int -> T Int 
     125f v = v { x = 5 } 
     126}}} 
     127would not require an extra annotation. On the other hand, we would need an annotation on the update in   
     128{{{ 
     129  \v -> (v { x = 4 }, [v, w :: T Int]) 
     130}}} 
     131because the type of `v` is only determined later, by constraint solving. 
     132 
     133Annoyingly, nested updates will require some annotations. In the following example, the outer update need not be annotated (since `v` is a variable that is explicitly given a type by the context) but the inner update must be (since `v.x` is not a variable): 
     134{{{ 
     135  f :: T Int -> T Int 
     136  f v = v { x = v.x { y = 6 } } 
     137}}} 
    126138 
    127139 
    128140=== Virtual record fields === 
    129141 
    130 The design presented above does not include virtual record fields, but it is easy to add them, by permitting typeclass-like instances: 
    131  
    132 {{{ 
    133 instance ctx => r { x :: t } where 
    134   get = blah :: r -> t 
    135 }}} 
    136  
    137 The constraint solver can be extended to consider such virtual fields.  Note that if `r` is a datatype with a field `x`, the virtual field will overlap. The usual rules about overlap checking apply. 
     142It is easy to support virtual record fields, by permitting explicit `Has` instances: 
     143 
     144{{{ 
     145instance ctx => Has r "x" t where 
     146  getFld = blah :: r -> t 
     147}}} 
     148 
     149Note that if `r` is a datatype with a field `x`, the virtual field will overlap, and the usual rules about overlap checking apply. Explicit instances follow the usual instance scope rules, so a virtual record field instance is always exported and imported. 
     150 
     151`Has` constraints are slightly more general than the syntactic sugar suggests: one could imagine building constraints of the form `Has r l t` where `l` is non-canonical, for example a variable or type family. It's hard to imagine uses for such constraints, though. One idea is giving virtual fields of all possible names to a type: 
     152 
     153{{{ 
     154instance Has T l () where 
     155  getFld _ = () 
     156}}} 
    138157 
    139158 
     
    143162 
    144163{{{ 
    145 foo = \ e -> e.>x 
     164foo = \ e -> e.x 
    146165}}} 
    147166 
    148167would naturally be assigned a polymorphic type. If there is only one `x` in scope, perhaps the constraint solver should pick that one (analogously to the other defaulting rules). However, this would mean that bringing a new `x` into scope (e.g. adding an import) could break code. Of course, it is already the case that bringing new identifiers into scope can create ambiguity! 
    149168 
     169For example, suppose the following definitions are in scope: 
     170{{{ 
     171data T = MkT { x :: Int, y :: Int } 
     172data S = MkS { y :: Bool } 
     173}}} 
     174 
     175Inferring the type of `foo = \ e -> e.x` results in `alpha -> beta` subject to the constraint `alpha { x :: beta }`. However, the monomorphism restriction prevents this constraint from being generalised. There is only one `x` field in scope, so defaulting specialises the type to `T -> Int`. If the `y` field was used, it would instead give rise to an ambiguity error. 
     176 
     177 
    150178 
    151179== Implementation details == 
    152180 
    153 We add a new family of constraints 
    154  
    155 {{{ 
    156 Has :: * -> Symbol -> * -> Constraint 
    157 }}} 
    158  
    159 where `r { x :: t }` is syntactic sugar for `Has r "x" t`, and `Symbol` is the kind of type-level strings. Evidence for `Has r l t` is just a projection function of type `r -> t`. Thus it is exactly the same as if `Has` were defined via a typeclass 
    160  
    161 {{{ 
    162 class Has r (l :: Symbol) t where 
    163   get :: r -> t 
    164 }}} 
    165  
    166 However, evidence is supplied by the constraint solver, taking into account the fields (or virtual field instances) that are in scope. These are not just typeclass constraints. 
    167  
    168181 
    169182=== Example of constraint solving === 
     183 
     184'''AMG''' need to update these examples. 
    170185 
    171186'''SLPJ''' Making the first example rely on the monomorphism restriction is not a good plan! 
     
    183198  import M 
    184199 
    185   foo e = e.>x 
    186  
    187   qux = (.>y) 
     200  foo e = e.x 
     201 
     202  qux = (.y) 
    188203  
    189204  bar :: Bool 
     
    201216 
    202217When checking `baz`, the constraint `S { x :: gamma }` is generated and rejected, since the `x` from `S` is not in scope. 
    203  
    204 === Remark === 
    205  
    206 `Has` constraints are slightly more general than the syntactic sugar suggests: one could imagine building constraints of the form `Has r l t` where `l` is non-canonical, for example a variable or type family. It's hard to imagine uses for such constraints, though. One idea is giving virtual fields of all possible names to a type: 
    207  
    208 {{{ 
    209 instance Has T l () where 
    210   get _ = () 
    211 }}}