Changes between Version 22 and Version 23 of Records/NameSpacing


Timestamp:
Jan 10, 2012 9:20:09 PM (4 years ago)
Author:
GregWeber
Comment:

restructure

  • Records/NameSpacing

    v22 v23  
    1616So the proposed solution for record field names is to specify more precisely which one you mean by using the type name. Note that a data declaration now creates a module-like namespace, so we aren't so much using the type name as using the data type namespace in the same way we use a module namespace.
    1717
    18 So you could say `Record.a` or `RecordClash.a` rather than `a`, to specify which field selector you mean.  The difficulty here is that it's hard to know whether you are writing `<module-name>.f` or `<record-name>.f`.  That is, is `Record` the name of a type or of a module? (Currently it legally could be both.)
    19 
    20  The module/record ambiguity is dealt with in Frege by preferring modules and requiring a module prefix for the record if there is ambiguity. So if your record named Record was inside a module named Record you would need `Record.Record.a`. I think we could improve upon this case to prefer a record rather than the name of the existing module, which should not need to be referenced. So just `Record.a`.
    21 
    22 However, we still have the case of conflicting imports between the names of modules and records. We have the choice of either requiring a module prefix or making this a compilation error. Generally, programmers will avoid this situation by doing what they do now: structuring their programs to avoid name collisions. We can try and give the greater assistance in this regard by providing simpler ways for them to alter the names of import types.
     18So you could say `Record.a` or `RecordClash.a` rather than `a`, to specify which field selector you mean.  This creates a potential ambiguity: did you mean `<module-name>.f` or `<record-name>.f`?  `Record` could be the name of a type or of a module.
     19
     20There are 2 cases to consider:
     211) a record named M is defined inside module M, and both that module and the record are imported
     222) two different modules, M1 and M2, are imported, where M2 defines a record named M1.
     23
     24The module/record ambiguity is dealt with in Frege by preferring modules and requiring a module prefix for the record if there is ambiguity. So for 1) you need M.M.field. For 2) you need M2.M1.field.
     25
     26We could instead prefer a record rather than a module - this would normally be the desired behavior, but then we must figure out how to still be able to access the module. This is particularly the case for modules not marked as qualified - generally all modules are under a verbose namespace and the Module.identifier syntax is only used if the module is first qualified.
     27
     28Generally, programmers will avoid this situation by doing what they do now: structuring their programs to avoid name collisions. We can try to give greater assistance in this regard by providing simpler ways for them to alter the names of imported types.
    2329
    2430One way to avoid the Module.Record.x problem is to use type alias names, for example:
     
    3238
    3339
    34 === Alternative name-spacing techiniques ===
    35 
    36 '''Use the module name space mechanism'''.
    37 But putting each record definition in its own module is a bit heavyweight. So maybe we need local modules (just for name space control) and local import declarations.  Details are unclear. (This was proposed in 2008 in [http://www.haskell.org/pipermail/haskell-cafe/2008-August/046494.html this discussion] on the Haskell cafe mailing list and in #2551. - Yitz).
    38 
    39  Rather than strictly re-use modules it may make more sense to have a name-spacing implementation construct that is shared between both records and modules - hopefully this would make implementation easier and unify behavior. In the Frege approach, each data declaration is its own namespace - if we were to go this far (instead of stopping purely at records) there may be much less need for local namespaces. Overall this seems to be more of an interesting implementation detail than a concrete design proposal relating to records. -- Greg Weber.
    40 
    4140== Getting rid of the Verbosity with the dot operator ==
    4241
    43 We have name-spaces, but the equivalent is already being accomplished by adding prefixes to record fields: `data Record = Record { recordA :: String }`
    44 
    45 Verbosity is solved in Frege by using the dot syntax concept. In `data Record = Record {a::String};r = Record "A"; r.a` The final `r.a` resolves to `Record.a r`.
    46 See below for how we resolve the type of this code.
    47 
    48 === Details on the dot ===
    49 
    50 This proposal requires the current Haskell function composition dot operator to have spaces on both sides. No spaces around the dot are reserved for name-spacing: this use and the current module namespace use. No space to the right would be partial application (see
    51 [http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution TDNR]. The dot operator should bind as tightly as possible.
    52 
    53 Given the dot's expanded use here, plus its common use in custom operators, it is possible to end up with dot-heavy code.
    54 
    55 {{{
    56 quux (y . (foo>.<  bar).baz (f . g)) moo
    57 }}}
    58 
    59 It's not that easy to distinguish from
    60 
    61 {{{
    62 quux (y . (foo>.<  bar) . baz (f . g)) moo
    63 }}}
    64 
    65 What then is the future of the dot if this proposal is accepted? I think the community need to chose among 2 alternatives:
    66 
    67 1) discourage the use of dot in custom operators: `>.<` is bad, use a different character or none: `><`
    68 2) discourage the use of dot for function composition - use a different operator for that task. Indeed, Frege users have the choice between `<~` or the proper unicode dot.
     42We have name-spaces, but it is hard to see how this is better than the current practice of adding prefixes to record fields: `data Record = Record { recordA :: String }`
     43
     44Verbosity is solved in Frege and DDC by using the dot syntax concept. In `data Record = Record {a::String};r = Record "A"; r.a` The final `r.a` resolves to `Record.a r`.
     45
     46This is the TDNR syntax concept. See 'Simple Type Resolution' for how we resolve the type of this code.
     47Also see 'Details on the dot' for a lengthy discussion of the dot.
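
The idea that `r.a` resolves to `Record.a r` can be roughly approximated in GHC 9.2 or later with the `OverloadedRecordDot` and `DuplicateRecordFields` extensions. This is only a sketch and not the mechanism proposed here: GHC resolves the field through a `HasField` constraint rather than through a data-type namespace. The `Int` field of `RecordClash` and the function `describe` are assumptions made for illustration.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE OverloadedRecordDot #-}

-- Two records in one module sharing the field name `a`.
data Record = Record { a :: String }

-- The Int field type is an assumption made for this sketch.
data RecordClash = RecordClash { a :: Int }

-- `r.a` and `c.a` are each resolved from the type of the record,
-- standing in for the proposal's `Record.a r` and `RecordClash.a c`.
-- (`describe` is a hypothetical helper, not part of the proposal.)
describe :: Record -> RecordClash -> String
describe r c = r.a ++ " " ++ show c.a
```

Loaded into GHCi, `describe (Record "A") (RecordClash 1)` evaluates to `"A 1"`, with no prefixes on the field names.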
    6948
    7049
     
    123102=== Increased need for type annotation ===
    124103
    125 This is the only real downside of the proposal. The Frege author says:
     104This is the only real downside of *this* proposal (most downsides discussed here are inherent to any records proposal). The Frege author says:
    126105
    127106I estimate that in 2/3 of all cases one does not need to write `T.e x` in sparsely type annotated code, despite the fact that the frege type checker has a left to right bias and does not yet attempt to find the type of `x` in the code that "follows" the `x.e` construct (after let unrolling etc.) I think one could do better and guarantee that, if the type of `x` is inferrable at all, then so will be `x.e` (Still, it must be more than just a type variable.)
     
    132111  * the function that updates field `x` of data type `T` is `T.{x=}`
    133112  * the function that sets field x in a `T` to `42` is `T.{x=42}`
    134   * If `a::T` then `a.{x=}` and `a.{x=42}` are valid
     113  * If `a::T` then `a.{x=}` and `a.{x=42}` are equivalent to `T.{x=} a` and `T.{x=42} a`
    135114  * the function that changes field x of a T by applying some function to it is `T.{x <-}`
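
For comparison, the update forms listed above can be written as ordinary functions with today's record update syntax. This is a minimal sketch: the names `setX`, `setX42` and `modifyX` and the extra `label` field are hypothetical, and the record-first argument order of `modifyX` is assumed by analogy with `T.{x=} a`.

```haskell
-- A record type T with a field x; `label` is a hypothetical second
-- field, present only to show that updates leave other fields alone.
data T = T { x :: Int, label :: String } deriving (Eq, Show)

-- Stand-in for `T.{x=}`: takes the record, then the new value of x.
setX :: T -> Int -> T
setX t v = t { x = v }

-- Stand-in for `T.{x=42}`: sets field x to 42.
setX42 :: T -> T
setX42 t = t { x = 42 }

-- Stand-in for `T.{x <-}`: changes field x by applying a function to it.
-- The argument order here is an assumption.
modifyX :: T -> (Int -> Int) -> T
modifyX t f = t { x = f (x t) }
```

With these definitions, `a.{x=42}` for `a::T` corresponds to `setX42 a`.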
    136115
     
    177156the new functions `f` and `g` are accessible (only) through R.
    178157So we have a technique for lifting new functions into the Record namespace.
    179 For the initial records implementaion we probably want to maintain `f` and `g` at both the top-level and through the name-space.
    180 See below for a discussion of future directions.
     158For the initial records implementation we definitely want to maintain `f` and `g` at the top-level, but should consider also exposing them through the record name-space. See related discussion below on future directions.
    181159
    182160== Compatibility with existing records ==
     161
     162The new record system can be enabled with `-XNAMESPACEDATA`
    183163
    184164Seems like it should be OK to use old records in the new system playing by the new rules, although those records likely already include some type of prefixing and would be quite verbose.
     
    186166
    187167
    188 == Partial application ==
    189 
    190 see [http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution TDNR] syntax discusion.
    191 `.a r == r.a`
    192 
    193 
    194 == Potential Downside: mixing of 2 styles of code ==
     168                                                                               
     169== Details on the dot ==
     170
     171This proposal requires the current Haskell function composition dot operator to have spaces on both sides. A dot with no space on either side is reserved for name-spacing: both this use and the current module namespace use. A dot with no space to its right would be partial application (see
     172[http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution TDNR]). The dot operator should bind as tightly as possible.
     173
     174=== Partial application ===
     175
     176See the [http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolution TDNR] syntax discussion for an explanation.
     177{{{
     178(.a) r == r.a
     179}}}
     180
     181.x (no space after the dot), for any identifier x, is a postfix operator that binds more tightly than function application, so that parentheses are not usually required.
     182{{{
     183.a r == r.a
     184}}}
     185
     186When there are multiple operators, they chain left to right:
     187{{{
     188(r.a.b.c) == (.c $ .b $ .a r)
     189}}}
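
The identities above can be tried out with GHC's `OverloadedRecordDot` extension (GHC 9.2 or later), under the caveat that GHC only accepts the parenthesised section `(.a)`, not the bare postfix `.a r` proposed here. The types `Outer`, `Middle` and `Inner` and the three functions are hypothetical names for this sketch.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Three nested records, so that r.a.b.c is well typed for r :: Outer.
data Inner  = Inner  { c :: Int }
data Middle = Middle { b :: Inner }
data Outer  = Outer  { a :: Middle }

-- Chained field access, reading left to right.
chained :: Outer -> Int
chained r = r.a.b.c

-- The same access spelled with one selector section per field,
-- mirroring `.c $ .b $ .a r` (parentheses are required by GHC).
sectioned :: Outer -> Int
sectioned r = (.c) $ (.b) $ (.a) r

-- GHC also accepts a single chained selector section.
viaSection :: Outer -> Int
viaSection = (.a.b.c)
```

All three functions return the same value for any `Outer`, which is the equivalence the chaining rule states.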
     190
     191See below for how partial application can allow for different code styles.
     192
     193Question: does this now hold?
     194{{{
     195r.a == r.(Record.a) == r.Record.a
     196}}}
     197
     198
     199
     200=== Dealing with dot-heavy code ===
     201
     202==== Identifying the difference between a name-space dot and function composition ====
     203
     204Given the dot's expanded use here, plus its common use in custom operators, it is possible to end up with dot-heavy code.
     205
     206{{{
     207quux (y . (foo>.<  bar).baz (f . g)) moo
     208}}}
     209
     210It's not that easy to distinguish from
     211
     212{{{
     213quux (y . (foo>.<  bar) . baz (f . g)) moo
     214}}}
     215
     216What then is the future of the dot if this proposal is accepted? The community needs to consider ways to reduce the dot:
     217
     2181) discourage the use of dot in custom operators: `>.<` could be discouraged, use a different character or none: `><`
     219In most cases the dot in custom operators has little to no inherent meaning. Instead it is just the character available for custom operators that takes up the least real-estate. This makes it the best choice for implementing a custom operator modeled after an existing Haskell operator: `.==` or `.<` is normally preferable to `@==` and `@<`.
     220
     2212) discourage the use of dot for function composition - use a different operator for that task. Indeed, Frege users have the choice between `<~` or the proper unicode dot.
     222
     223Discouraging the use of the dot in custom operators makes the example code only slightly better. With the second we now have:
     224
     225{{{
     226quux (y <~ (foo>.<  bar).baz (f <~ g)) moo
     227}}}
     228
     229Very easy to distinguish from
     230
     231{{{
     232quux (y <~ (foo>.<  bar) <~ baz (f <~ g)) moo
     233}}}
     234
     235If you are disgusted by `<~` then you can use the very pretty unicode dot.
     236
     237==== Downside: mixing of 2 styles of code ====
    195238
    196239{{{
     
    201244}}}
    202245
    203 It bothers some that the code does not look like the previous `b a r` - chiefly that the record is now in the middle. Chaining can make this perception even worse: `(e . d) r.a.b.c`
    204 
    205 Is it possible we can have an equivalent of the dot that changes the ordering? `b a.@r` is possible, but requires an operator that binds tightly to the right.
    206 
    207 === Partial Application ===
     246It bothers some that the code does not read strictly left to right as in: `b . a . r`. Chaining can make this even worse: `(e . d) r.a.b.c`
     247
     248===== Solution: Partial Application =====
    208249
    209250Partial application provides a potential solution: `b . .a $ r`
     
    213254Our longer example from above: `e . d . .c . .b . .a`
    214255
    215 At first glance it may look odd, but it is starting to grow on me. Also let us consider real use with longer names:
     256Let us consider real use with longer names:
    216257{{{
    217258echo . delta . .charlie . .beta . .alpha
    218259}}}
    219260
    220 Is there are more convenient syntax for this? `b <.a`
    221 Note that a move to a different operator for function composition (see brief discussion of the dot operator above) would make things much clearer: `b <~ .a`, where the unicode dot might be even nicer
    222 
    223 
    224 == Extending data name-spacing and dot syntax ==
     261Note that a move to a different operator for function composition (see discussion above) would make things much nicer:
     262{{{
     263echo <~ delta <~ .charlie <~ .beta <~ .alpha
     264}}}
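
A rough version of this pipeline can be written today by defining `<~` as a synonym for `(.)` and using GHC's `OverloadedRecordDot` selector sections (GHC 9.2 or later) in place of the proposed bare `.charlie`. Everything here is an assumption for illustration: the fixity chosen for `<~`, the types `Level1` to `Level3`, and the bodies of `echo` and `delta`.

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- `<~` as a plain synonym for (.), given the same fixity as (.).
infixr 9 <~
(<~) :: (b -> c) -> (a -> b) -> a -> c
f <~ g = \v -> f (g v)

-- Hypothetical nested records carrying the fields alpha, beta, charlie.
data Level3 = Level3 { charlie :: Int }
data Level2 = Level2 { beta :: Level3 }
data Level1 = Level1 { alpha :: Level2 }

-- Hypothetical ordinary functions applied after the field accesses.
delta :: Int -> Int
delta = (+ 1)

echo :: Int -> String
echo = show

-- Field selection and ordinary functions in a single composition chain.
-- GHC needs the parentheses around each selector section.
pipeline :: Level1 -> String
pipeline = echo <~ delta <~ (.charlie) <~ (.beta) <~ (.alpha)
```

For example, `pipeline (Level1 (Level2 (Level3 41)))` evaluates to `"42"`. With `<~` for composition, the only dots left in the chain are the field-access ones.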
     265
     266
     267===== Solution: Field selector to the left of the record =====
     268
     269We could have an equivalent of the dot where the field is to the left of the record: `b a@r`
     270Could this also be used in a partial syntax?
     271
     272{{{
     273echo . delta . charlie@ . beta@ . alpha@
     274}}}
     275
     276Can this be shortened to:
     277
     278{{{
     279echo . delta . charlie@beta@alpha@
     280}}}
     281
     282Or would this syntax always need to be applied?
     283
     284{{{
     285echo . delta $ charlie@beta@alpha@r
     286}}}
     287
     288
     289
     290
     291== Extending data name-spacing ==
    225292
    226293This is mostly just something interesting to contemplate.
    227294
    228 Dot syntax does not have to be limited to records (although it probably should be for the initial implementation until this new record system is vetted). I think it is a bad idea to attempt to attempt to extend the dot syntax to accomplish general function chaining through extending the dot syntax. However, it is consistent to extend the function name-spaced to a record data type concept to any data type (as it is in Frege), and use dot syntax for that. The dot (without spaces) *always* means tapping into a namespace (and simple type resolution).
     295Dot syntax does not have to be limited to records (although it probably should be for the initial implementation until this new record system is vetted). I think it is a bad idea to attempt to extend the dot syntax to accomplish general function chaining - we are simply asking too much of the dot right now. However, it is consistent to extend the concept of name-spacing functions under a record data type to any data type (as it is in Frege), and use dot syntax for that. The dot (without spaces) then *always* means tapping into a namespace (and simple type resolution).
    229296
    230297Placing functions within a data name-space can make for nicer data-structure oriented code where the intent is clearer. It can help to achieve the data-oriented goal of OO (without the entanglement of state). With control over how the data namespace is exported (similar to controlling module namespaces), it is possible to create virtual record field setters and getters that can be accessed through dot syntax.
     
    232299Both Frege and the DDC thesis take this approach.
    233300
    234 In this brave new world (see above where typeclass functions are also placed under the namespace of the data), there are few functions that *absolutlely must* be at the top level of a module. Although a library author might take attempt the approach of no top-level functions, obviously it will still be most convenient for users to define functions at the top level of modules rather than have to lift them into data structures.
    235                                                                                                                  
     301In this brave new world (see above where typeclass functions are also placed under the namespace of the data), there are few functions that *absolutely must* be at the top level of a module. Although a library author might attempt the approach of no top-level functions, obviously it will still be most convenient for users to define functions at the top level of modules rather than have to lift them into data structure namespaces.