wiki:Records/SyntaxDirectedNameResolution

Version 3 (modified by elaforge, 2 years ago) (diff)

--

Syntax Directed Name Resolution

The idea is a # prefix for identifiers. #a is a "type directed function". It requires the type of its argument to be monomorphic. An application #a :: forall x. M.X -> x is desugared to M.a :: M.X -> Y where Y depends on the concrete type of M.a. All it does is strip the module part off off the argument type, and look for the symbol a in that module. The module must be imported qualified for the resolution to happen, so compiling a certain module only needs to look at its direct dependents.

Everything else remains the same. Records wishing the same name must live in separate modules, but field access looks like: (#b . #a) record. People who like unqualified imports can still use them, since import A.B.C will bring A.B.C.a into scope as well as a. If there are two as from separate modules, an unqualified a will give an ambiguous error as always, but #a will use its argument type to automatically qualifiy it to the right module. Since #a is desugared to A.B.C.a it's a separate symbol from a and they can both exist in the same module provided, of course, that you imported A.B.C qualified.

This is enough for field access, but to support convenient modification, some notion of lenses has to be built in and the rule has to be modified, from #a :: forall x. M.X -> x ==> M.a :: M.X -> Y to look at the first argument of a Lens class: #a :: forall x. Lens M.X x ==> M.a :: Lens M.X Y. Effectively this is just substituting Lens for (->). Maybe it could be extended to look at the first argument of any type constructor forall m x. m M.X x, that way the syntax can be used to resolve not just record fields, but any function, or any member of Category, or whatever.

Here's an example using plain function "# resolution":

import qualified Inner
import qualified Outer
-- data Outer.Outer = Outer { a :: Inner.Inner }
-- data Inner.Inner = Inner { b :: Int }

record :: Outer.Outer
record = Outer.Outer (Inner.Inner 42)

val :: Int
val = (#b.#a) record

-- Type of 'record' must already be known:
(#b.#a) :: Outer.Outer -> Int
-- Due to the definition of (.) and its return type being known:
#a :: Outer.Outer -> b
-- New "# Resolution" rule: The first argument is in module Outer, so
-- resolve to Outer.a, and now we know the full type of Outer.a:
Outer.a :: Outer.Outer -> Inner.Inner
-- Due to the definition of (.) and its return type being known:
#b :: Inner.Inner -> c
-- # Resolution:
Outer.b :: Inner.Inner -> Int
-- So the Ints match, and val is (effectively) rewritten as:
val = (Inner.b . Outer.a) record

This works to set fields of a record as well. Let M.a be a function from a record to a lens focused on a certain field of that record. Now you can write:

M.a :: M.Record -> Lens M.Record Int
set :: record -> Lens record field -> field -> record

record :: M.Record
record = M.Record (6*9)

val :: M.Record
val = set (#a record) 42

-- Due to the type of set:
(#a record) :: record -> Lens record field -> record -> record
-- Type of 'record' must already be known:
#a :: M.Record -> b
-- # Resolution:
M.a :: M.Record -> Lens M.Record Int
-- The Ints match, and val is:
val = set (M.a record) 42

Unfortunately, it doesn't work with a lens get:

get :: Lens record field -> record -> field
val = get #a record
#a :: Lens M.Record field
-- Oops, can't apply # reduction, because its first argument type is 'Lens ...'
-- not 'M.Record ...'.

The # resolution needs the #a to be a function with a known first argument type. We could say that's fine because lenses are only really needed for setting since getting is already a plain composable function, and write gets like #a_ record. But that's ugly and unsatisfying, and you can't pass around the getter and setter as one unit.

Also, composed set will run into problems. So lets build lenses into the language. Two changes: as a convenience, deriving (Lens) will generate lenses for the fields instead of get functions. And, the # reduction will treat an argument type of Lens a b specially, requiring that the type of a being known, and looking in its module:

M.hs:

data M.Record = Record { a :: Int } deriving (Lens)
import qualified M

get :: Lens record field -> record -> field
val = get #a record

#a :: Lens M.Record field
-- # resolution on Lens:
M.a :: Lens M.Record Int
-- val is
val = get M.a record

Let's try with composed lenses.

set ((#b.#a) record) 42 should become set ((Inner.b . Outer.a) record) 42

Outer.hs:

data Outer = Outer { a :: Inner.Inner } deriving (Lens)

Inner.hs:

data Inner = Inner { b :: Int } deriving (Lens)

Main.hs:

-- A lens composition operator.  Most lens libraries overload (.) with this,
-- so we'd either need to make a new operator or move Control.Category into
-- the Prelude.  Let's go with a new operator:
(!) :: Lens b c -> Lens a b -> Lens a c

-- This is the more usual argument order for 'set'.  It makes it easier to
-- partially apply as well:
set :: Lens record field -> field -> record -> record

record :: Outer.Outer
record = Outer.Outer (Inner.Inner 42)

val :: Outer.Outer -> Outer.Outer
val = set (#b!#a) 42

-- Due to the type of 'set' and the type of 'record' already being known:
(#b!#a) :: Lens Outer.Outer field
-- Due to the definition of (!) and its return type being known:
#a :: Lens Outer.Outer b
-- # resolution on Lens:
Outer.a :: Lens Outer.Outer Inner.Inner
-- Due to the second argument of (!) being known:
#b :: Lens Inner.Inner c
-- # resolution on Lens
Inner.b :: Lens Inner.Inner Int
-- result is
val = set (Inner.b ! Outer.a) 42 record

Note that there must be a known monomorphic type for the #a so this may require some type declarations:

-- no type declared
set42 = set #a 42

-- Due to type of 'set':
#a :: Lens a b
-- 'a' is not monomorphic, error!

However, they should be rare in practice, since if set42 is passed to a function that gives it a monomorphic type or applied to a M.Record then the ambiguity is resolved.

discussion

If we can generalize #-resolution to work on an arbitrary binary type constructor argument like (~>) Known x then we don't need to build a Lens class in. That's nice because people can keep experimenting with libraries, especially since different lens libraries have different features (e.g. StateT integration, partial lenses, etc.). However, in practice records aren't very satisfying without a nice update syntax, so perhaps a lens library should be promoted to the platform at least. Can the new "generalized deriving" feature replace ugly TH splices with a nice deriving clause or would that need to be built in? At the least I think it would need an extension to suppress record declaration's automatic creation of get functions.

pros

  1. No effect on (.) operator, which is composition as always. No "binds tighter than functions" or left-to-right vs. right-to-left controversy, and partial application works as it always did.
  2. Record declaration syntax remains exactly the same.
  3. Works on any function, so it doesn't tie you to the implementation of a record, you can remove a field and add a compatibility shim. So no tension between directly exposing the record implementation vs. writing a bunch of set/modify boilerplate.
  4. Module export list controls access over record fields as always.
  5. Orthogonal to records: any function can be addressed.

cons

  1. Still can't have two records with the same field name in the same module since it relies on modules for namespacing.
  2. Lenses can't handle updates that change the type, e.g. from Rec a to Rec b. If the set function is Lens rec field -> field -> rec -> rec then you can't change the type of rec. I'm sure if this is solvable without the set being builtin syntax, or a fancier lens implementation could admit Lens rec1 rec2 field -> field -> rec1 -> rec2.
  3. It's another way to resolve a name to a different function body that's not typeclasses. One way should be enough. But typeclasses are fundamentally global.
  4. The function to resolve must be monomorphic, so there is no "structural polymorphism" e.g. getName :: (Has Name a) => a -> String

From my point of view (elaforge), the pros are compelling, especially how record fields aren't a built-in concept but are just normal haskell identifiers.

My spin on the cons:

  1. I don't mind. To my mind, modules are there for namespace control and so two things shouldn't be able to have the same name in one module by definition. I have very few records that really want to have the same field name *and* live in the same module, but for those who have more of those, nested modules would be an orthogonal feature that would satisfy them. You could argue that this is a necessary development, because modules are *the* user-controlled namespace and access control mechanism (typeclasses are always global and hence not that controlled), then the solution must be modules. Any other solution is either going to be unsatisfactory because you can't control the scope of the names, or unsatisfactory because it's a non-module way of namespacing.
  1. It's not too satisfying, but you can fall back to rec { field = ... }. In practice I'd define a type specific update function as I do now, to avoid being tied field always existing in the record. Still, it's worth thinking about whether this proposal could later be extended to support type-changing updates if there were further language support.
  1. The global-ness of typeclass seems hard to reconcile with the idea of wanting to control export of record fields, so maybe this is unavoidable.
  1. I'm not real fond of structural polymorphism anyway, I believe parametric and typeclass polymorphism are more principled.