Opened 5 years ago

Last modified 23 months ago

#7828 new bug

RebindableSyntax and Arrow

Reported by: AlessandroVermeulen Owned by:
Priority: normal Milestone:
Component: Compiler (Type checker) Version: 7.6.2
Keywords: Arrows, RebindableSyntax Cc: ross@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program Test Case:
Blocked By: Blocking:
Related Tickets: #1537, #3613 Differential Rev(s): Phab:D72
Wiki Page:

Description

When trying to add constraints to the types of the arrow primitives I get a type error. I think that doing such a thing should be possible and I've attached the code I used to test this. The errors I get when using the arrow notation for the function test are as follows:

test :: Typeable a => R a a
test = proc n -> returnA -< n
bug-arrow.hs:15:8:
    Could not deduce (Typeable c) arising from a use of `arr'
    from the context (Typeable a)
      bound by the type signature for test :: Typeable a => R a a
      at bug-arrow.hs:14:9-27
    Possible fix:
      add (Typeable c) to the context of
        a type expected by the context: (b -> c) -> R b c
        or the type signature for test :: Typeable a => R a a
    In the expression: arr
    When checking that `arr' (needed by a syntactic construct)
      has the required type: forall b1 c1. (b1 -> c1) -> R b1 c1
      arising from a proc expression at bug-arrow.hs:15:8-29
    In the expression: proc n -> returnA -< n

bug-arrow.hs:15:8:
    Could not deduce (Typeable c) arising from a use of `>>>'
    from the context (Typeable a)
      bound by the type signature for test :: Typeable a => R a a
      at bug-arrow.hs:14:9-27
    Possible fix:
      add (Typeable c) to the context of
        a type expected by the context: R a1 b -> R b c -> R a1 c
        or the type signature for test :: Typeable a => R a a
    In the expression: (>>>)
    When checking that `(>>>)' (needed by a syntactic construct)
      has the required type: forall a2 b1 c1.
                             R a2 b1 -> R b1 c1 -> R a2 c1
      arising from a proc expression at bug-arrow.hs:15:8-29
    In the expression: proc n -> returnA -< n

bug-arrow.hs:15:8:
    Could not deduce (Typeable d) arising from a use of `first'
    from the context (Typeable a)
      bound by the type signature for test :: Typeable a => R a a
      at bug-arrow.hs:14:9-27
    Possible fix:
      add (Typeable d) to the context of
        a type expected by the context: R b c -> R (b, d) (c, d)
        or the type signature for test :: Typeable a => R a a
    In the expression: first
    When checking that `first' (needed by a syntactic construct)
      has the required type: forall b1 c1 d1.
                             R b1 c1 -> R (b1, d1) (c1, d1)
      arising from a proc expression at bug-arrow.hs:15:8-29
    In the expression: proc n -> returnA -< n

When I replace the definition with the translated core code (minus type applications and scoped type variables) the code compiles:

test :: Typeable a => R a a
test =
    (>>>)
      (arr (\ (n_apd) -> n_apd))
      ((>>>)
         (arr (\ (ds_dst) -> ds_dst))
         (returnA)
         )

Attachments (1)

bug-arrow.hs (2.4 KB) - added by AlessandroVermeulen 5 years ago.

Download all attachments as: .zip

Change History (60)

Changed 5 years ago by AlessandroVermeulen

Attachment: bug-arrow.hs added

comment:1 Changed 5 years ago by simonpj

difficulty: Unknown

I know exactly what is happening here, but need some arrow-aware person to do some work to fix it.

Remember that GHC typechecks the user-written source code not the desugared code.

What shouuld happen is that at each source-code construct that requires (say) a use of (>>>), we should instantiate a fresh call of (>>>), and attach that instantiated call to the syntax tree. That's what happens for monads. For example, here is the data type of Stmt (in HsExpr):

data StmtLR idL idR body -- body should always be (LHs**** idR)
  = LastStmt  -- Always the last Stmt in ListComp, MonadComp, PArrComp,
              -- and (after the renamer) DoExpr, MDoExpr
              -- Not used for GhciStmtCtxt, PatGuard, which scope over other stuff
               body
               (SyntaxExpr idR)   -- The return operator, used only for MonadComp
                                  -- For ListComp, PArrComp, we use the baked-in 'return'
                                  -- For DoExpr, MDoExpr, we don't appply a 'return' at all
                                  -- See Note [Monad Comprehensions]
  | BindStmt (LPat idL)
             body
             (SyntaxExpr idR) -- The (>>=) operator; see Note [The type of bind]
             (SyntaxExpr idR) -- The fail operator
             -- The fail operator is noSyntaxExpr
             -- if the pattern match can't fail
...etc...

The SyntaxExpr on the ReturnStmt is for the instantiated call of return; the ones on BindStmt are for (>>=) and fail.

But currently for arrow Cmds we use a different (older) plan. We have a single instantiated call of (>>>) for the whole Cmd. It is held in the CmdTop:

data HsCmdTop id
  = HsCmdTop (LHsCmd id)
             PostTcType          -- Nested tuple of inputs on the command's stack
             PostTcType          -- return type of the command
             (CmdSyntaxTable id) -- See Note [CmdSyntaxTable]

The CmdSyntaxTable has the calls for (>>>), first etc.

But this approach requires (>>>) etc to be fully polymorphic in the non-arrow argument, so that this one call can be used at every place in the command that it's needed. And yours are not.

Solution: use the same approach as we use for monads:

  • Get rid of the CmdSyntaxTable on HsCmdTop
  • Add SyntaxExpr fields to all the appropriate constructs in HsCmd that need them.

I don't think this is really hard, but it's a bit fiddly. I'd be happy to advise if someone wants to undertake it.

Simon

comment:2 Changed 5 years ago by AlessandroVermeulen

I would like to try and implement this seeing as I need it myself. However, I am completely unfamiliar with the internals of GHC and the development process. I do have checkout but what is a good place to start looking where I should make the modifications?

comment:3 Changed 5 years ago by igloo

Component: CompilerCompiler (Type checker)
Milestone: 7.8.1

comment:4 Changed 4 years ago by thoughtpolice

Milestone: 7.8.37.10.1

Moving to 7.10.1

comment:5 Changed 4 years ago by jstolarek

Owner: set to jstolarek

comment:6 Changed 4 years ago by jstolarek

I have spent some time on this today and I have some questions. As I understand Simon's comment the idea is to handle the arrow do-notation similarly to the monadic do-notation. Looking at the handling of monadic bind I see that indeed each BindStmt stores a unique call to >>= and fail. These are generated in the renamer and then converted to explicit lambda by the desugarer. For arrow operations - arr, (>>>), first and possibly loop, (|||) and app - renamer creates a shared table of names. Desugarer creates a command environment that contains identifiers that are shared by all desugared expressions. This basicaly rephrases what Simon wrote. Now I have reached a point where I'm not sure what to do. For monads calls to bind are introduced only (?) when we encounter BindStmt. For arrows it is not that simple. For example calls to >>> are introduced when we desugar BodyStmt, BindStmt and RecStmt but also by do_premap, which does not correspond to any particular source construct but is called in many places of arrow notation desugarer. Another place where we introduce >>> is desugaring of CmdLStmt. I might be wrong, but it seems to me that adding SyntaxExpr to HsCmd constructors won't help us here. We need a way of generating fresh calls to desugared arrow operations out of thin air.

I also don't understand why is arrow notation considered not rebindable? The comment suggests that this is because DsArrows performs a separate desugaring of arrow notation. Am I right to think that after this change is implemented arrow notation will be marked as rebindable?

Last edited 4 years ago by jstolarek (previous) (diff)

comment:7 Changed 4 years ago by jstolarek

Cc: ross@… jan.stolarek@… added

comment:8 Changed 4 years ago by simonpj

Yes, I think if we made the change suggested here, arrow notation would become rebindable.

I think the idea would be that each construct becomes decorated with the overloaded operations necessary to desugar it. Indeed, in the case of monad syntax, the effect is exactly as if you first desugared and then typechecked. For example x <- e; rest desugars into something like e >>= \x -> rest. So to typecheck x <- e (a BindStmt), we look up >>=, instantiate its type, unify e's type with the first argument type of >>=, and so on. We hang onto the instantiated >>= in the typechecked BindStmt.

The idea is to do the same for arrows, but I have little idea of how hard it is. But the thought experiment of first desugaring and then type checking may be an instructive one.

Simon

comment:9 Changed 4 years ago by jstolarek

I think the idea would be that each construct becomes decorated with the overloaded operations necessary to desugar it.

Technically this is simple but I feel the resulting design will be inelegant. We use StmtLR datatype to represent both the monadic and arrow do-notation. This means that desugaring BindStmt inside a monadic do requires calls to >>= and fail (here), whereas desugaring the same BindStmt inside the arrow do requires a call to first, two different calls to >>> and two different calls to arr (here). Adding five extra fields to BindStmt feels like a lot of unnecessary noise. Perhaps creating helper datatypes to store these extra identifiers would be better? But then every (or almost every) constructor of StmtLR would require such extra datatype. I don't yet see a clean solution. Augmenting HsCmd with this extra information seems relatively simple, with HsCmdCase being the only tricky case. (It reuses a call to (|||) in every branch of a case (here), which means we need to generate a whole list of new identifiers.)

What do you think about the design I describe above? Is it acceptable?

Janek

comment:10 Changed 4 years ago by simonpj

Well in principle with rebindable syntax, each of those calls to >>> at different types (I assume they are at different types?) could in principle be overloaded a different way. So

  • There really is no alternative to generating them all
  • I really don't know what sort of error message you'd generate for the user if one of the calls to >>> failed but the others didn't, say.

Before going further I think we need help from Ross, to see if the "thought experiment of desugar and then typecheck" makes sense for arrows. We MUST give a clear story to the programmer of what will type check and what will not. We have that for rebindable syntax and do-notation. I don't know what the story is for rebindable syntax and arrow notation. Until we know that we can't implement anything.

What do you think about the design I describe above?

What precisely is "above". I don't see a proposed design.

We really need help from Ross here.

comment:11 Changed 4 years ago by jstolarek

There really is no alternative to generating them all

Yes, I am not questioning that.

What precisely is "above". I don't see a proposed design.

I meant storing IDs of calls necessary for desugaring in the extra fields (or extra datatypes) of the constructors of StmtLR. Of course this assumes that it is correct to "desugar then typecheck" the arrow do-notation. After reading "A New Notation for Arrows" I feel this should be perfectly OK, just as it is OK for monadic do, but of course that's a layman's perspective.

I have a few questions about implementation in DsArrows - I think these go to Ross.

  1. What is the purpose of the stack created in the desugaring process?
  2. How do I read the notation describing each case of dsCmd? Eg. the HsCmdApp case has this comment:
-- D; ys |-a cmd : (t,stk) --> t'
-- D, xs |-  exp :: t
-- ------------------------
-- D; xs |-a cmd exp : stk --> t'
--
--        ---> premap (\ ((xs),stk) -> ((ys),(e,stk))) cmd

What is |-a, xs and ys? What is the difference between : and ::? Could you tell me how to read this particular comment? I should then be able to understand all the others. EDIT: Is there a semantic difference between , and ; as in D; ys vs. D, xs?

Last edited 4 years ago by jstolarek (previous) (diff)

comment:12 Changed 4 years ago by jstolarek

After spending some time on this today I have to admit that arrow desugaring is still a mystery to me. For example this code:

proc n -> do
  x <- returnA -< n
  returnA -< x

Is desugared to:

arr (\n -> (n, ())) >>>
arr (\ds -> case ds of  { (ds2, ds3) -> ds2 }) >>>
arr (\ds -> ((ds, ()), ())) >>>
((arr (\ds -> case ds of  { (ds2, ds3) -> ds2 }) >>> id) *** id >>>
        arr (\ds -> case ds of  { (x, ds2) -> case ds2 of  { () -> x } }) ) >>>
arr (\ds -> (ds, ())) >>>
arr (\ds -> case ds of { (ds2, ds3) -> ds2 }) >>>
id

This can be simplified to an identity (as expected), but GHC doesn't do it. This definitely seems like an opportunity for improvement.

comment:13 in reply to:  11 Changed 4 years ago by ross

Replying to jstolarek:

I have a few questions about implementation in DsArrows - I think these go to Ross.

  1. What is the purpose of the stack created in the desugaring process?
  2. How do I read the notation describing each case of dsCmd? Eg. the HsCmdApp case has this comment:
-- D; ys |-a cmd : (t,stk) --> t'
-- D, xs |-  exp :: t
-- ------------------------
-- D; xs |-a cmd exp : stk --> t'
--
--        ---> premap (\ ((xs),stk) -> ((ys),(e,stk))) cmd

What is |-a, xs and ys? What is the difference between : and ::? Could you tell me how to read this particular comment? I should then be able to understand all the others. EDIT: Is there a semantic difference between , and ; as in D; ys vs. D, xs?

Yes, those differences in punctuation are significant. A type judgement for an expression (ignoring constraints) looks like

D |- exp :: t

but one for a command looks like

D; xs |-a cmd : stk --> t

Here

  • a is the type of arrow for this command
  • D is the environment of the surrounding proc expression
  • xs are variables defined within the proc expression, which will be inputs to the arrow
  • stk is a list of anonymous inputs to the arrow
  • t is the output type of the arrow

So the translation of this command should be an expression with type

D |- [[proc (xs) -> cmd]] : a ((xs),stk) t

In this particular rule, exp is typed using an ordinary expression judgement. It can use both external valuables (D) and those local to the proc expression (xs), so these are combined in a single environment. (Commands, in contrast, need to keep these separate.) The value of that expression is then pushed onto stk, which, together with xs, is input to cmd. (The lambda rule for commands lets you take a value from the stack and put it in the local environment, giving it a name.) The remaining wrinkle here is that cmd may not need all the variables in xs, but just the subset ys of xs, so the function inside the premap may also trim the environment from xs to ys.

comment:14 in reply to:  12 Changed 4 years ago by ross

Replying to jstolarek:

After spending some time on this today I have to admit that arrow desugaring is still a mystery to me. For example this code:

proc n -> do
  x <- returnA -< n
  returnA -< x

Is desugared to:

arr (\n -> (n, ())) >>>
arr (\ds -> case ds of  { (ds2, ds3) -> ds2 }) >>>
arr (\ds -> ((ds, ()), ())) >>>
((arr (\ds -> case ds of  { (ds2, ds3) -> ds2 }) >>> id) *** id >>>
        arr (\ds -> case ds of  { (x, ds2) -> case ds2 of  { () -> x } }) ) >>>
arr (\ds -> (ds, ())) >>>
arr (\ds -> case ds of { (ds2, ds3) -> ds2 }) >>>
id

This can be simplified to an identity (as expected), but GHC doesn't do it. This definitely seems like an opportunity for improvement.

The simplifier reduces that to an identity using rules in Control.Arrow (though more rules are probably needed to handle slightly more complex cases).

comment:15 Changed 4 years ago by jstolarek

Thanks Ross. I think I still don't understand the stk thing. What are the "anonymous inputs to the arrow"? Is that described in any of the arrow papers?

The simplifier reduces that to an identity using rules in Control.Arrow

How can I observe that? Using -ddump-simpl only shows that this huge expression is split into many smaller functions but it doesn't look like anything gets simplified. Using -ddump-rules and -ddump-rule-rewrites I see that some rules fire but none of them seems to optimize the above expression.

comment:16 in reply to:  15 Changed 4 years ago by ross

Replying to jstolarek:

Thanks Ross. I think I still don't understand the stk thing. What are the "anonymous inputs to the arrow"? Is that described in any of the arrow papers?

They are command arguments, the arrow analogue of function arguments. But with arrows they have to be tupled up with the local environment as the input to the arrow. In the notation paper (section 3.2) these arguments are attached to the environment with a series of pairings, but before 7.8 I had to change the representation to fit with GHC's constraint-based typechecker. Now the environment and stack are separate tuples, combined in a pair.

The simplifier reduces that to an identity using rules in Control.Arrow

How can I observe that? Using -ddump-simpl only shows that this huge expression is split into many smaller functions but it doesn't look like anything gets simplified. Using -ddump-rules and -ddump-rule-rewrites I see that some rules fire but none of them seems to optimize the above expression.

I used -O as well (this is with GHC 7.8.2).

comment:17 Changed 4 years ago by ross

Here's a different suggestion for rebinding that is perhaps more analogous to the expression case. We could consider do and if commands as sugar for more primitive commands:

do { p <- cmd; ss } = cmd `bind` \ p -> do { ss }
do { cmd; ss } = cmd `bind_` do { ss }
do { rec { ss }; ss' } =
    do { vs <- (| fixA (\ ~vs -> do { ss; returnA -< vs })|); ss' }
do { let defs; ss } = let defs in do { ss }
do { cmd } = cmd
if exp then cmd1 else cmd2 = (| ifThenElseA cmd1 cmd2 |) exp

(remember these are commands, not expressions), given the ordinary Haskell definitions

bind :: Arrow a => a (e,s) b -> a (e,(b,s)) c -> a (e,s) c
u `bind` f = arr id &&& u >>> arr (\ ((e,s),b) -> (e,(b,s))) >>> f

bind_ :: Arrow a => a (e,s) b -> a (e,s) c -> a (e,s) c
u `bind_` v = arr id &&& u >>> arr fst >>> v

fixA :: ArrowLoop a => a (e,(b,s)) b -> a (e,s) b
fixA f = loop (arr (\ ((e,s),b) -> (e,(b,s))) >>> f >>> arr (\ b -> (b,b)))

ifThenElseA :: ArrowChoice a => a (e,s) r -> a (e,s) r -> a (e,(Bool,s)) r
ifThenElseA thenPart elsePart = arr split >>> thenPart ||| elsePart
  where
    split (e, (True, s)) = Left (e, s)
    split (e, (False, s)) = Right (e, s)

Then rebinding would just use whatever definitions of these names are in scope, and they could be associated with BindStmt, etc, in a similar way to expressions. But would this be enough for the people who want rebinding?

Last edited 3 years ago by ross (previous) (diff)

comment:18 Changed 4 years ago by simonpj

Ross that looks attractive. Is it enough for desugaring? That is, is the translation above what you would use for desugaring? It looks a LOT simpler than the present desugaring!

Simon

comment:19 in reply to:  18 Changed 4 years ago by ross

Replying to simonpj:

Ross that looks attractive. Is it enough for desugaring? That is, is the translation above what you would use for desugaring? It looks a LOT simpler than the present desugaring!

It's only part of it. This would treat do and if commands as derived constructs, defined in terms of a kernel command language. But that language would still need to be type-checked using the command rules (with their split environments), and would still need to be translated into expressions, as now.

comment:20 Changed 4 years ago by jstolarek

I did some more reading and I finally understand Ross' answers :-) Section 3.2 of John Hughes' "Programming with Arrows" was really helpful in understanding the idea of the stack. But I don't yet understand the implementation and I still have problems reading the comments. I figured a good way of improving my understanding will be implementing a prototype solution outlined in comment 9. I wanted to begin by implementing it only for selected language constructs and I got stuck on typechecking BindStmt in TcMatches. Currently there is only the monadic BindStmt. In my prototype I want to distinguish between the monadic and arrow BindStmt. This means I need to write typechecking for the latter (in TcMatches). I'm not sure how to do that as I don't know what is the type signature for arrow bind. I tried to deduce that based on the comments in DsArrows, but I'm not sure if this is a correct approach. Can I once again ask for help in reading the comments?

-- D; xs1 |-a c : () --> t
-- D; xs' |-a do { ss } : t'   xs2 = xs' - defs(p)
-- -----------------------------------
-- D; xs  |-a do { p <- c; ss } : t'
--
--        ---> premap (\ (xs) -> (((xs1),()),(xs2)))
--            (first c >>> arr (\ (p, (xs2)) -> (xs'))) >>> ss

I'm totally confused by (\ (xs) -> (((xs1),()),(xs2))). Where do xs1 and xs2 come from, if we only bind xs as the input to do { p <- c; ss }? Also, do I understand correctly that bind expression generated in desugaring of arrow do-notation does not have a source-level equivalent? If so is it even correct to represent it using the BindStmt constructor?

In my prototype I'm ignoring Ross' proposal of the new desugaring.

EDIT: My prototype (not working) is on GitHub. Here is the key change behind my implementation. Everything else follows from it.

Last edited 4 years ago by jstolarek (previous) (diff)

comment:21 in reply to:  20 Changed 4 years ago by ross

Replying to jstolarek:

Can I once again ask for help in reading the comments?

-- D; xs1 |-a c : () --> t
-- D; xs' |-a do { ss } : t'   xs2 = xs' - defs(p)
-- -----------------------------------
-- D; xs  |-a do { p <- c; ss } : t'
--
--        ---> premap (\ (xs) -> (((xs1),()),(xs2)))
--            (first c >>> arr (\ (p, (xs2)) -> (xs'))) >>> ss

I'm totally confused by (\ (xs) -> (((xs1),()),(xs2))). Where do xs1 and xs2 come from, if we only bind xs as the input to do { p <- c; ss }?

They are subsets of xs: xs1 is the variables used by c, while xs2 is the variables used by ss but not bound in p. So the command 'c' gets local environment xs1 and stack (). The translations of statements are arrows that take an environment (here xs2) as input.

Also, do I understand correctly that bind expression generated in desugaring of arrow do-notation does not have a source-level equivalent? If so is it even correct to represent it using the BindStmt constructor?

In my prototype I'm ignoring Ross' proposal of the new desugaring.

I don't understand that question - I don't see a bind expression in the generated code. Do you mean the BindStmt emitted by the type checker? No, that wouldn't correspond to something one could write in the command sublanguage. Part of the motivation for my proposal is that the desugaring of do notation would have a source-level equivalent, albeit still in the command sublanguage, but it could be checked using the existing rules for commands.

comment:22 Changed 4 years ago by jstolarek

They are subsets of xs: xs1 is the variables used by c, while xs2 is the variables used by ss but not bound in p. So the command 'c' gets local environment xs1 and stack (). The translations of statements are arrows that take an environment (here xs2) as input.

That starts to make sense :-) As I understand x1 and x2 need not be disjoint - is that correct?

I am yet to understand the rules for encoding the input parameters to the arrows. Is there a general rule that says when and in what order things are put on the stack? Is it the case that elements on the stack are never accessed and need to be popped if they are to be used?

I know all of this could be deduced from the source code, but sadly this is not straightforward (at least for me).

I don't understand that question - I don't see a bind expression in the generated code.

What I meant is that desugaring of arrow bind is completely different from monadic bind (which is desugared to >>=+lambda) so perhaps it deserves its own internal representation. But in fact this is what my prototype implements ie. it distingushes arrow and monadic bind.

comment:23 in reply to:  22 Changed 4 years ago by ross

Replying to jstolarek:

They are subsets of xs: xs1 is the variables used by c, while xs2 is the variables used by ss but not bound in p. So the command 'c' gets local environment xs1 and stack (). The translations of statements are arrows that take an environment (here xs2) as input.

That starts to make sense :-) As I understand x1 and x2 need not be disjoint - is that correct?

That's right. If a variable is used by both c and ss, it will be in both sets.

I am yet to understand the rules for encoding the input parameters to the arrows. Is there a general rule that says when and in what order things are put on the stack? Is it the case that elements on the stack are never accessed and need to be popped if they are to be used?

Things are added to the stack by HsCmdApp and removed with HsCmdLam, but stacks can also be modified by HsCmdArrForm - there are some examples in the "Primitive constructs" section in teh arrow notation part of the GHC User's Guide.

comment:24 Changed 4 years ago by jstolarek

In my work on the prototype I've ran into a problem I can't solve. Since desugaring of HsCmdTop requires calls to arr and >>> I've added two SyntaxExpr to HsCmdTop datatype. These two SyntaxExprs need to be typechecked in tcCmdTop. I've see earlier how typechecking of the do-notation bind is implemented and decided to adopt a similar aproach:

tcCmdTop env (L loc (HsCmdTop cmd _ _ names compose_op arr_op)) cmd_ty@(cmd_stk, res_ty)
  = setSrcSpan loc $
    do  { cmd'   <- tcCmd env cmd cmd_ty
        ; names' <- mapM (tcSyntaxName ProcOrigin (cmd_arr env)) names
        -- VOODOO CODING based on typechecking of >>= in TcMatches
        -- is it correct to use b and c variables for typechecking in both
        -- arr and compose?
        ; a       <- newFlexiTyVarTy liftedTypeKind
        ; b       <- newFlexiTyVarTy liftedTypeKind
        ; c       <- newFlexiTyVarTy liftedTypeKind
        ; arr_op' <- tcSyntaxOp DoOrigin arr_op
                                (mkFunTy (mkFunTy b c) (mkCmdArrTy env b c))

        ; compose_op' <- tcSyntaxOp DoOrigin compose_op
                                (mkFunTys [mkCmdArrTy env a b, mkCmdArrTy env b c] (mkCmdArrTy env a c))
        ; return (L loc $ HsCmdTop cmd' cmd_stk res_ty names' compose_op' arr_op') }

This however ends with a panic. Using -dcore-lint gives me this:

*** Core Lint errors : in result of Desugar (after optimization) ***
<no location info>: Warning:
In the expression: T7828.>>>
                     @ GHC.Prim.Any
                     @ GHC.Prim.Any
                     @ GHC.Prim.Any
                     @ a_aEo
                     @ (a_aEo, ())
                     @ a_aEo
                     (T7828.arr
                        @ GHC.Prim.Any
                        @ GHC.Prim.Any
                        @ a_aEo
                        @ (a_aEo, ())
                        (\ (n_aus :: a_aEo) -> (n_aus, GHC.Tuple.())))
                     (T7828.>>>
                        @ (a_aEo, ())
                        @ a_aEo
                        @ a_aEo
                        (T7828.arr
                           @ (a_aEo, ())
                           @ a_aEo
                           (\ (ds_dGU :: (a_aEo, ())) ->
                              case ds_dGU of _ [Occ=Dead] { (ds_dGT, _ [Occ=Dead]) -> ds_dGT }))
                        (T7828.returnA @ a_aEo))
Illegal type application:
Exp type:
    T7828.R GHC.Prim.Any GHC.Prim.Any
    -> T7828.R GHC.Prim.Any GHC.Prim.Any
    -> T7828.R GHC.Prim.Any GHC.Prim.Any :: *
Arg type: a_aEo :: *

These GHC.Prim.Any look wrong. Notice that there are two calls of >>> and arr but only one of each has these extra GHC.Prim.Any parameters. The incorrect >>> and arr are generated by my modified HsCmdTop. The other pair of >>> and arr is generated by HsCmdArrApp constructor, which has not been modified yet. Simon, help?

EDIT: For completeness sake here's the code I was compiling:

{-# LANGUAGE GADTs, Arrows, NullaryTypeClasses #-}
{-# LANGUAGE RebindableSyntax #-}
module T7828 where

import Prelude          (Either(..), flip, Int, return)

import Control.Category (Category)
import Control.Arrow    (Arrow)

test :: Foo a => R a a
test = proc n -> returnA -< n

instance Category R where
instance Arrow R where

class Foo a where

data R a b where
  Id       :: R a a
  Comp     :: R b c -> R a b -> R a c
  Arr      :: (a -> b) -> R a b
  Split    :: R b c -> R b' c' -> R (b,b') (c,c')

infixr 1 >>>
infixr 3 ***

arr :: (a -> b) -> R a b
arr = Arr

first :: R b c -> R (b, d) (c, d)
first = (*** Id)

(***) :: R b c -> R b' c' -> R (b,b') (c,c')
(***) = Split

(>>>) :: R a b -> R b c -> R a c
(>>>) = flip Comp

returnA :: R a a
returnA = Id
Last edited 4 years ago by jstolarek (previous) (diff)

comment:25 Changed 4 years ago by jstolarek

comment:26 Changed 3 years ago by simonpj

The VOODOO CODING is of course wrong. Take a look at the version for BindStmt in (for example) tcMcStmt, in TcMatches line 511.

We are typechecking pat <- rhs; body as if it was (>>=) rhs (\x -> body). So we instantiate (>>=), expecting its type to have the shape

    rhs_ty -> (pat_ty -> body_ty)

That's what tcSyntaxOp does.

Then we check that pat really does have type pat_ty and rhs really does have type rhs_ty.

Presumably you need to do similar things for arrows, but I'm not deep enough into it to know how. Doing the instantiation once, at CmdTop will almost certainly not work.

I'm sorry not to be more helpful.

Why not follow Ross's suggestions in comment 17?

comment:27 Changed 3 years ago by jstolarek

Why not follow Ross's suggestions in comment 17?

If I understand correctly Ross' proposal is actually independent of this ticket: we could implement it and still have broken RebindableSyntax. That's why I chose to fix the rebindable syntax first. Perhaps then we can think about alternative desugaring of arrow syntax.

The VOODOO CODING is of course wrong.

Well, it is wrong, but I don't yet see why it is wrong. In my code I intended to check that arr has type b -> c -> x b c (where x is an arrow):

arr_op' <- tcSyntaxOp DoOrigin arr_op
                      (mkFunTy (mkFunTy b c) (mkCmdArrTy env b c))

and that >>> has type x a b -> x b c -> x a c:

compose_op' <- tcSyntaxOp DoOrigin compose_op
                      (mkFunTys [mkCmdArrTy env a b, mkCmdArrTy env b c] (mkCmdArrTy env a c))

I looked at the code you pointed me to and I don't yet see where I went wrong. Do you mean that I should not only check arr and >>> independently of everything else, but should also check that the type of the whole desugared expression - ie. arr (\p -> ((xs),())) >>> c' - is also correct?

Doing the instantiation once, at CmdTop will almost certainly not work.

Not sure if I understand what instantiation you mean. You refer to arr and compose? They are instantiated separately for each data constructor that needs them. HsCmdTop was just an example.

At the moment I implemented your proposal: every constructor stores operators required for its desugaring and the CmdSyntaxTable has been removed. Typechecking is the missing part. I'll be taking a stab at it sometime later this week. There are some examples of typechecking arrows in TcArrows, which will hopefully get me on the right track. Once this is done I expect to get tons of minor bugs resulting from my voodoo programming. Hopefully, these will be easy to fix.

comment:28 in reply to:  27 Changed 3 years ago by ross

Replying to jstolarek:

Why not follow Ross's suggestions in comment 17?

If I understand correctly Ross' proposal is actually independent of this ticket: we could implement it and still have broken RebindableSyntax. That's why I chose to fix the rebindable syntax first. Perhaps then we can think about alternative desugaring of arrow syntax.

It's not so much an alternative desugaring as a different view of the existing desugaring in which do and if can be rebound in an analogous manner to the way they're rebound in the expression world. Or it's a re-interpretation of this ticket: instead of using private versions of arr, >>> and first, the idea is to use private versions of the control operators bind, bind_ and ifThenElseA, the arrow counterparts of >>=, >> and ifThenElse. I don't see how the former can work, as each language construct is translated to several of those combinators. In addition, you have less control of the types of these things, because the translation re-arranges and trims the environment it passes through the arrow.

comment:29 in reply to:  26 Changed 3 years ago by simonpj

I wrote

We are typechecking pat <- rhs; body as if it was (>>=) rhs (\x -> body). So we instantiate (>>=), expecting its type to have the shape

    rhs_ty -> (pat_ty -> body_ty)

That's what tcSyntaxOp does.

Then we check that pat really does have type pat_ty and rhs really does have type rhs_ty.

The last sentence is terribly important! We are not just checking that (>>=) has a type of the right shape (which is, I gather from your question, what you think is going on), but also that it's type can be instantiated to match the particular pattern pat and the particular rhs rhs. You aren't doing that with (>>>) which is why you have all those uninstantiated type variables.

Simon

comment:30 Changed 3 years ago by jstolarek

Replying to ross:

It's not so much an alternative desugaring as a different view of the existing desugaring in which do and if can be rebound in an analogous manner to the way they're rebound in the expression world. Or it's a re-interpretation of this ticket: instead of using private versions of arr, >>> and first, the idea is to use private versions of the control operators bind, bind_ and ifThenElseA, the arrow counterparts of >>=, >> and ifThenElse.

A few more questions to make sure that I understand:

  1. If your solution was implemented then bind, bind_ and ifThenElseA would be rebindable and arr & friends would not?
  1. bind, bind_ and ifThenElseA currently don't exist in the libraries. Implementing yor proposal would require adding their definitions to Control.Arrow. Desugared arrow notation would call these default definitions, unless RebindableSyntax is enabled, in which case we use the definitions of bind etc. that are currently in scope. Correct?
  1. In comment:17 you wrote:
do { p <- cmd; stmts } = cmd `bind` \ p -> do { stmts }

Is \ p -> do { stmts } a shorthand for "put p on the stack and desugar do { stmts }"? If not, then what does it mean (code as written does not typecheck because bind needs an arrow, not a lambda).

  1. I assume that this desugaring still requires the stack to store bound parameters (p in this example)?

Replying to ross:

I don't see how the former can work, as each language construct is translated to several of those combinators.

This seems technically simple to me: just store in a constructor as many copies of the required combinators as necessary for desugaring. Although simple it leads to a design I'm not happy with. For example RecStmt requires 11 (sic!) combinators to desugar: 5x >>>, 4x arr, first and loop.

you have less control of the types of these things, because the translation re-arranges and trims the environment it passes through the arrow.

Yes, if I need to write down the exact types for typechecking this becomes a problem. Which brings me to another question related to Simon's answer:

Replying to simonpj:

The last sentence is terribly important!

That's what I suspected. So in the typechecker we type check pat <- rhs; stmt as if it was rhs >>= \pat -> stmt and later in the desugarer we actually convert it to such form, right? So if I want to typecheck the arrow desugaring I need to typecheck *exactly* the form to which it will later be desugared, including all the explicit stacks, etc?

comment:31 in reply to:  30 Changed 3 years ago by ross

Replying to jstolarek:

A few more questions to make sure that I understand:

  1. If your solution was implemented then bind, bind_ and ifThenElseA would be rebindable and arr & friends would not?

Yes.

  1. bind, bind_ and ifThenElseA currently don't exist in the libraries. Implementing yor proposal would require adding their definitions to Control.Arrow. Desugared arrow notation would call these default definitions, unless RebindableSyntax is enabled, in which case we use the definitions of bind etc. that are currently in scope. Correct?

It would probably be helpful to add them, but it wouldn't be necessary, just as the libraries don't currently have a definition of ifThenElse. The point is that the existing translation is equivalent to translating via these operators, so with RebindableSyntax turned on one could translate to those names instead. Then again, always translating via those names would mean less desugaring code, which Simon would approve of.

(I've edited comment:17 to add the translation of rec, which I forgot before.)

  1. In comment:17 you wrote:
do { p <- cmd; stmts } = cmd `bind` \ p -> do { stmts }

Is \ p -> do { stmts } a shorthand for "put p on the stack and desugar do { stmts }"? If not, then what does it mean (code as written does not typecheck because bind needs an arrow, not a lambda).

Remember that we're still in the command world here, so the infix operator bind is applied to two commands. A command can have the form \ p -> cmd, and that's what we have here. The meaning follows from the above implementation of bind and the existing translation rules for commands. The bindoperator will leave the output of the first command on the stack, and the second command will take it off and match it against p to extend its local environment.

  1. I assume that this desugaring still requires the stack to store bound parameters (p in this example)?

The rules for commands other than do and if would be as before. The lambda would take the argument off the stack, match it against p, and add to the local environment an entry for each variable in p.

Replying to ross:

I don't see how the former can work, as each language construct is translated to several of those combinators.

This seems technically simple to me: just store in a constructor as many copies of the required combinators as necessary for desugaring. Although simple it leads to a design I'm not happy with. For example RecStmt requires 11 (sic!) combinators to desugar: 5x >>>, 4x arr, first and loop.

you have less control of the types of these things, because the translation re-arranges and trims the environment it passes through the arrow.

Yes, if I need to write down the exact types for typechecking this becomes a problem. Which brings me to another question related to Simon's answer:

Replying to simonpj:

The last sentence is terribly important!

That's what I suspected. So in the typechecker we type check pat <- rhs; stmt as if it was rhs >>= \pat -> stmt and later in the desugarer we actually convert it to such form, right? So if I want to typecheck the arrow desugaring I need to typecheck *exactly* the form to which it will later be desugared, including all the explicit stacks, etc?

Yes you do, and that will be harder in the arrow case, because the local environment is an explicit input to the resulting arrow, and the desugarer feels free to trim unused variables from that environment, which will change its type. It would probably be necessary to force the replacements for the arrow combinators to be polymorphic in the environment.

comment:32 Changed 3 years ago by carter

Question: would it be possible as part of this refactoring to have a sort of generalized (weaker) arrow that doesn't have arr?

having arr be baked into the Arrow machinery makes using Arrow for deep embeddings not tenable (such as circuit/frp models that you may wanna compile at runtime). bind has a similar issue, because it allows injecting any haskell function f :: a -> b

It may be out of scope for this effort, but figure I should ask

comment:33 Changed 3 years ago by jstolarek

A command can have the form \ p -> cmd, and that's what we have here

Ah, so you used \ to denote the command abstraction (proc), not a lambda abstraction? That got me confused.

So if I want to typecheck the arrow desugaring I need to typecheck *exactly* the form to which it will later be desugared, including all the explicit stacks, etc?

Yes you do, and that will be harder in the arrow case, because the local environment is an explicit input to the resulting arrow, and the desugarer feels free to trim unused variables from that environment, which will change its type.

And that means that at the typechecking stage I have to exactly follow the whole desugaring algorithm, right? Sounds non-trivial.

It would probably be necessary to force the replacements for the arrow combinators to be polymorphic in the environment.

This brings me to another question. So let's say the user rebinds arr to some function with the type (Arrow a, Foo b) => (b -> c) -> a b c. Then, after desugaring, arr is made to operate on data + environment, like in your definition of bind: arr (\ ((e,s),b) -> (e,(b,s))). So the type b that arr has in its signature may no longer be an instance of Foo. What now? I don't see a way out.

Question: would it be possible as part of this refactoring to have a sort of generalized (weaker) arrow that doesn't have arr?

From the discussion on ghc-devs it looks that this is out of scope of this ticket - it probably deserves its own.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:34 Changed 3 years ago by simonpj

I'm not following all of this, but my instinct is this. The semantics of the program should not depend on the details of desugaring. So the various calls to arr, (>>>) in the desugarer should not dispatch to different instances depending on the size of the tuple. So the type checker should be able to construct functions arr-at-T, (>>>)-at-T (for the ambient arrow T) that are polymorphic in the environment argument.

The desugarer can then instantiate this polymorphic function at whatever types it needs (maybe several of them).

I am having trouble getting to grips with this because I know of no document that gives:

  • The syntax of the arrow sub-language (including comprehensions)
  • The typing rules for that language
  • The desugaring for that language

Having such a document would be extremely helpful in making sure we were all on staring from the same baseline.

Simon

comment:35 in reply to:  33 ; Changed 3 years ago by ross

Replying to jstolarek:

A command can have the form \ p -> cmd, and that's what we have here

Ah, so you used \ to denote the command abstraction (proc), not a lambda abstraction? That got me confused.

Not quite. There are three different grammar items here:

exp  ::=  \ pat -> exp
exp  ::=  proc pat -> cmd
cmd  ::=  \ pat -> cmd

In the last one you're already in the command world, and you want to take a value off the stack and give names to its components for use in the subcommand.

So if I want to typecheck the arrow desugaring I need to typecheck *exactly* the form to which it will later be desugared, including all the explicit stacks, etc?

Yes you do, and that will be harder in the arrow case, because the local environment is an explicit input to the resulting arrow, and the desugarer feels free to trim unused variables from that environment, which will change its type.

And that means that at the typechecking stage I have to exactly follow the whole desugaring algorithm, right? Sounds non-trivial.

If you want to rebind at the level of arr, etc, partly (as Simon says) and yes it is.

Replying to simonpj:

I'm not following all of this, but my instinct is this. The semantics of the program should not depend on the details of desugaring. So the various calls to arr, (>>>) in the desugarer should not dispatch to different instances depending on the size of the tuple. So the type checker should be able to construct functions arr-at-T, (>>>)-at-T (for the ambient arrow T) that are polymorphic in the environment argument.

Yes indeed. And because of the multiple occurrences of arr and >>> in the translations, this will severely constrain the types of any replacements, to the point that I wonder if the rebinding will solve the original users' issues.

I am having trouble getting to grips with this because I know of no document that gives:

  • The syntax of the arrow sub-language (including comprehensions)
  • The typing rules for that language
  • The desugaring for that language

Having such a document would be extremely helpful in making sure we were all on staring from the same baseline.

Well there is http://www.haskell.org/ghc/docs/papers/arrow-rules.pdf, but it needs updating to the new stack representation. I'll do that.

comment:36 Changed 3 years ago by simonpj

I didn't know about arrow-rules.pdf! Excellent. Could it have a figure with BNF syntax? (Mirroring the data types in GHC.) Thank you

comment:37 in reply to:  35 ; Changed 3 years ago by jstolarek

Replying to simonpj:

So the type checker should be able to construct functions arr-at-T, (>>>)-at-T (for the ambient arrow T) that are polymorphic in the environment argument.

My concern was what if the arr defined by the user cannot be made polymorphic in the environment argument?

According to my understanding of this bug report the problem is that during desugaring all calls to user-defined arr are instantiated to be the same and thus typechecking fails, because different types are required at different call sites. Why the same thing doesn't happen without rebinding?

Replying to ross:

Not quite. There are three different grammar items here:

exp  ::=  \ pat -> exp
exp  ::=  proc pat -> cmd
cmd  ::=  \ pat -> cmd

In the last one you're already in the command world, and you want to take a value off the stack and give names to its components for use in the subcommand.

Now I see this was even mentioned in "A New Notation for Arrows" paper. Again there's something I don't understand here. So far I though that stack is an implicit mechanism that is used to pass around values bound using <-. But here it seems that the user is allowed to explicitly access the stack. Isn't that a bit risky?

comment:38 in reply to:  37 Changed 3 years ago by ross

Replying to jstolarek:

Now I see this was even mentioned in "A New Notation for Arrows" paper. Again there's something I don't understand here. So far I though that stack is an implicit mechanism that is used to pass around values bound using <-. But here it seems that the user is allowed to explicitly access the stack. Isn't that a bit risky?

The local environment is used to pass around values bound inside the proc expression. The stack is used to pass around anonymous values. They can be pushed using application, popped and named in the environment by command lambda, and also manipulated by control operators. The type constraints on operators keep them well behaved.

comment:39 in reply to:  17 ; Changed 3 years ago by jstolarek

Replying to ross:

Here's a different suggestion for rebinding that is perhaps more analogous to the expression case. We could consider do and if commands as sugar for more primitive commands:

do { rec { ss }; ss' } =
    do { vs <- (| fixA (\ ~vs -> do { ss; returnA -< vs })|); ss' }

Can the desugaring of rec be viewed as:

do { rec { ss }; ss' } =
    (| fixA (\ ~vs -> do { ss; returnA -< vs })|) `bind` \ vs -> ss' }

?

I'm working on this new desugaring now, but I'm unable to give any estimates of how long it might take.

comment:40 in reply to:  39 Changed 3 years ago by ross

Replying to jstolarek:

Replying to ross:

Here's a different suggestion for rebinding that is perhaps more analogous to the expression case. We could consider do and if commands as sugar for more primitive commands:

do { rec { ss }; ss' } =
    do { vs <- (| fixA (\ ~vs -> do { ss; returnA -< vs })|); ss' }

Can the desugaring of rec be viewed as:

do { rec { ss }; ss' } =
    (| fixA (\ ~vs -> do { ss; returnA -< vs })|) `bind` \ vs -> do { ss' }

?

Yes, that's a combination of the above and the rule for <-.

comment:41 Changed 3 years ago by jstolarek

Simon, I think I need some help with typechecking. I defined thenA (Ross' bind_) to have the type Arrow a => a (e,s) b -> a (e,s) c -> a (e,s) c. Now I'm trying to typecheck thenA operator stored inside BodyStmtA constructor (a new arrow equivalent of monadic BodyStmt). I wrote something like this:

tcArrDoStmt env _ (BodyStmtA rhs then_op _) res_ty thing_inside
  = do  { (rhs', elt_ty) <- tc_arr_rhs env rhs
        ; thing    <- thing_inside res_ty
        ; s <- newFlexiTyVarTy liftedTypeKind
        ; b <- newFlexiTyVarTy liftedTypeKind
        ; c <- newFlexiTyVarTy liftedTypeKind
        ; then_op' <- tcSyntaxOp DoOrigin then_op
                           (mkFunTys [ mkCmdArrTy env (mkBoxedTupleTy [elt_ty, s]) b
                                     , mkCmdArrTy env (mkBoxedTupleTy [elt_ty, s]) c]
                                      (mkCmdArrTy env (mkBoxedTupleTy [elt_ty, s]) c))
        ; return (BodyStmtA rhs' then_op' elt_ty, thing) }

The test function I'm compiling looks like this:

test :: Arrow a => a i i
test = proc n -> do
         (arr id) -< n
         returnA -< n

Using -dcore-lint during complation reveals offences similar to the ones I experienced earlier:

    Argument value doesn't match argument type:
    Fun type:
        a_auK (i_auL, GHC.Prim.Any) GHC.Prim.Any
        -> a_auK (i_auL, GHC.Prim.Any) GHC.Prim.Any
        -> a_auK (i_auL, GHC.Prim.Any) GHC.Prim.Any
    Arg type: a_auK (i_auL, ()) i_auL
    Arg:
        ds_dvR
          @ (i_auL, ())
          @ i_auL
          @ i_auL
          (ds_dvQ
             @ (i_auL, ())
             @ i_auL
             (\ (ds_dw0 :: (i_auL, ())) ->
                case ds_dw0 of _ [Occ=Dead] { (ds_dvZ, _ [Occ=Dead]) -> ds_dvZ }))
          (Control.Arrow.arr
             @ a_auK $dArrow_auX @ i_auL @ i_auL (T7828.id @ i_auL))

I tried to write the typechecking of thenA to match the actual type, just like you wrote in 7828#comment:29. But I don't see how could I replace my new type variables s, b and c with something concrete. I believe that s should be allowed to be anything (polymorphism in the environment), so I don't know what could it be other than a new tyvar. Tracing the calls lead me to believe that res_ty is the type of the whole do expression, so I don't think it has anything to do with the type of thenA. Can I ask for your guidance here?

comment:42 Changed 3 years ago by jstolarek

I have made some progress on this one today. I'm able to compile

test :: Arrow a => a i i
test = proc n -> do
         returnA -< n

using the new desugaring, which was not possible earlier. But I'm still stuck on this:

test :: Arrow a => a i i
test = proc n -> do
         returnA -< n
         returnA -< n

ie. desugaring of thenA although now I have gathered more information. With my new implementation the code above desugars to:

(>>>) @ i @ (i, ()) @ i
  (arr @ i @ (i, ()) (\ (n :: i) -> (n, ())))
  (thenA @ arrow @ i @ (GHC.Prim.Any *) @ (GHC.Prim.Any *) @ i $dArrow_auX
     ((>>>) @ (i, ()) @ i @ i
        (arr @ (i, ()) @ i
           (\ (ds2 :: (i, ())) ->
              case ds2 of _ { (ds4, ds5) -> ds4 }))
        (Control.Arrow.returnA @ arrow @ i $dArrow_auX))
     ((>>>) @ (i, ()) @ i @ i
        (arr @ (i, ()) @ i
           (\ (ds2 :: (i, ())) ->
              case ds2 of _ { (ds4, ds5) -> ds4 }))
        (Control.Arrow.returnA @ arrow @ i $dArrow_auX)))

This looks correct except for the two (GHC.Prim.Any *) type parameters to thenA. One corresponds to the type of the stack, the other to result of first command (b in the type signature a (e,s) b -> a (e,s) c -> a (e,s) c). thenA should be polymorphic in these, but it seems that during typechecking I need to instantiate s and b to concrete values. I noticed that desugaring of cmd stored in the BodyStmt always passes () as the type of the stack. If this is correct then there's only the problem of what to do with b. Help needed here becuase I have no idea how to figure out the type of b from within tcArrDoStmt.

Ross, I also tried to implement desugaring of bind. When generating the desugared Core for cmd 'bind' \ p -> do { ss } I have problem with the \p -> part. So, given the desugared do { ss }, cmd and the bind operator how should the generated command lambda look like? I'm still at a loss to undestand how to explicitly manipulate the stack from within Core.

comment:43 in reply to:  42 ; Changed 3 years ago by ross

Replying to jstolarek:

I'm able to compile

test :: Arrow a => a i i
test = proc n -> do
         returnA -< n

using the new desugaring, which was not possible earlier.

I wouldn't have expected any change in the handling of this.

But I'm still stuck on this:

test :: Arrow a => a i i
test = proc n -> do
         returnA -< n
         returnA -< n

If we're allowing rebinding, that should be type-checked and desugared exactly as if it were written

test :: Arrow a => a i i
test = proc n -> (returnA -< n) `thenA` (returnA -< n)

Here's the type inference (backwards, the way the checker does it):

D |- proc n -> (returnA -< n) `thenA` (returnA -< n) :: a i t
---------------------------------------------------------------- (ProcExpr)
    D; n::i |-a (returnA -< n) `thenA` (returnA -< n) : () --> t
    ------------------------------------------------------------ (ArrForm)
        D |- thenA :: forall e. a1 (e,s1) i -> a2 (e,s2) i -> a (e,()) t
        D; n::i |-a1 returnA -< n : s1 --> i
        ------------------------------------ (ArrApp)
            D |- returnA :: a1 i i
            D, n::i |- n :: i
        D; n::i |-a2 returnA -< n : s2 --> i
        ------------------------------------ (ArrApp)
            D |- returnA :: a2 i i
            D, n::i |- n :: i

For the default thenA,

thenA :: Arrow a => a (e,s) b -> a (e,s) c -> a (e,s) c
u `thenA` v = arr id &&& u >>> arr fst >>> v

we have a1 = a2 = a, s1 = s2 = () and t = i, but with rebinding they'd be taken from the type of thenA, which could be anything matching the type in the above inference.

I noticed that desugaring of cmd stored in the BodyStmt always passes () as the type of the stack.

That's true for the default, but if we have rebinding it will be determined by the type of thenA/bind.

Ross, I also tried to implement desugaring of bind. When generating the desugared Core for cmd 'bind' \ p -> do { ss } I have problem with the \p -> part. So, given the desugared do { ss }, cmd and the bind operator how should the generated command lambda look like? I'm still at a loss to undestand how to explicitly manipulate the stack from within Core.

Here's the type inference:

D; xs |-a cmd1 `bind` \ p -> cmd2 : s --> t
------------------------------------------- (ArrForm)
    D |- bind :: forall e. a1 (e,s1) t1 -> a2 (e,(b,s2)) t1 -> a (e,s) t
    D; xs |-a1 :: cmd1 : s1 --> t1
    D; xs |-a2 :: \ p -> cmd2 : (b,s2) --> t2
    ----------------------------------------- (Lam)
        D; xs, p::b |-a2 cmd2 : s2 --> t2

In this case the manipulation of the stack is done by the translation of CmdLam and by the code inside the vanilla Haskell implementation of bind.

comment:44 in reply to:  43 Changed 3 years ago by jstolarek

Replying to ross:

Replying to jstolarek:

I'm able to compile

test :: Arrow a => a i i
test = proc n -> do
         returnA -< n

using the new desugaring, which was not possible earlier.

I wouldn't have expected any change in the handling of this.

Hm... I just realized I probably misunderstood one thing. My desugaring is slightly different, but this is easy to fix.

If we're allowing rebinding, that should be type-checked and desugared exactly as if it were written

test :: Arrow a => a i i
test = proc n -> (returnA -< n) `thenA` (returnA -< n)

I believe that my implementation is doing exactly this. It's just that types don't match.

For the default thenA,

thenA :: Arrow a => a (e,s) b -> a (e,s) c -> a (e,s) c
u `thenA` v = arr id &&& u >>> arr fst >>> v

we have a1 = a2 = a, s1 = s2 = () and t = i, but with rebinding they'd be taken from the type of thenA, which could be anything matching the type in the above inference.

At the moment my implementation does not work even for default thenA. I keep looking at the desugaring of monadic do-notation trying to figure out where did I go wrong.

I guess the discussion here is more about the details of my implementation. To keep in line with recent decisions I will move it to Phab. I'll post the link once I've created a new phabricator revision.

comment:45 Changed 3 years ago by jstolarek

Differential Rev(s): Phab:D72

I created a phabricator revisionfir this ticket: Phab:D72

comment:46 Changed 3 years ago by simonpj

I'm sorry Jan I'm struggling with this one. My main difficulty is that I don't understand arrows well enough, and I do not have a stand-alone document giving the typing rules they are supposed to obey, one that matches the syntax and deguaring used in GHC. (There are some changes described above, and I do not know how they fit in either.)

What would be really helpful would be a wiki page or Latex document that gave

  • The syntax for arrows, including a clear indication of how it is represented in HsSyn
  • The static semantics as typing rules
  • The desugaring into Core

The second and third of these are connected, because the type checker must generate the evidence that will be used by the desugarer to translate to Core.

I believe that the static semantics may be specified (perhaps only in part), as comment:17 suggests, by translation into some simpler combinators.

In short, it's a nice, cleanly-separable problem; and has clear intellectual challenge; but I feel unable to contribute much at the moment.

Simon

comment:47 Changed 3 years ago by jstolarek

I admit I am not certain if I can reliably produce such a specification. But I will be thinking about it.

Regarding first bullet: my plan was to change the representation of arrow notation in HsSyn. Would you like the specification to contain current representation or the planned new one?

comment:48 Changed 3 years ago by simonpj

The new one. After all, the new one is what you propose!

comment:49 Changed 3 years ago by simonpj

See also Sophie Taylor's proposal about monoidal category classes.

comment:50 Changed 3 years ago by jstolarek

Owner: jstolarek deleted

comment:51 Changed 3 years ago by simonpj

comment:52 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:53 Changed 3 years ago by tomberek

Interested in this. What is the current status?

comment:54 Changed 3 years ago by simonpj

It's stalled. The honest truth is:

  • No one knows about both GHC and arrows in enough detail to make progress.
  • I'm willing to support
  • But it really needs someone who cares about arrows to pay sustained attention. Maybe you?
  • comment:46 suggests the next steps

Simon

comment:55 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:56 Changed 2 years ago by jstolarek

Cc: jan.stolarek@… removed

comment:57 Changed 23 months ago by thomie

Keywords: Arrows added

comment:58 Changed 23 months ago by thomie

Keywords: RebindableSyntax added

comment:59 Changed 23 months ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.