Opened 12 months ago

Last modified 5 months ago

#12778 patch bug

Expose variables bound in quotations to reify

Reported by: facundo.dominguez
Owned by:
Priority: normal
Milestone:
Component: Template Haskell
Version: 8.0.1
Keywords: template-haskell reify
Cc: mboes, goldfire, simonpj
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D3003
Wiki Page:

Description

Consider the following program:

{-# LANGUAGE TemplateHaskell #-}
module A where
import Language.Haskell.TH as TH
import Language.Haskell.TH.Syntax as TH

foo :: IO ()
foo = $([| let x = True
            in $(do addModFinalizer $ do
                      Just name <- TH.lookupValueName "x"
                      TH.reify name >>= runIO . print
                    [| return () |]
                )
        |])

When compiled, TH.lookupValueName fails to find x.

$ inplace/bin/ghc-stage2 A.hs -fforce-recomp
[1 of 1] Compiling A                ( A.hs, A.o )

A.hs:7:9: error:
    • Pattern match failure in do expression at A.hs:9:23-31
    • In the expression: (let x_a3Jy = True in return ())
      In an equation for ‘foo’: foo = (let x_a3Jy = True in return ())

It would make producing bindings in inline-java better if the type of x could be found in the finalizer.

According to comments in the GHC source, [| \x -> $(f [| x |]) |] desugars to

gensym (unpackString "x"#) `bindQ` \ x1::String ->
  lam (pvar x1) (f (var x1))

which erases any hint that a splice point existed at all. This information is necessary to know which variables were in scope.
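Written with the public combinators from Language.Haskell.TH, the desugaring above corresponds roughly to the sketch below. Here f is a hypothetical stand-in (the identity) for whatever the inner splice computes, and newName, lamE, varP and varE play the roles of gensym, lam, pvar and var. This is an approximation for illustration, not the code GHC actually generates.

```haskell
import Language.Haskell.TH

-- Hypothetical inner-splice function; the identity is enough to show the shape.
f :: Q Exp -> Q Exp
f = id

-- Rough library-level equivalent of [| \x -> $(f [| x |]) |].
desugared :: Q Exp
desugared = do
  x1 <- newName "x"              -- corresponds to gensym (unpackString "x"#)
  lamE [varP x1] (f (varE x1))   -- corresponds to lam (pvar x1) (f (var x1))
```

Loading this in GHCi and evaluating runQ desugared yields a plain LamE node; nothing in the resulting Exp records that f was ever called at a splice point.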

How about we add some new methods to the Quasi class (the class underlying the Q monad) for the sake of marking inner splices:

class Quasi m where
  ...
  qSpliceE :: m Exp -> m Exp
  qSpliceP :: m Pat -> m Pat
  qSpliceT :: m Type -> m Type
  ...

Now [| \x -> $(f [| x |]) |] would desugar to

gensym (unpackString "x"#) `bindQ` \ x1::String ->
  lam (pvar x1) (qSpliceE (f (var x1)))

When the renamer executes these primitives, it would be aware of the inner splices and could treat them similarly to top-level splices.

Change History (13)

comment:1 Changed 12 months ago by facundo.dominguez

Keywords: template-haskell reify added

comment:2 Changed 12 months ago by goldfire

Component: Compiler → Template Haskell

comment:3 Changed 10 months ago by facundo.dominguez

The proposal is incomplete. For it to work, we would need to extend Exp, Pat and Type with constructors mimicking what HsSpliced does in the GHC AST.

data Exp = ... | SplicedE ModFinalizers Exp

This is ok for producing code with splices inside brackets, but what about pattern matching Exp values?

To cover all cases, the following code

case e of
  TupE [LitE _, LitE _] -> ...
  _                     -> ...

would need to be rewritten as

case e of
  TupE [             LitE _,              LitE _] -> ...
  TupE [             LitE _, SplicedE _ (LitE _)] -> ...
  TupE [SplicedE _ (LitE _),              LitE _] -> ...
  TupE [SplicedE _ (LitE _), SplicedE _ (LitE _)] -> ...
  _                     -> ...

It could be alleviated with view patterns like

case e of
  TupE [ dropSplicedE -> (LitE _), dropSplicedE -> (LitE _)] -> ...
  _                     -> ...
 where
  dropSplicedE (SplicedE _ e) = e
  dropSplicedE e              = e

but is it tolerable?
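For experimentation, here is a self-contained toy version of the view-pattern approach. The miniature Exp type, the Integer payload of LitE, and the [String] stand-in for ModFinalizers are assumptions made for illustration; they are not the real Language.Haskell.TH.Syntax definitions.

```haskell
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical miniature AST; [String] stands in for ModFinalizers.
data Exp
  = LitE Integer
  | TupE [Exp]
  | SplicedE [String] Exp
  deriving (Eq, Show)

-- Strip one SplicedE wrapper, if present (same behaviour as in the comment above).
dropSplicedE :: Exp -> Exp
dropSplicedE (SplicedE _ e) = e
dropSplicedE e              = e

-- Matches a pair of literals whether or not either component came from a splice.
isLitPair :: Exp -> Bool
isLitPair (TupE [dropSplicedE -> LitE _, dropSplicedE -> LitE _]) = True
isLitPair _                                                       = False
```

With this, isLitPair (TupE [LitE 1, SplicedE [] (LitE 2)]) takes the first equation without a dedicated SplicedE case, but every pattern in existing user code would still need the same treatment.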

Last edited 10 months ago by facundo.dominguez

comment:4 Changed 9 months ago by facundo.dominguez

Another issue with this approach is that the finalizer would not be registered by addModFinalizer but carried in the AST instead. If the user discards the result of the inner splice, the finalizer wouldn't run.

The following expression does not run the finalizer, because exp carries the finalizers and it is not used in the result of the outermost splice.

$(do 
  exp@(SplicedE here_we_carry_the_finalizers (TupE [])) <-
   [| $(addModFinalizer (runIO (putStrLn "finalizer")) >> [| () |] ) |]
  [| () |]
 )
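The problem can be reproduced in a toy model that has no connection to GHC's internals. In the sketch below, the Exp type, the Finalizer alias and all function names are hypothetical; the point is only that finalizers stored inside a value disappear when that value is not part of the final result.

```haskell
-- Hypothetical stand-ins: a finalizer is just a label here.
type Finalizer = String

data Exp
  = UnitE
  | SplicedE [Finalizer] Exp
  deriving (Eq, Show)

-- Collect every finalizer carried anywhere in an expression.
finalizersOf :: Exp -> [Finalizer]
finalizersOf UnitE           = []
finalizersOf (SplicedE fs e) = fs ++ finalizersOf e

-- Result of an inner splice that wanted to register a finalizer.
inner :: Exp
inner = SplicedE ["finalizer"] UnitE

-- The outer splice ignores the inner result, as in the example above.
outerDiscarding :: Exp
outerDiscarding = let _unused = inner in UnitE

-- The outer splice returns the inner result unchanged.
outerUsing :: Exp
outerUsing = inner
```

Here finalizersOf outerDiscarding is empty while finalizersOf outerUsing still contains the finalizer, which mirrors the behaviour described in this comment.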

comment:5 Changed 9 months ago by facundo.dominguez

Differential Rev(s): Phab:D3003
Status: new → patch

comment:6 Changed 6 months ago by facundo.dominguez

Would abandon this in favor of #13608.

comment:7 Changed 6 months ago by mboes

Here's a clarification as to the scope of this ticket. The example in the description shows that addModFinalizer only knows about variables bound in the source code, but not variables introduced by a splice. This ticket is about making the types of all variables queryable in addModFinalizer. #13608, by contrast, is much less ambitious: it only seeks to name all quasiquotes so that the type of each quasiquote can be queried in addModFinalizer, without resolving this ticket in its entirety.

comment:8 Changed 6 months ago by mboes

In the attached Diff, Simon PJ admits, understandably, to being very confused by the original use case. The description in this ticket doesn't explain it, so here's a quick summary.

inline-java defines the java quasiquoter, which stands for a call to a *typed* static Java method (with the antiquotation variables being the arguments):

jadd :: Int32 -> Int32 -> IO Int32
jadd x y = do
  [java| { return $x + $y } |]

At compile time, we need to add somewhere the definition of this static method. Something like:

public static int wrapperFun(int x, int y) {
  return x + y;
}

At runtime, we need to call this method. Note that the user doesn't need to specify the Java types of antiquotation variables, nor the Java return type. Those are inferred from the types in the Haskell context of the quasiquote (Int32 maps to Java's int, Bool maps to boolean, etc.). We use addModFinalizer to compute the signature of the Java method at the very end of type checking a module, at a time when the full types of all the local variables in all contexts are known. Getting the types of antiquotation variables this way works fine in 8.0.2. But getting the expected return type of a quasiquotation and inferring a Java return type from it is not currently possible.

For example, above we can tell that x and y are int on the Java side, because we know that on the Haskell side x and y have type Int32. But we can't tell that the return type is also int: even if the quasiquote expansion is something like

let result = <java method call> in result

the type of result isn't available, even at the point where module finalizers are executed.
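The type mapping described in this comment can be sketched as a function over Template Haskell types. This is not inline-java's actual code: javaTypeOf and javaSignature are hypothetical names, and only the two mappings mentioned above (Int32 to int, Bool to boolean) are covered.

```haskell
{-# LANGUAGE TemplateHaskellQuotes #-}

import Data.Int (Int32)
import Data.List (intercalate)
import Language.Haskell.TH (Type (..))

-- Hypothetical mapping from a reified Haskell type to a Java type name.
javaTypeOf :: Type -> Maybe String
javaTypeOf (ConT n)
  | n == ''Int32 = Just "int"
  | n == ''Bool  = Just "boolean"
javaTypeOf _ = Nothing   -- anything else is unsupported in this sketch

-- Build the header of the static wrapper method from argument and result types.
javaSignature :: String -> [(String, Type)] -> Type -> Maybe String
javaSignature method args ret = do
  retTy  <- javaTypeOf ret
  params <- traverse (\(name, ty) -> fmap (\jt -> jt ++ " " ++ name) (javaTypeOf ty)) args
  pure ("public static " ++ retTy ++ " " ++ method
          ++ "(" ++ intercalate ", " params ++ ")")
```

For jadd, the argument types can be obtained by reifying x and y in the finalizer; the return type argument is exactly the piece of information that is missing today.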

comment:9 Changed 6 months ago by simonpj

That's helpful. So you can do it today, like this:

jadd :: Int32 -> Int32 -> IO Int32
jadd x y = let r = [java| { return $x + $y } |]
           in r

But that's a bit painful to write.

All this addModFinalizer stuff needs careful documentation.

comment:10 in reply to:  9 Changed 6 months ago by mboes

Replying to simonpj:

That's helpful. So you can do it today, [...] but that's a bit painful to write.

That's right. And not something I'm keen to ask my users to have to write. #13608 proposes to make the exact style of your example (giving a name to quasiquote results) the default desugaring for all quasiquotes.

The semantics I'd expect for addModFinalizer is:

  • Runs after all variables everywhere in the module have a type (including after TH expansion).
  • Like any Q action, the finalizer is allowed to perform I/O.
  • Any variable that is in context of the finalizer at the creation site can have its type reified.
  • The order of execution of each finalizer, if there are several, is undefined.

This ticket proposes to extend the set of reifiable variables to include in addition:

  • variables in the scope of the Q action that created the finalizer.

Not all of these will have types by the time the finalizer runs, because some variables might never be spliced in. But those that do should have their types available in the finalizer.
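As a point of reference, here is a minimal sketch of a finalizer that reifies a top-level binding and performs I/O, using addModFinalizer from Language.Haskell.TH.Syntax. The binding answer is made up for the example, and the sketch only exercises behaviour that is already supported; it does not touch variables bound inside quotations, which is what this ticket is about.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH
import Language.Haskell.TH.Syntax (addModFinalizer)

-- Hypothetical binding whose type the finalizer looks up.
answer :: Int
answer = 42

-- Declaration splice that registers a finalizer and generates no declarations.
$(do addModFinalizer $ do
       info <- reify 'answer   -- performed when the finalizer runs, after type checking
       runIO (print info)      -- finalizers are Q actions, so I/O is allowed
     return [])
```

The reified information is printed while the module is being compiled, not when the resulting program runs.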

#13608 is a much more modest change in comparison.

comment:11 Changed 5 months ago by facundo.dominguez

Another way to address this: we make

$([| let x = True in $(q) |])

desugar to

$(return (LetE [ x = True ] (Splice q)))

where Splice :: Q Exp -> Exp is a new constructor of the datatype Language.Haskell.TH.Syntax.Exp.

The compiler first runs the outer splice, which becomes

let x = True in $(q)

and then it runs the inner splice $(q) as if it were a non-nested splice.

Pros: It makes inner splices work pretty much like outer splices.

Cons: This is probably a bigger change in the compiler (hopefully not too big).

Last edited 5 months ago by facundo.dominguez

comment:12 Changed 5 months ago by facundo.dominguez

Regarding the previous proposal, some code might break because this code

do exp <- q
   [| let x = True in $(return exp) |]

stops being equivalent to [| let x = True in $(q) |].
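The equivalence that holds today can be checked with a small experiment. In the sketch below, q, viaBind, direct and letBody are hypothetical names chosen for the example; q is a trivial action, and letBody extracts the body of the generated let so that the two results can be compared without depending on the fresh name generated for x.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- Hypothetical inner action; any Q Exp would do.
q :: Q Exp
q = [| () |]

-- Run q first, then splice its result into the quotation.
viaBind :: Q Exp
viaBind = do
  e <- q
  [| let x = True in $(return e) |]

-- Splice q directly inside the quotation.
direct :: Q Exp
direct = [| let x = True in $(q) |]

-- Extract the body of a let expression, ignoring its bindings.
letBody :: Exp -> Maybe Exp
letBody (LetE _ body) = Just body
letBody _             = Nothing
```

In GHCi, comparing letBody <$> runQ viaBind with letBody <$> runQ direct shows whether both forms build the same body under the current desugaring. Under the proposal in comment:11, direct would instead contain a deferred Splice node, so the two would no longer agree.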

comment:13 Changed 5 months ago by simonpj

But see #13608 comment:15 and following, for an idea that might subsume this (distressingly complicated) ticket.
