New directions for Template Haskell

Nota bene: TemplateHaskell/BlogPostChanges is a copy of this blog post, but with subsequent edits and improvements. Don't pay too much attention to the text below; I'm keeping it only so that you can see the context for comments.

This post explores a set of design proposals for Template Haskell. They are inspired by discussion with Tim Sheard, Kathleen Fisher, and Jacques Carette. It was originally triggered by several Template Haskell tickets: including #4230, #4135, #4128, #4170, #4125, #4124, #4364, #6062, #6089. (See also #7016, which work better with the suggestions below.) Taken together, these proposals would make quite a big change to TH, I think for the better. Happily, I'm pretty sure they are relatively easy to implement.

There's an interesting critique of Template Haskell on StackOverflow, some (but not all) of which is addressed by proposals here.

But I'd like to get the design right; hence this post. I'm going to treat it as a working draft, and modify it in the light of comments. So please comment.

I'm going to assume that you are more or less familiar with Template Haskell. If not, there's lots of background on the Template Haskell page. It's a Wiki so you can help to improve it.

Here's the implementation page? the describes how we are going to implement this stuff.

Some brief background

After parsing, GHC runs two completely separate passes, one after the other:

  • The renamer resolves scopes, fixing precisely which binding site is connected which occurrence of every variable. For example, you write
    let x = \x -> x+1 in x
    
    and the renamer changes it to
    let x_77 = \x_78 -> x_78 + 1 in x_77
    
    The renamer also performs depdenency analysis, which sorts bindings (both value declarations and type declarations) into the smallest possible mutually recursive groups. This prior sorting is required to maximise polymorphism in mutually recursive value bindings.
  • The typechecker works on the renamed program, and typechecks it.

At the moment these two passes relate to Template Haskell as follows:

  • Quasi-quotes are run in the renamer. Why? Because quasi-quotes can expand to patterns.

Consider this, which has a quasi-quoted pattern:

\x -> \ [pads| blah |] -> x+1

Is the "x" in "x+1" bound by the outer \x or by the 'x' that might be bought into scope by the [pads| blah |] quasi-quote? The only way to know is to run the quasi-quote, so that's what happens.

  • All other Template Haskell stuff is run in the typechecker. Why? Because we try to typecheck quotations before feeding them into TH functions. More on this below.

The main issue

The big issue is this: Template Haskell is both too weakly typed and too strongly typed.

Too weakly typed

A TH quotation has type Q Exp, a type that says nothing about the type of the quoted term. For example, consider

qnot :: Q Exp -> Q Exp
qnot x = [| not $x |]

Presumably the author expects $x to be a boolean-valued term, but that is not checked. For example we might write

h = $(qnot [| "hello" |])

in which we pass a string-valued term to qnot. The splice will typecheck fine, but the returned code will be the ill-typed not "hello". There is no soundness problem because GHC typechecks the result of the splice $(qnot [| "hello" |]), but the error is reported in code that the user never wrote.

Moreover, errors can be delayed. For example, suppose qnot was like this:

qnot :: Q Exp -> Q Exp
qnot x = [| (not $x, length $x) |]

This cannot possibly be right, becuase $x cannot be both a boolean and a list. Yet TH will accept it, because a splice has type forall a.a. The error will only be reported to callers of qnot.

This is bad. MetaML solves this problem by giving types to quoted terms, something like this:

qnot :: TExp Bool -> TExp Bool
qnot x = [| not $x |]

Here TExp (short for typed expressions) has a type parameter that expresses the type of the quoted term.

In TH we deliberately did not do this, because it restricts expressiveness; see the original TH paper. We could make this choice without losing soundness because in TH, unlike in MetaML, all splicing is done at compile time, and the result of a splice is typechecked from scratch. But still, it's a weakness and, for some users (stand up Jacques), a very serious weakness.

Too strongly typed

Even though TH cannot guarantee to construct only type-correct code, every quotation is typechecked. For example, the quotation [| "hello" && True |] would be rejected because it is internally inconsistent.

But with the advent of type splices (a very useful feature) typechecking quotes has become hard to do. Consider this:

  f :: Q Type -> Q [Dec]
  f t = [d| data T = MkT $t; 
            g (MkT x) = g+1 |]

This function f returns a declaration quote, declaring T and g. You'll see that the constructor MkT takes an argument whose type is passed (as a quotation) to f -- a type splice.

The difficulty is that we can't typecheck the declaration of 'g' until we know what $t is; and we won't know that until f is called. In short,

  • we can't really typecheck the declaration quote at all

An analogous problem occurs in other declaration splices, for example

  f t = [d| instance C $t where ... |]

Here GHC's check that an instance decl is always of form

  instance C (T a b c)

falls over, again because we don't know what $t will be. Here's another example:

      [| let { f :: $t; f = e } in .. |]

We can't sensibly typecheck the term without knowing what f's type signature is, and we can't know that without expanding the splice.

Here's a rather different example, #4364:

type N0 = $( [t| Z |] )
type N1 = $( [t| Z |] )

Faced with a type splice

type N0 = $(...blah...)

the renamer doesn't know what the splice will expand to, because splices are currently run later, in the type checker. So it pessimistically assumes that the splice could expand to mention anything in scope. But that pessimistic assuption triggers the error message

     Cycle in type synonym declarations:
       m.hs:7:1-23: type N0 = $([t| Z |])
       m.hs:8:1-23: type N1 = $([t| Z |])  }}}

All this is quite annoying. Several users have said "I'm just using a quotation as a convenient way to build a syntax tree. Please don't even try to typecheck it; just wait until it is finally spliced into a top-level splice".


A proposal

TH currently embodies an uneasy compromise between being too strongly typed and too weakly typed. So my proposal is to move TH in both directions at once:

  • Part A: Move the existing structures towards the expressive but weakly-typed end.
  • Part B: Add new MetaML-style constructs for strongly-typed metaprogramming.
  • Part C: Clean up and improve reification
  • Part D: Quasi-quotation improvements

Part A: Reduce typechecking for quotes

On the "make-it-weaker" front, here's what I propose:

  • Cease typechecking TH quotes altogether. Instead, to use GHC's terminology, we would rename a quote, but not typecheck it. The renaming pass ensures that the scope hygiene mechanisms would remain unchanged. By not attempting to typecheck we avoid all the tricky problems sketched above.
  • Add pattern splices and local declaration splices, as requested in #1476. For example
    -- mkPat :: Q Pat -> Q Pat
    f $(mkPat [p| x |]) = x+1
    
    -- mkDecl :: Int -> Q [Dec]
    g x = let $(mkDecl 3) in ...
    
    These are not supported today because they bring new variables into scope, and hence are incompatible with running splices only after the renamer is finished; see Notes on Template Haskell, section 8.
  • Run TH splices in the renamer, uniformly with quasi-quotes. Of course, we must still typecheck the code we are about to run. But there's an existing TH restriction that code run in top-level splices must be imported. So we can typecheck this code even during renaming, because it can only mention imported (and hence already fully-typechecked) code.

This solves #4364 because we run the splice in the renamer, so things are sorted out by the time we are checking for cycles (in the type checker).

  • Allow quoted names as patterns as requested by Conal Eliott. This is just a variation on allowing splices in patterns, since a quoted name 'x is really just a simple splice

These changes would essentially implement the desires of those who say "I just want to generate syntax trees". All the mentioned bug reports would be fixed. The big loss is that quotes would not be typechecked at all.

Lexical scoping

Consider these definitions:

g :: Int -> Q Pat

y :: Int
y = 7

f :: Int -> Q Exp
f n = [| \ $(g n) -> y+1 |]

Where is the 'y' bound in the RHS of f?

  • Perhaps by the y = 7 that is in scope at the definition of f?
  • Perhaps by the pattern that $(g n) expands to?
  • Perhaps by a 'y' that is in scope at the splice site of f?
  • Does it depend on whether $(g n) in fact binds 'y'?

A major point about TH is that we get lexical scoping (also called "hygienic"). So, to me it seems the the first of these choices is the only reasonable one. If you want the second you can instead use explicit dynamic binding by saying

f n = [| \ $(g n) -> $(return (VarE (mkName "n"))) + 1 |]

So the rule would be:

  • In a quote, a variable 'v' is bound by the lexically enclosing binding of 'v', ignoring all pattern and declaration splices.

To be consistent this should apply to top level splices too.

A variation (probably not)

A possible, rather ad hoc, variation would be to still typecheck quotes that are (a) top level, and (b) expression quotes. For example, we might still reject this:

f x = [| $x + (True 'c') |]

because the quote is obviously ill-typed. Only quotes nested inside top-level splices would avoid the type checker (because if the splice is run in the renamer, we can't typecheck the nexted quote). For example:

$(f [| True 'c' |])

This splice would run in the renamer, and only the result of the splice would be typechecked. But what about this?

f t = [| let g :: $t-> Int; g = e in .... |]

This is still very awkward to typecheck. After all, if $t expands to a polymorphic type, the result of the splice might typecheck, but it's really impossible to typecheck without knowing the signature. Maybe we should just give up if there's a type splice? The only really simple thing is not to typecheck quotes at all.


Part B: Add MetaML-style typed quotes

Template Haskell has quotes for terms, types, patterns, and declarations. They are all untyped, in the sense that the type of the quote tells you nothing about the type of the quoted thing. For example

  [| True |] :: Q Exp

There's no clue that the type of the quoted expression is Bool.

In the case of terms (only), we know how from MetaML to have typed quotations. Here's a proposed extension to TH to add typed term quotes:

  • Add a new type of typed expressions TExp a
  • Add a new term quotation form [|| e ||], called a typed quote; the type of the quote is TExp ty, where ty is the type of e. In the type-system jargon, this is the "introduction form" for TExp.
  • Add a new splice form $$e, called a typed splice. The term e must have type TExp ty, and the splice $$e then has type ty. This is the "elimination form" for TExp.
  • Add a constant which takes a typed quote and returns an untyped one: unType :: TExp a -> Q Exp
  • Run these new typed splices in the typechecker, not the renamer.

(The precise syntax for typed-quotes and type-splices is of course up for grabs. But doubling the symbols seems plausible to me.)

Here's a standard example:

power :: Int -> TExp (Int -> Int)
power n = [|| \x -> $$(go n [|| x ||]) ||]
  where
    go :: TExp Int -> TExp Int
    go 0 x = [|| 1 ||]
    go n x = [|| $x * $$(go (n-1)) ||]

You could do this with existing TH but there'd be no guarantee that power would return a well-typed term. With TExp there is.

Points to notice

  • Unlike TH, the only way to construct a value of type TExp is with a quote. You cannot drop into do-ntation, nor use explicit construction of the values in the Exp algebraic data type. That restriction limits expressiveness, but it enables the strong typing guarantees.
  • There are no declaration, type, or pattern quotes for these typed quotes. Only terms.
  • You can't use an untyped splice in a typed quote, thus [|| ...$(e)... ||]. Similarly, you can't splice a type, pattern, or declaration group in a typed term quote.
  • Using unType you can go from the typed world to the untyped one, which lets you mix the worlds. Example:
    f :: TExp Int -> Q Exp -> Q Exp
    f qt qu = [| $(unType qt) + $qu |]
    
  • Unlike Exp, TExp is an abstract type, so you can't decompose values of type TExp. All you can do with them is splice them (into a program or a larger quote). Or you can convert to a Q Exp and then decompose, using the existing TH mechanisms. For example
    g :: TExp Int -> Q Exp
    g qt = do { tree <- unType qt
              ; case tree of
                  VarE v -> ...
                  ConE c -> ...
                  ...etc...  }
    
  • TExp is not a monad, but it is an applicative type constructor, although not quite an instance of Applicative class:
       pure e   ===   [|| e ||]
       f <*> g   =    [|| $$f $$g ||]
    
    Reminder: the Applicative class looks like this
    class Applicative f where
      pure :: a -> f a
      <*>  :: f (a->b) -> f a -> f b
    
    TExp is only "almost an instance" because pure isn't a function; its argument must be syntactically quoted.

Syntax is always a delicate point.

  • We could use some other kind of bracket, although brackets are always in short supply; eg (| ... |) or {| ... |}.
  • We could add Unicode brackets too (suggestions?); but I think we should have ASCII equivalents.
  • Ian asked whether $(...) could accept either Q Exp or TExp. I think not; we need to know which kind of splice it is before type checking.

Part C: Reification and typechecking

The third part of this proposal concerns reification. The Q monad gives you access to the typechecker's environment. Notably, Template Haskell provides the function

reify :: Name -> Q Info

which, given the Name of a variable, type, or class, looks the Name up in the type environment and returns what the type checker knows about it:

data Info = TyConI       -- A type
               Dec	        -- Declaration of the type

          | ClassI       -- A class
               Dec              -- Declaration of the class
               [ClassInstance]	-- The instances of that class

          ...etc...

What reify sees

A dark corner of reify is this: what types does reify see? Consider

f x = ($(do { xi <- reify 'x; ... }),
       x && True)

Here, reify can be used to examine the type of x. But the type of x isn't fully known until the type checker has seen the term (x && True). So in current TH it's going to be unpredicatable what you see for x, which is hardly satisfactory.

It seems to me that the only decent, predictable story is to say that reify can only consult the top-level type environment. More specifically, Template Haskell processes a program top-down:

  module M where
   import ...
   f x = x
   $(th1 4)
   h y = k y y $(blah1)
   $(th2 10)
   w z = $(blah2)

TH processes this module as follows:

  1. Typecheck down to, but not including, the first splice, $(th1 4). These declarations constitute the first declaration group.
  2. Typecheck and run the splice, which returns a bunch of declarations D1
  3. Typecheck the declarations D1 combined with the declarations down to, but not including, the second splice. These declarations consitute the second declaration group.
  4. Typecheck and run the next splice, $(th2 10)
  5. Rinse and repeat

So the proposal is as follows. A declaration group is the chunk of declarations created by a top-level declaration splice, plus those following it, down to but not includig the next top-level declaration splice. Then the type environment seen by reify includes all the declaration up to the end of the immediately preceding declaration block, but no more.

So, in the preceding example:

  • A reify inside the splice $(th1 ..) would see the definition of f.
  • A reify inside the splice $(blah) woudl see the definition of f, but would not see any bindings created by $(th1...).
  • A reify inside the splice $(th2..) would see the definition of f, all the bindings created by $(th1..), and teh definition of h.
  • A reify inside the splice $(blah2) would see the same definitions as the splice $(th2...).

This would mean that you could not say

f x = x
g y = $(do { info <- reify 'f; ... })

because there is no top=level splice between the declaration of f and the splice. But that seems reasonable to me. If you want that behaviour you can say

f x = x
$(return [])
g y = $(do { info <- reify 'f; ... })

Reifying expressions

But what about expressions? It would be useful (stand up Kathleen) to have a more elaborate reify, like this:

typecheck :: [(Name,Type)]  -- Extend the type environment with this 
          -> Exp	    -- The expression to typecheck
          -> Q (Either String Type)
 -- Typecheck the expression, returning
 --    Left error-message     if typechecking fails
 --    Right type             if typechecking succeeds

(For GHCi users, reify f is like :info f, while typecheck [] (...) is like :type (...).)

You might ask whether we can typeckeck an expression; remember, these Q ty things are going to be run in the renamer. But if the type environment is that in force just before the last top-level splice, then all is well: that stuff has been fully typechecked.


Part D: quasiquotation

This part is unrelated to the preceding proposals, and is responding to #4372 and #2041.

  • For #2041, rather than the proposal made there, I think the nicest thing is for Language.Haskell.TH to expose a Haskell quasiquoter:
    parseHaskell :: QuasiQuoter
    
    Remember that a QuasiQuoter is a quadruple of parsers:
    data QuasiQuoter = QuasiQuoter { quoteExp  :: String -> Q Exp,
                                     quotePat  :: String -> Q Pat,
                                     quoteType :: String -> Q Type,
                                     quoteDec  :: String -> Q [Dec] }
    
    If TH provided such parsers, you could use them to parse antiquotes. That seems better to than having strings in the TH syntax.

See #4430 for an excellent point about fixities.

  • For #4372, I'm a bit agnostic. There is no technical issue here; it's just about syntax. Read the notes on the ticket.
  • See #4429 for a suggestion about reifying Names.

Part E: Other minor issues

This section collects other TH changes that I think should be done.

  • The InfixE construtor of Syntax.Exp should only allow a Var in the operator position. See Trac #4877
  • Posted: 2010-10-18 21:55 (Updated: 2013-05-16 16:12)
  • Author: simonpj
  • Categories: (none)

Comments

1. rrnewton -- 2010-10-19 11:28

This would not directly related but would it be possible to fix #4417 during this rearchitecture? It would be nice to separate the compile-time from run-time library dependencies, either automatically, or by a mechanism like in R6RS scheme that distinguishes imports in different stages.

2. illissius -- 2010-10-19 15:45

I've definitely run into the "I just want to build ASTs, why must you typecheck?!" problem before, but it's harder to notice offhand where/whether in my existing code I've benefitted from the typechecking and might've had problems without it -- maybe there's none, I'm not sure. Anyway, +1 for this.

Another thing I've wanted is to be able to quote/splice parts of TH ASTs of types other than Exp/Dec/Pat/Type?, like Cxt or Match or whatever. I guess this wouldn't be worth the cost of implementation, but the more places I can use quotes/splices and not have to drop down to manual AST molestation, the nicer my code is.

I don't quite follow the train of thought on reification. You're saying, in the first example, that a reify in 'th2' wouldn't be able to see 'h', only 'g'? Why? Hasn't 'h' already been typechecked by the time it gets to 'th2', according to the process you sketched? The staging restriction is the most annoying part of TH as it is; I don't know if one might frequently bump up against this added restriction, but it definitely seems conceivable. (I'm not saying the motivations don't outweigh the costs -- just that I don't quite understand the motivations, and the costs seem real.)

Being able to typecheck expressions would definitely be a neat capability -- I know that I checked whether this kind of thing could be done via the existing report/recover mechanism back before isClassInstance was added, as it would've let me simulate the same thing (try some expression using the instance, and if it doesn't typecheck, presumably it's not an instance). I imagine there's other nifty things you could use it for.

I don't have an opinion on typed expressions or quasiquoters; I haven't so far encountered a situation where I wanted the former, and while the latter seems neat I haven't used it before.

My only additional comment is with regards to syntax: specifically, the ''Name syntax for quoting type names really annoys me. It's just weird and ugly, and -- given that a very popular use case for TH is generating instance declarations -- it's also a part of TH syntax which "plain old users" of TH libraries frequently need to be exposed to. The first time I saw it, in an fcLabels tutorial if I'm remembering correctly, before I had learned about TH, my reaction to it was along the lines of "what is this I don't even", and it acted to discourage me from trying the library. Shallow, I know, but these things happen. It would be really nice if this (and perhaps the term-level name quoter as well) could be exchanged for some kind of 'normal' sigil -- @, %, or &, maybe. (In the case of TExps I guess the doubling syntax, while still ugly, isn't as bad, and obviously there's not an infinite supply of good-looking sigils.)

3. m4dc4p -- 2010-10-19 15:53

I am excited to see #2041. It seems that will allow you to splice in top-level data declarations (as requested in #1476 and #2221). Am I right?

4. simonpj -- 2010-10-19 21:59

Brief rejoinders:

  • Yes, Part A encompases #1476; I've made that explicit now
  • I have clarified what I meant about "what reify sees" in Part C
  • I'm making an alternative proposal to that suggested in #2041; see part D
  • I'm open to suggestions about syntax. Merely wishing for better syntax isn't enough :-). There are only so many ASCII characters.
5. Ashley Yakeley -- 2010-10-20 08:52

+1 overall. Are you going to do GADTs as well?

6. reinerp -- 2010-10-23 04:18

If part A is "just about generating syntax trees", is there a reason to implement it in the language, rather than as a library of quasiquoters? Such a library might be useful in any case, as the quasiquoters could also support pattern matching on the syntax trees, rather than just generate them.

7. simonpj -- 2010-10-25 08:16

Ashley: I'm afraid I don't know what you mean by "do GADTs as well". You can already define GADTs in TH.

reinerp: I'm afraid I don't understand your alternative proposal. By all means make one, but you'll need to spell it out at a comparable level of detail.

8. reinerp -- 2010-10-25 10:44

I propose the following as an alternative to Part A.

In addition to the parseHaskell quasiquoter from Part D, also add the following quasiquoters to Language.Haskell.TH:

hsExpr, hsPat, hsDecl, hsTy :: QuasiQuoter

The idea is that what we currently write as [| some_expr |] would in future be written with the hsExpr quasiquoter as [hsExpr| some_expr |]; similarly, we make the following changes:

[t| some_type  |] --> [hsTy|   some_type  |]
[d| some_decls |] --> [hsDecl| some_decls |]
[p| some_pat   |] --> [hsPat|  some_pat   |]

The reasons I suggest this are as follows.

Firstly, such quasiquoters are more powerful than the current TH quotation mechanism. In particular, if they implemented quotePat (rather than just quoteExp), then these quasiquoters could also be used to pattern match on (say) Exps, rather than just construct them. For example, we could implement a simplifier in TH:

simpl :: Exp -> Q Exp
simpl [hsExp| 0 * $x |] = [hsExp| 0 |]
simpl [hsExp| $x * 0 |] = [hsExp| 0 |]
simpl [hsExp| 0 + $x |] = simpl x
...

(this is the same idea as the example "Implementing constant folding for C" from Mainland's paper on quasiquoting). With the current TH quotation mechanism, the pattern matches would have to be written as explicit pattern matches on the TH datatypes, which would be considerably more verbose.

Secondly, if such quasiquoters can be implemented in a library (rather than in the compiler), then why not do that?

I realise that there may be considerable duplication of effort if this proposal were adopted: for example, these quasiquoters would have to be aware of haskell's scoping rules.

Ultimately, I am not sure where I stand regarding this proposal. In my use of TH, I have not wanted to pattern-match on TH datatypes, so my first point above is purely hypothetical. Perhaps this proposal is too much work for not enough reward?

9. guest -- 2010-10-29 04:30

+1 x 1000 for parseHaskell. It's something I've been missing and because of the fixity issues, haskell-src-meta just isn't good enough.

10. carette -- 2010-11-11 16:21
As the principal 'mover' behind Part B, I am very happy with the proposal as-is. As far as syntax goes, while [
expr ] and $$expr are a little verbose, the massive constraints on any new syntax are such that I am quite willing to live with it (certainly in ASCII). If there was also a nicer way to do this in Unicode, that would be nice too.

Comment on Reiner's 25/10 comment: while it seems 'nice' to have such quasiquoters, it turns out that they are not needed for the application you give. One can write such simplification code in metaocaml which does not allow any introspection whatsoever. I am (now) a firm believer that the techniques used for writing such introspection-free simplifiers scale much better than matching-based approaches. Having said that, having these quasiquoters written as library code instead of in the compiler is indeed something worth considering.

Small comment on Part C: wouldn't the rule as stated encourage people to insert dummy top-level splices as a 'chunking' mechanism, to achieve their aims? Perhaps this points to a need to have such a 'chunking' mechanism? For Haskell, modules are used for this, but this seems heavyweight.

11. Ashley Yakeley -- 2010-12-16 08:02

I think it is possible to write an "instance Applicative TExp"?

12. guest -- 2011-02-28 02:33

It seems backwards to have the more verbose syntax for the type-safe constructs. The more restricted and safer way of doing things should be the simplest and best looking way as well.

13. tov -- 2011-03-03 21:25

Part A would let me do several useful things with TH that I couldn’t in the past, so I’m all in favor of that. It took me a while to realize that the purpose of quotations was to help implementors of TH-based libraries construct syntax for splicing, rather than to allow users of such libraries construct syntax to be passed to some extension that subsequently transforms it. With part A, you can do both.

Part B seems good for preventing incomprehensible error messages about code that a programmer didn’t directly write, but I wonder if it would also be useful to give TH a way to hook into the error message generating mechanism. There are probably useful extensions that aren’t expressible using MetaML-style typed quotations, but for which extension implementers could specify helpful type errors if they had a way to do so. This is very hard, though.

Regarding part C, if it were up to me, reification boundaries would align with module boundaries. Top-level splices currently—and in this proposal, as far as I can tell—force a particular order on source files that can make it very difficult to get mutual recursion involving several top-level declaration splices and other top-level declarations. Another approach would be to expand all splices in a source file before renaming or type checking any of it, but this at odds with the desire to use reification to generate things such as typeclass instances. (I also don’t know how that would mess with hygiene, but Scheme manages to have hygiene by interleaving expansion and renaming in a way that avoids ordering restrictions like TH has.)

Another +1 for the parseHaskell quasiquoter. It would be extraordinarily useful. I’ve written my own lame version of it when needed, but the real thing would be much better!

14. sabaki -- 2011-05-21 22:13

First of all, thank you so much for TH. I have been neck deep in it for the last month and loving it, despite some current infelicities. And when I like things I talk about them. Trevor "borrowed" one of my Galois business cards, and it came back with Chief Scientist crossed out and replaced with Template Haskell Salesman. :-)

Detailed comments follow. John


Part A.

I'd be happy to cease typechecking TH quotes altogether, and don't particularly bother with any ad hoc special cases, because those are never the tricky ones anyway. So much of the challenging stuff is done on expanded code anyway, that the TH splices-check is almost irrelevant. However, please have -ddump-splices print out legal, parseable Haskell. Sometimes for debugging I replace a splice with its expansion to have more direct interaction with ghc. Similarly, please please enable an output of a spliced module (even--especially--if typechecking fails) so that the full program can be read in context. I think that's important from a documentation perspective. It is also very helpful for any kind of verification.

Yes, to adding pattern and local declaration splices. I kept trying to do this. It is a very natural thing to want. Follows the principle of orthogonality.

Yes to run TH splices in the renamer. The import restriction on splice code is just fine. I never had a problem with it.

By the way, I would go back to requiring language splices to have the $ sign again, thus [$pads| .... |]. I know others don't like it, but it's more consistent with the rest of the design, and is an important visual marker compared with say [pads| .... ] (i.e. the list comprehension). For example, you cannot distinguish

[foo| x <- [1..10], y <- [1..10], let foo = x*y ] [foo| x <- [1..10], y <- [1..10], let foo = x*y |]

until you get right to the *end* of the construct. Bad bad bad. Note, you'd also want to do [$t| ... |] etc. for type quotes etc.

Part B.

It's a shame that the typed part cannot simply be defined within TH, in the same way that we have been able to do for pads. I don't see the fundamental issue at this point. In particular, would it work to have [$typed| .... |] and return a Q (TyExp? a)? You could still limit what kinds of splices were permitted inside a $typed quoter (e.g., not provide access to newName, etc). Perhaps it requires the splicing mechanism to be definable in whether specific programmable splices (like pads etc) to be renamer splices (returning Q Exp), or typechecker splices (returning Q (TyExp? a)).

I don't see why you cannot provide access to the constructors of TExp. You seem to imply that it would cause problems with strong typing guarantees. I don't see it.

Restricting the typed part to terms only makes sense. Also not permitting mixing of renamer splices with typechecker splices.

Part C.

Oh no! Horrible... Haskell always lets us reorder things without concern. If I understand your proposal right, this would break it. It's a recipe for disaster: I move a definition around in a module, and suddenly my program breaks because *somewhere else* in the depths of the module a TH expansion was relying on the textual order. I have lost the property of local idependence. No no no.

Maybe the real issue is exactly when reify is called. If the splices are expanded at rename-time, then reify will need to be delayed until type-checking. This will mean that you cannot introduce new names or dependencies based on reify, but maybe that's okay. Then we can still do topological sorting to structure the program into dependencies, and reify gets to see the type environment as constructed by all the previous strongly connected components. I think this meets the goal you have of having a clearly defined and completed type environment, but still preserves standard Haskell reasoning across the rest of the module. I think the same approach applies to reified expressions.

But, even if this doesn't work, the current proposal runs so contrary to Haskell that I think its a non-starter.

15. carette -- 2011-06-17 16:01

Has this proposal (especially part B) made it into a plan? I just checked the Trac tickets and did not see anything 'obvious'.

16. heisenbug -- 2012-05-01 15:08

I am interested in rewriting a DSL (which is an expression quotation, or a list thereof) and raised #6062. Since my DSL brings custom typing rules, I use a splice to transform the AST by inserting apply nodes. The latter are methods of a typeclass, so I can graft on an external type system for each of my dialects. For this to work, I need expression quotations to have the _syntax_ of Haskell (preferably the GHC parser proper for nice error messages), but not the typechecker. The latter totally gets into my way. My intention however is that after the splice the regular Haskell typechecker goes through.

17. heisenbug -- 2012-05-01 15:11

typo above "depdenency analysis" (...feel free to delete this comment)

18. JonasDuregard -- 2012-10-11 13:35

Why do we need to extend the type environment in typecheck? If it is mainly to handle data types and functions which are being generated but have not yet been spliced I would suggest this function (or some variant, NB I have no idea if this is remotely possible):

splice :: Q [Dec]-> Q ()

It takes a set of declarations and tries to splice them as a separate splice preceding the one we are currently building. So these programs are equivalent:

$(splice qxs >> qys)
$(qxs)
$(qys)

This would be very useful in situations where we are both generating a data type and instances for that type. Often there is a TH program that builds instances given a name, but they assume that the data type definition can be reified. It would also be extremely useful to be able to recover from a failed splice, i.e. trysplice :: Q [Dec] -> Q (Maybe String).

Also if we can have the typecheck function, what about this one:

typeCheckT :: Exp -> Q (Either (String) (TExp a))