Opened 4 years ago

Last modified 20 months ago

#4372 new bug

Extending quasiquotation support

Reported by: simonpj Owned by:
Priority: normal Milestone: 7.6.2
Component: Template Haskell Version: 6.12.3
Keywords: Cc: kfisher@…, gershomb@…, michael@…, pho@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty:
Test Case: Blocked By:
Blocking: Related Tickets:

Description (last modified by simonpj)

Gershom Bazerman (gershomb@…) writes: Attached is an experimental patch (not ready for prime-time) that extends quasiquotation syntax. At the moment, quasiquoters can only be single identifiers of type QuasiQuoter (and of course declared outside the current module). This patch allows an entire expression of type QuasiQuoter in the quoter position. The staging restriction is extended to all free variables in the quasiquoter expression.

So if qq1 :: Int -> QuasiQuoter, one can now write [$qq1 12 | ... |]
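As a rough illustration of what a quoter of this shape might look like, here is a minimal sketch. The behaviour of qq1 (repeating the quoted text n times) is invented for the example and is not taken from the patch. Because a quoter cannot be used in the module that defines it, the sketch calls quoteExp directly and runs it with runQ instead of using quotation syntax.

```haskell
import Language.Haskell.TH (Exp (..), Lit (..), litE, pprint, runQ, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical parameterized quoter: the quotation expands to a string
-- literal containing the quoted text repeated n times.
qq1 :: Int -> QuasiQuoter
qq1 n = QuasiQuoter
  { quoteExp  = \src -> litE (stringL (concat (replicate n src)))
  , quotePat  = unsupported "pattern"
  , quoteType = unsupported "type"
  , quoteDec  = unsupported "declaration"
  }
  where
    unsupported ctx _ = fail ("qq1: not usable in a " ++ ctx ++ " context")

-- Run the quoter outside of a splice and return the generated expression.
expand :: QuasiQuoter -> String -> IO Exp
expand qq src = runQ (quoteExp qq src)

main :: IO ()
main = do
  e <- expand (qq1 2) "ab"
  putStrLn (pprint e)
  print (e == LitE (StringL "abab"))
```

Under the proposed syntax, the same expansion would be written in a client module as [$qq1 2 | ... |].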

This syntax would be quite useful for my own project (jmacro), and from discussions at ICFP, to others who also are beginning to take serious advantage of quasiquotation.

Here's one use case. Suppose jmt is a QuasiQuoter which returns both a parsed Javascript expression and its "module signature" (or "typing environment" if you prefer). Then, one can pass that typing environment directly into another quasiquoter in a different module, so that further code can be typechecked at compile-time with functions defined in the first quasiquote available to the Javascript typechecker. This style of programming is currently possible, but it requires defining additional new modules which contain QuasiQuoters parameterized with each successive typing environment.

There are a number of tricky choices made in this patch, not all of which are perhaps correct.

First, the quoter and quotation are currently lexed and parsed as a single token. To avoid reinvoking the lexer and/or parser, '[$' is now lexed as one token, and a flag is set in the lexer state so that, on encountering the first vbar, the |...|] (i.e. the quotation) is lexed as a single quotation. This means that guards can't be used inside a quasiquoter expression -- i.e. [$let x | True = 1 in qq1 x|..|] would fail to parse. This also means that while in GHC 7 one can now write [qq|..|], to parse a full expression rather than an identifier we need the dollar sign as well.
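The first-vbar rule can be modelled with a toy splitter. This is only a sketch of the rule as described above, not the patch's lexer (GHC's lexer is generated by Alex and works on tokens, not on whole strings); the names splitExtendedQuote and takeUntilClose are made up for the example.

```haskell
import Data.List (isPrefixOf, stripPrefix)

-- Split "[$quoter | quotation |]" into (quoter text, quotation text).
-- The quoter ends at the FIRST vertical bar; the quotation ends at the
-- first "|]" after that.
splitExtendedQuote :: String -> Maybe (String, String)
splitExtendedQuote src = do
  rest <- stripPrefix "[$" src
  (quoter, '|' : body) <- Just (break (== '|') rest)
  quote <- takeUntilClose body
  pure (quoter, quote)

-- Everything up to (but excluding) the first "|]", or Nothing if the
-- quotation is never closed.
takeUntilClose :: String -> Maybe String
takeUntilClose s
  | "|]" `isPrefixOf` s = Just ""
  | otherwise = case s of
      []     -> Nothing
      c : cs -> (c :) <$> takeUntilClose cs

main :: IO ()
main = do
  print (splitExtendedQuote "[$qq1 12 | hello |]")
  -- The guard's vbar is taken as the end of the quoter, so the quoter
  -- text becomes "let x ", which is not a complete expression.
  print (splitExtendedQuote "[$let x | True = 1 in qq1 x|..|]")
```

In the second call the split happens at the guard, which is why such an expression cannot be parsed under this scheme.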

The former problem (stealing guards within quasiquoter expressions) can be fixed by a syntax change which moves from a single vbar to a new symbol (perhaps $| or || ?) to signal the transition between the quoter expression and the quote itself. I tend to feel that the loss of guards within quasiquoter expressions is not too painful, however. Adding a new symbol between quoter and quotee also would simplify the necessary changes to the lexer, removing the need to set a special flag in the lexer state.

The second problem (need to reintroduce a dollar) is somewhat irritating, but not terribly so. One could either introduce the dollar in all cases, which allows simplifying the lexer and parser, or keep the dollarless syntax as well for single identifiers, which adds both complexity and duplicate syntax, but keeps the default case especially lightweight.

The patch as it stands introduces "extended quasiquotations" as an orthogonal change that doesn't affect the existing quasiquotation machinery, and, at the moment, only allows for quasiquotations of expressions (i.e., not patterns, etc.).

If there is sentiment that this is useful and could be accepted, modulo whatever requested tweaks, I'd be happy to do whatever work is necessary -- implementing changes, adding documentation, pushing the changes through to quasiquoters in other positions, etc.

Attachments (1)

extend-qq.patch (160.4 KB) - added by simonpj 4 years ago.
Draft patch from Gershom


Change History (15)

Changed 4 years ago by simonpj

Draft patch from Gershom

comment:1 Changed 4 years ago by simonpj

  • Description modified (diff)

comment:2 follow-up: Changed 4 years ago by simonpj

Seems plausible to me. As you say, there are details to be worked out. I didn't understand your use-case though. Could you give it in a bit more detail; it's important, since it's really the justification for the change.

ccing Kathleen who is a heavy QQ user.

Simon

comment:3 Changed 4 years ago by simonpj

  • Cc gershomb@… added

comment:4 in reply to: ↑ 2 Changed 4 years ago by gershomb

Here's an attempt to better explain the use case.

jmt :: [ModuleSig] -> QuasiQuoter

-- file 1:
(jsModule,jsModuleSig) = [$jmt []| ... jscode ... |]

-- jsModule is statically typechecked by the jmt quasiquoter

-- file 2:
(jsModule2,jsModuleSig2) = [$jmt [jsModuleSig]| ... jscode ... |]

-- jsModule2 is statically typechecked by the jmt quasiquoter, with the types from jsModule already brought into scope. 
-- That is to say, the list of module signatures acts as a set of import statements.

In general, any time a quasiquoter needs to be parameterized by a potentially complex value, this should prove useful.
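To make the "signatures as imports" idea concrete, here is a heavily simplified sketch. It is not jmacro's implementation: ModuleSig is reduced to a list of names, the "Javascript" is just whitespace-separated identifiers, and the quoter returns only the list of names instead of a (module, signature) pair. All of these are assumptions made for illustration.

```haskell
import Language.Haskell.TH (listE, litE, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical stand-in for a typing environment: the names a module exports.
newtype ModuleSig = ModuleSig [String]

-- Names used in the quotation that none of the imported signatures provide.
unresolved :: [ModuleSig] -> String -> [String]
unresolved sigs src = filter (`notElem` inScope) (words src)
  where
    inScope = concat [names | ModuleSig names <- sigs]

-- Toy jmt: fail at compile time when the quotation mentions a name that
-- is not in scope, otherwise splice in the list of names.
jmt :: [ModuleSig] -> QuasiQuoter
jmt sigs = QuasiQuoter
  { quoteExp  = \src -> case unresolved sigs src of
      []      -> listE (map (litE . stringL) (words src))
      missing -> fail ("jmt: not in scope: " ++ unwords missing)
  , quotePat  = \_ -> fail "jmt: expressions only"
  , quoteType = \_ -> fail "jmt: expressions only"
  , quoteDec  = \_ -> fail "jmt: expressions only"
  }

main :: IO ()
main = do
  let stdlib = ModuleSig ["alert", "log"]
  print (unresolved [stdlib] "alert log")
  print (unresolved [] "alert log")
```

With the proposed syntax, a client module would choose its "imports" at the use site, e.g. [$jmt [stdlib] | alert log |], without a helper module per combination.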

comment:5 Changed 4 years ago by simonpj

But would it really be so bad to say this?

-- file 1:
(jsModule,jsModuleSig) = [$jmt []| ... jscode ... |]
jmt2 = jmt [jsModuleSig]

-- file 2:
(jsModule2,jsModuleSig2) = [jmt2| ... jscode ... |]

comment:6 Changed 4 years ago by gershomb

Not for the simple case, no. But these are module signatures, and import lists. So if we have, e.g., a collection of modules composing a standard library, which individual modules may want to mix and match, it seems awkward to have to define a new quasiquoter for every combination.

One can always get the desired functionality at the cost of an extra module, but if the idiom becomes commonly used in the codebase, then all of the extra modules defining different combinations of arguments could become fairly noisy and confusing.

comment:7 Changed 4 years ago by kfisher

I can see the utility of such a change, but I would want to ensure that the syntax stays very light for the forms that are currently supported. I.e., for my simple use, I want to be able to continue to use the syntax

[pads| <pads-syntax> |]

I don't want to have to use a $ before pads and I don't want to have a heavier-weight divider between the quoter and the quoted material. I can imagine allowing the richer forms by supporting also the more verbose syntax

[$quoter-expression $| quoted expression |]

or something along those lines.

Kathleen

comment:8 Changed 4 years ago by simonpj

It's also worth reminding ourselves that quasi-quote syntax is only shorthand for a TH splice. Thus

[pads| <blah> |]

means

$(quoteExp pads "<blah>")

So in the case proposed by Gershom you could say

$(quoteExp (jmt [jsModuleSig]) "<blah>")

and away you go. Is that so bad?
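For reference, a small sketch of the splice form with a quoter-valued expression in the quoter position. The quoter here (upper-casing the quoted text) is invented for the example, and it is built inline only because the stage restriction prevents a splice from running a function defined in the same module; in real code the expression would be something imported, such as jmt [jsModuleSig].

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Data.Char (toUpper)
import Language.Haskell.TH (litE, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- The splice runs quoteExp on an arbitrary expression of type QuasiQuoter.
-- Every free variable inside the splice is imported, as staging requires.
shouted :: String
shouted =
  $( quoteExp
       ( QuasiQuoter
           { quoteExp  = litE . stringL . map toUpper
           , quotePat  = \_ -> fail "expressions only"
           , quoteType = \_ -> fail "expressions only"
           , quoteDec  = \_ -> fail "expressions only"
           } )
       "<blah>" )

main :: IO ()
main = putStrLn shouted
```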

At the moment you can't do TH splices in patterns or local declarations, whereas you can use quasi-quotes. I have separate plans to change that (need time to write up), but let's assume for the sake of argument that patterns can be done the same way as expressions.

comment:9 Changed 4 years ago by gershomb

This is all shorthand, you're right. But the syntactic sugar makes a huge difference here. In quasiquote syntax, the only quotation issue that an end user has to worry about is the bar followed by close bracket. In standard splice syntax, the quoted expression is forced to respect all the rules of Haskell. Multiline strings either use the \ syntax or have to be written with explicit appends. Double quotes need additional escaping, etc. Furthermore, error messages can't give nearly as nice locality.

With quasiquotes, it's possible to write very fluently in an embedded DSL with almost arbitrary concrete syntax. Template Haskell splices on their own make doing so very painful.
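To illustrate the escaping burden, here is the same two-line snippet written as an ordinary Haskell string literal in the two ways mentioned above. The Javascript text is made up for the example.

```haskell
-- Intended text, as it could appear verbatim inside a quasiquote:
--   var s = "hi";
--   alert(s);

-- String-gap form: the backslashes at the end of one line and the start
-- of the next delimit a gap that is dropped from the string.
viaGaps :: String
viaGaps = "var s = \"hi\";\n\
          \alert(s);"

-- Explicit-append form.
viaAppends :: String
viaAppends = "var s = \"hi\";\n" ++ "alert(s);"

main :: IO ()
main = do
  putStrLn viaGaps
  print (viaGaps == viaAppends)
```

Both forms need escaped double quotes and an explicit newline, which is the noise a quotation avoids.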

Most of the libraries using quasiquotation that I now know of would be extremely painful to use without the shorthand syntax provided by quasiquotation.

In general, Kathleen's preference is fine -- super lightweight syntax for the simple case, and extended syntax for the extended case. This causes a bit more work at the lexing level, and in the parser (but only a few lines), but subsequent to that there's a single code path which is as simple as or simpler than what exists now.

In fact, '[$' for introducing an extended quoter and '|' alone for introducing the quotee, which is what I've now implemented, seems fine to me as well, unless there is strong sentiment otherwise.

comment:10 Changed 4 years ago by snoyberg

  • Cc michael@… added

Just to add another use case: Hamlet uses quasi-quotation for HTML templates. There are currently two parameters to the hamletWithSettings function: whether to close tags like HTML or XHTML (eg, <br> vs <br/>) and the doctype. Hamlet itself defines two quasi-quoters built on hamletWithSettings: hamlet and xhamlet.

But it's easily imaginable that someone will want to create some other combination (such as HTML 3.2) that is not currently provided. Currently, they would need to define their new quasi-quoter in a separate module; this proposal would appear to make that process a little bit nicer.

In fact, if I understand the proposal properly, I had originally assumed this is how quasi-quoters worked and was surprised when the compiler disagreed with me ;).

comment:11 Changed 3 years ago by igloo

  • Milestone set to 7.2.1

comment:12 Changed 2 years ago by PHO

  • Cc pho@… added

comment:13 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1

comment:14 Changed 20 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2