Opened 5 years ago

Last modified 5 months ago

#4372 new bug

Accept expressions in left-hand side of quasiquotations

Reported by: simonpj
Owned by:
Priority: normal
Milestone: 7.12.1
Component: Template Haskell
Version: 6.12.3
Keywords:
Cc: kfisher@…, gershomb@…, michael@…, pho@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description (last modified by simonpj)

Gershom Bazerman (gershomb@…) writes: Attached is an experimental patch (not ready for prime time) that extends quasiquotation syntax. At the moment, quasiquoters can only be single identifiers of type QuasiQuoter (and of course declared outside the current module). This patch allows an entire expression of type QuasiQuoter in the quoter position. The staging restriction is extended to all free variables in the quasiquoter expression.

So if qq1 :: Int -> QuasiQuoter, one can now write [$qq1 12 | ... |]
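To make the shape of such a quoter concrete, here is a minimal sketch of what a qq1 :: Int -> QuasiQuoter might look like. The behaviour (parse the quoted text as an integer and add the parameter) is purely an assumption for illustration and is not taken from the patch. The quoter is run through runQ in IO, so the example fits in one module without hitting the staging restriction.

```haskell
module Main where

import Data.Char (isSpace)
import Language.Haskell.TH (Exp (..), Lit (..), runQ)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical quoter: parses the quoted text as an integer and adds the
-- offset it was parameterized with.
qq1 :: Int -> QuasiQuoter
qq1 offset = QuasiQuoter
  { quoteExp = \src -> case reads src of
      [(k, rest)] | all isSpace rest ->
        pure (LitE (IntegerL (k + fromIntegral offset)))
      _ -> fail ("qq1: not an integer: " ++ show src)
  , quotePat  = \_ -> fail "qq1: patterns not supported"
  , quoteType = \_ -> fail "qq1: types not supported"
  , quoteDec  = \_ -> fail "qq1: declarations not supported"
  }

main :: IO ()
main = do
  -- Roughly what [$qq1 12 | 30 |] would expand to under the proposal.
  e <- runQ (quoteExp (qq1 12) " 30 ")
  print e  -- prints: LitE (IntegerL 42)
```

Without the extension, a client has to bind qq1 12 to a name in a separate module and quote with that name instead.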

This syntax would be quite useful for my own project (jmacro), and from discussions at ICFP, to others who also are beginning to take serious advantage of quasiquotation.

Here's one use case. Suppose jmt is a QuasiQuoter which returns both a parsed Javascript expression and its "module signature" (or "typing environment" if you prefer). Then, one can pass that typing environment directly into another quasiquoter in a different module, so that further code can be typechecked at compile-time with functions defined in the first quasiquote available to the Javascript typechecker. This style of programming is currently possible, but it requires defining additional new modules which contain QuasiQuoters parameterized with each successive typing environment.

There are a number of tricky choices made in this patch, not all of which are perhaps correct.

First, currently, the quoter and quotation are lexed and parsed as a single token. To avoid reinvoking the lexer and/or parser, now '[$' is lexed as one token, and a flag is set in the lexer state which lexes the |...|] (i.e. the quotation) as a single quotation on encountering the first vbar. This means that guards can't be used inside a quasiquoter expression -- i.e. [$let x | True = 1 in qq1 x|..|] would fail to parse. This also means that while GHC 7 now lets one write [qq|..|], parsing a full expression rather than a single identifier still requires the dollar sign.

The former problem (stealing guards within quasiquoter expressions) can be fixed by a syntax change which moves from a single vbar to a new symbol (perhaps $| or || ?) to signal the transition between the quoter expression and the quote itself. I tend to feel that the loss of guards within quasiquoter expressions is not too painful, however. Adding a new symbol between quoter and quotee also would simplify the necessary changes to the lexer, removing the need to set a special flag in the lexer state.

The second problem (need to reintroduce a dollar) is somewhat irritating, but not terribly so. One could either introduce the dollar in all cases, which allows simplifying the lexer and parser, or keep the dollarless syntax as well for single identifiers, which adds both complexity and duplicate syntax, but keeps the default case especially lightweight.

The patch as it stands introduces "extended quasiquotations" as an orthogonal change that doesn't affect the existing quasiquotation machinery, and, at the moment, only allows for quasiquotations of expressions (i.e., not patterns, etc.).

If there is sentiment that this is useful and could be accepted, modulo whatever requested tweaks, I'd be happy to do whatever work is necessary -- implementing changes, adding documentation, pushing the changes through to quasiquoters in other positions, etc.

Attachments (1)

extend-qq.patch (160.4 KB) - added by simonpj 5 years ago.
Draft patch from Gershom


Change History (24)

Changed 5 years ago by simonpj

Draft patch from Gershom

comment:1 Changed 5 years ago by simonpj

  • Description modified (diff)

comment:2 follow-up: Changed 5 years ago by simonpj

Seems plausible to me. As you say, there are details to be worked out. I didn't understand your use-case though. Could you give it in a bit more detail; it's important, since it's really the justification for the change.

ccing Kathleen who is a heavy QQ user.

Simon

comment:3 Changed 5 years ago by simonpj

  • Cc gershomb@… added

comment:4 in reply to: ↑ 2 Changed 5 years ago by gershomb

Here's an attempt to better explain the use case.

jmt :: [ModuleSig] -> QuasiQuoter

-- file 1:
(jsModule,jsModuleSig) = [$jmt []| ... jscode ... |]

-- jsModule is statically typechecked by the jmt quasiquoter

-- file 2:
(jsModule2,jsModuleSig2) = [$jmt [jsModuleSig]| ... jscode ... |]

-- jsModule2 is statically typechecked by the jmt quasiquoter, with the types from jsModule already brought into scope. 
-- That is to say, the list of module signatures acts as a set of import statements.

In general, this should prove useful any time a quasiquoter needs to be parameterized by a potentially complex value.

comment:5 Changed 5 years ago by simonpj

But would it really be so bad to say this?

-- file 1:
(jsModule,jsModuleSig) = [$jmt []| ... jscode ... |]
jmt2 = jmt [jsModuleSig]

-- file 2:
(jsModule2,jsModuleSig2) = [jmt2| ... jscode ... |]

comment:6 Changed 5 years ago by gershomb

Not for the simple case, no. But these are module signatures, and import lists. So if we have, e.g., a collection of modules composing a standard library, which individual modules may want to mix and match, it seems awkward to have to define a new quasiquoter for every combination.

One can always get the desired functionality at the cost of an extra module, but if the idiom becomes commonly used in the codebase, then all of the extra modules defining different combinations of arguments could become fairly noisy and confusing.

comment:7 Changed 5 years ago by kfisher

I can see the utility of such a change, but I would want to ensure that the syntax stays very light for the forms that are currently supported. I.e., for my simple use, I want to be able to continue to use the syntax

[pads| <pads-syntax> |]

I don't want to have to use a $ before pads and I don't want to have a heavier-weight divider between the quoter and the quoted material. I can imagine allowing the richer forms by supporting also the more verbose syntax

[$quoter-expression $| quoted expression |]

or something along those lines.

Kathleen

comment:8 Changed 5 years ago by simonpj

It's also worth reminding ourselves that quasi-quote syntax is only shorthand for a TH splice. Thus

[pads| <blah> |]

means

$(quoteExp pads "<blah>")

So in the case proposed by Gershom you could say

$(quoteExp (jmt [jsModuleSig]) "<blah>")

and away you go. Is that so bad?

At the moment you can't do TH splices in patterns or local declarations, whereas you can use quasi-quotes. I have separate plans to change that (need time to write up), but let's assume for the sake of argument that patterns can be done the same way as expressions.

comment:9 Changed 5 years ago by gershomb

This is all shorthand, you're right. But the syntactic sugar makes a huge difference here. In quasiquote syntax, the only quotation issue that an end user has to worry about is the bar followed by close bracket. In standard splice syntax, the quoted expression is forced to respect all the rules of Haskell. Multiline strings either use the \ syntax or have to be written with explicit appends. Double quotes need additional escaping, etc. Furthermore, error messages can't give nearly as nice locality.
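As a small illustration of the escaping burden described above, the sketch below writes a two-line Javascript snippet as an ordinary Haskell string, the form a plain splice such as $(quoteExp someQuoter "...") would need. The snippet itself is made up for the example.

```haskell
module Main where

-- Inside a quasiquote, this two-line snippet could be written verbatim:
--
--   var greeting = "hello";
--   alert(greeting);
--
-- As an ordinary String, every double quote must be escaped, and the line
-- break needs an explicit \n plus a string gap (backslash, newline,
-- backslash) to continue the literal on the next source line.
snippet :: String
snippet = "var greeting = \"hello\";\n\
          \alert(greeting);"

main :: IO ()
main = putStrLn snippet
```

The escaping grows with the size of the embedded program, which is where the quasiquote form pays off.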

With quasiquotes, it's possible to write very fluently in an embedded DSL with almost arbitrary concrete syntax. Template Haskell splices on their own make doing so very painful.

Most of the libraries using quasiquotation that I now know of would be extremely painful to use without the shorthand syntax provided by quasiquotation.

In general, Kathleen's preference is fine -- super lightweight syntax for the simple case, and extended syntax for the extended case. This causes a bit more work at the lexing level, and in the parser (but only a few lines), but subsequent to that there's a single code path which is as simple as or simpler than what exists now.

In fact, '[$' for introducing an extended quoter and '|' alone for introducing the quotee, which is what I've now implemented, seems fine to me as well, unless there is strong sentiment otherwise.

comment:10 Changed 5 years ago by snoyberg

  • Cc michael@… added

Just to add another use case: Hamlet uses quasi-quotation for HTML templates. There are currently two parameters to the hamletWithSettings function: whether to close tags like HTML or XHTML (eg, <br> vs <br/>) and the doctype. Hamlet itself defines two quasi-quoters built on hamletWithSettings: hamlet and xhamlet.

But it's easily imaginable that someone will want to create some other combination (such as HTML 3.2) that is not currently provided. Currently, they would need to define their new quasi-quoter in a separate module; this proposal would appear to make that process a little bit nicer.

In fact, if I understand the proposal properly, I had originally assumed this is how quasi-quoters worked and was surprised when the compiler disagreed with me ;).

comment:11 Changed 5 years ago by igloo

  • Milestone set to 7.2.1

comment:12 Changed 4 years ago by PHO

  • Cc pho@… added

comment:13 Changed 4 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1

comment:14 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:15 Changed 14 months ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:16 follow-up: Changed 8 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:17 in reply to: ↑ 16 Changed 6 months ago by songzh

Replying to thoughtpolice:

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

Why is this being moved to 7.12? I was excited that it could be resolved in 7.10. I am only a newbie at Haskell, but I want to raise something about this problem. I want to write a type provider for Haskell that generates data type definitions from JSON or OData sources. For example:

{"name" : "Ciambellone Cake",
"ingredients": [{ "name": "Flour",
"quantity": 250,
"measure": "gr" }]}

should generate

data Recipe = Recipe { recipeName :: String,
                       recipeIngredients :: [Ingredient]}
                       
data Ingredient = Ingredient { ingredientName :: String,
                               ingredientQuantity :: Int,
                               ingredientMeasure :: String}

However, you need to give a type name when you do it (Here, it is Recipe). In F#, just write:

type Recipe = JsonProvider<"JSONSample.json">

However, it seems that we cannot do this in TH (I tried but could not; if it is possible, please tell me!):

data Recipe = $(templateHaskellpart)

which means we have to have a parameterized quoter

quotejson :: String -> QuasiQuoter
quotejson name = QuasiQuoter {
        quoteDec = \jsonStr -> gen name (getJSON jsonStr)
        }   
getJSON :: String  -> JValue
gen :: String -> JValue -> Q [Dec]

Being unable to do this means that we have to define a JSON quoter for each type we need, in a separate module, which is painful. We could use quoteExp to solve this; however, the quotation marks and other special characters in JSON or other data sources can be very annoying.
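For reference, here is a minimal runnable sketch of a declaration quoter parameterized by the type name. It is not the JSON provider itself: as a stated simplification, the quoted text is just a whitespace-separated list of field names, each given type String, and recordQuoter and recordDec are hypothetical names. It assumes template-haskell 2.11 or later for the DataD and Bang constructors used.

```haskell
module Main where

import Data.Char (toLower, toUpper)
import Language.Haskell.TH
  ( Bang (..), Con (..), Dec (..), Q, SourceStrictness (..)
  , SourceUnpackedness (..), Type (..), mkName, pprint, runQ )
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical stand-in for the JSON-driven generator: the type name is a
-- parameter, and the quoted text lists the field names.
recordQuoter :: String -> QuasiQuoter
recordQuoter tyName = QuasiQuoter
  { quoteDec  = \src -> pure [recordDec tyName (words src)]
  , quoteExp  = unsupported "an expression"
  , quotePat  = unsupported "a pattern"
  , quoteType = unsupported "a type"
  }
  where
    unsupported :: String -> String -> Q a
    unsupported ctx _ = fail ("recordQuoter: cannot be used as " ++ ctx)

-- Builds: data T = T { tField1 :: String, tField2 :: String, ... }
recordDec :: String -> [String] -> Dec
recordDec tyName fields =
  DataD [] name [] Nothing [RecC name (map field fields)] []
  where
    name = mkName tyName
    field f =
      ( mkName (lowerFirst tyName ++ upperFirst f)
      , Bang NoSourceUnpackedness NoSourceStrictness
      , ConT (mkName "String") )

lowerFirst, upperFirst :: String -> String
lowerFirst (c : cs) = toLower c : cs
lowerFirst []       = []
upperFirst (c : cs) = toUpper c : cs
upperFirst []       = []

main :: IO ()
main = do
  decs <- runQ (quoteDec (recordQuoter "Recipe") "name measure")
  putStrLn (pprint decs)  -- pretty-prints the generated declaration
```

With current GHC, a client still has to write recipe = recordQuoter "Recipe" in one module and [recipe| name measure |] at the top level of another, which is the extra-module cost discussed in this ticket.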

Another problem is that I suppose that the 4 quoters t, p, e, d are implemented in the layer of syntax, i.e. they do not have type QuasiQuoter, so if I want to define a quoter called e or d, it seems to be impossible. I personally think it might be better if the 4 quoters were provided at the library level instead of the syntax level, just like other regular quoters. By which I mean we could define the lexer as follows:

'[' quoterExp '|' strings '|]'

and handle quoterExp afterwards. I do not understand why it is implemented in the current way. Apologies if I said something stupid. :-)

Last edited 6 months ago by songzh

comment:18 follow-up: Changed 6 months ago by simonpj

I'm not against this if someone wants to

  • Write a wiki page explaining the feature
  • Drive a discussion to refine any corners of the design; it's mainly a syntax question I think.
  • Implement a patch on Phabricator; including user manual documentation and some tests.

Another problem is that I suppose that the 4 quoters t, p, e, d are implemented in the layer of syntax

I'm afraid I did not understand this part of your comment. Maybe it's a separate issue?

Simon

comment:19 follow-up: Changed 6 months ago by gershomb

At this point I consider this a "nice idea" but I'm not sure how to modify the parser to handle it within the many grammatical constraints we already have. The somewhat clunky syntax that was pointed out by Simon upthread is actually adequate for this. This, for example is the style adopted by Manuel's language-c-inline:

nslog msg = $(objc ['msg :> ''String] (void [cexp| NSLog(@"Here is a message from Haskell: %@", msg) |]))

Here, we use the quasiquoter to capture the expression, and then embed that quasiquoted expression itself within a TH block, in order to pass in additional information during codegen.

If a lighter-weight syntax was possible, I'd remain all for it. But, I honestly can't think how to provide it given the constraints we have -- my patch at the time didn't really solve the problem, and I haven't had any better ideas since :-)

comment:20 in reply to: ↑ 19 Changed 6 months ago by songzh

Replying to gershomb:

At this point I consider this a "nice idea" but I'm not sure how to modify the parser to handle it within the many grammatical constraints we already have. The somewhat clunky syntax that was pointed out by Simon upthread is actually adequate for this. This, for example is the style adopted by Manuel's language-c-inline:

I am sorry for not being a GHC developer; I will try to become one. I have skimmed the parser in the source code. My understanding is that the quoter does not need to accept every kind of expression, since allowing only function application would be more than enough. If expressions like CondE, MultiIfE, DoE and CompE are not allowed in quoter position, the guarded-expression problem and the other syntax problems vanish. To achieve this, maybe a specific parser for the quoter position is needed. I am just guessing blindly here.

nslog msg = $(objc ['msg :> ''String] (void [cexp| NSLog(@"Here is a message from Haskell: %@", msg) |]))

Here, we use the quasiquoter to capture the expression, and then embed that quasiquoted expression itself within a TH block, in order to pass in additional information during codegen.

If a lighter-weight syntax was possible, I'd remain all for it. But, I honestly can't think how to provide it given the constraints we have -- my patch at the time didn't really solve the problem, and I haven't had any better ideas since :-)

I will look into the language-c-inline package and your patch. Thanks.

comment:21 in reply to: ↑ 18 Changed 6 months ago by songzh

Replying to simonpj:

I'm not against this if someone wants to

  • Write a wiki page explaining the feature
  • Drive a discussion to refine any corners of the design; it's mainly a syntax question I think.
  • Implement a patch on Phabricator; including user manual documentation and some tests.

Another problem is that I suppose that the 4 quoters t, p, e, d are implemented in the layer of syntax

I'm afraid I did not understand this part of your comment. Maybe it's a separate issue? Simon

It is not an issue, only a design decision. I was expecting

> :t e
e :: QuasiQuoter

but it is actually syntax. I was a little bit surprised.

comment:22 Changed 6 months ago by rwbarton

songzh: '[' quoterExp '|' strings '|]' is not really possible since it looks too much like a list comprehension. If you wrote [ x * y | x <- [1,2,3], y <- [4,5] ], is that a list comprehension or the beginning of a quasiquote that might terminate arbitrarily far away?

With quasiquotes as they stand now you cannot write a list comprehension like [x|x<-[1,2,3]]; you must include a non-identifier character before the |. That's a mild imposition, but it is already something, and I don't see how the syntax can be pushed much farther.
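A small sketch of that imposition, assuming only that the QuasiQuotes extension is switched on for the module: the comprehension below parses because a space precedes the bar, whereas the spaceless form in the comment would be lexed as the start of a quasiquote whose quoter is x.

```haskell
{-# LANGUAGE QuasiQuotes #-}
module Main where

-- With QuasiQuotes enabled, the spaceless spelling
--
--   [x|x<-[1,2,3]]
--
-- starts with the token "[x|" and is therefore read as a quasiquote.
-- Putting any non-identifier character (here a space) before the bar
-- keeps it an ordinary list comprehension.
squares :: [Int]
squares = [x * x | x <- [1, 2, 3]]

main :: IO ()
main = print squares  -- prints: [1,4,9]
```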

As for the ticket in general: Quasiquotes offer two conveniences over TH splices: overloading the same name to refer to expression, pattern, etc. splice generators; and raw string literals (terminated by |]). The overloading can be nice, but is never essential (the resolution is purely syntactic anyway) and often you only need the expression quoter. The raw string literal syntax is quite useful, but it's a problem that only needs to be solved once. Specifically, once you have a quasiquoter q that produces string literals, you can rewrite any

[foo|bar|]

as

$(foo [q|bar|])    -- this 'foo' is the old 'quoteExp foo'.
                   -- at worst, you could call it 'fooE' or something.

which is already not that much longer; and any syntax for parameterized quasiquoters would have to be longer than [foo arg|bar|], so there's very little room left for improvement over what we can build with the pieces we have today.

So basically I agree with Gershom: the workaround he suggested is always possible, it's pretty lightweight, and it's hard to see how a new syntax for this would really be worth it, when it can't gain much (and has the opportunity cost of not being able to use the syntax for something else).
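The rewrite described in this comment can be sketched as follows. The raw-string quoter q is written out in full; fooE is a hypothetical generator standing in for quoteExp (foo arg), and what it does (repeat the source text) is made up for the example. Both are run through runQ so that the sketch fits in a single module.

```haskell
module Main where

import Language.Haskell.TH (Exp (..), Lit (..), Q, runQ)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Raw-string quoter: [q|text|] becomes the string literal "text".
q :: QuasiQuoter
q = QuasiQuoter
  { quoteExp  = pure . LitE . StringL
  , quotePat  = \_ -> fail "q: patterns not supported"
  , quoteType = \_ -> fail "q: types not supported"
  , quoteDec  = \_ -> fail "q: declarations not supported"
  }

-- Hypothetical parameterized generator: an ordinary function from an
-- argument and the source text to a Q Exp, so it needs no new syntax.
fooE :: Int -> String -> Q Exp
fooE n src = pure (LitE (StringL (concat (replicate n src))))

main :: IO ()
main = do
  raw <- runQ (quoteExp q "bar")
  out <- runQ (fooE 2 "bar")
  print raw  -- prints: LitE (StringL "bar")
  print out  -- prints: LitE (StringL "barbar")
```

In a client module with TemplateHaskell and QuasiQuotes enabled, and with q and fooE imported from elsewhere to satisfy the staging restriction, the use site would read $(fooE 2 [q|bar|]).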

comment:23 Changed 5 months ago by ezyang

  • Summary changed from Extending quasiquotation support to Accept expressions in left-hand side of quasiquotations

Renamed to something more descriptive.
