Opened 3 years ago

Last modified 16 months ago

#5239 new feature request

Em-dash for "--" with UnicodeSyntax.

Reported by: Eelis-
Owned by:
Priority: normal
Milestone: 7.6.2
Component: Compiler (Parser)
Version: 7.0.3
Keywords: unicode syntax extension
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

It would be neat if the UnicodeSyntax extension supported the Unicode "—" EM DASH (U+2014) character as an alternative to the "--" character sequence that introduces a single-line comment.

One possible objection I can anticipate is that its use could be confusing when using a monospace font, but it seems unjust to let that hold back those of us who have liberated ourselves from monospace. :-)

Attachments (1)

mdash.patch (5.1 KB) - added by porges 3 years ago.
patch


Change History (8)

comment:1 Changed 3 years ago by igloo

  • Component changed from Compiler to Compiler (Parser)
  • Milestone set to 7.4.1

Changed 3 years ago by porges

patch

comment:2 Changed 3 years ago by porges

  • Status changed from new to patch

I have added a tentative patch. It works fine, but if UnicodeSyntax is disabled then the lexer gives an error upon encountering the mdashes: unexpected character '\n', or something like that. I thought that this was because of the 'Unicode fix' which transforms mdash into \x7, but Alex always complains about \n, not whatever it gets transformed into.

I'm not sure how to fix that, so I'm attaching it here in the hopes that someone else knows.

I also added some extra checking to the 'not in scope' error (in RnEnv.lhs) that suggests that users might want to enable UnicodeSyntax if compilation fails because an mdash isn't in scope. A further extension would be for this to happen when any UnicodeSyntax character turns up here. (This can't be seen at the moment because of the aforementioned issue, but it works fine if only this part is enabled.)

comment:3 Changed 3 years ago by porges

Figured out what was wrong with my patch. The '$mdash' declaration needs to be in the $symbol character class. After that change, all works as expected.
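
For readers following along, here is a minimal sketch of why the symbol class could matter. It is not taken from mdash.patch and is not GHC's Alex specification; `isSymbolChar`, `opensComment` and `stripComment` are hypothetical names, and string literals are deliberately ignored. The assumption is that a comment opener is only recognised when it makes up a whole maximal run of symbol characters, so the em dash has to count as a symbol character for a run such as "—>" to stay an operator.

```haskell
import Data.Char (isAscii, isPunctuation, isSymbol)

-- U+2014 EM DASH, written as an escape so the file stays ASCII.
emDash :: Char
emDash = '\x2014'

-- Hypothetical stand-in for the lexer's symbol class:
-- the Haskell 2010 ASCII symbols, plus Unicode symbols and punctuation.
isSymbolChar :: Char -> Bool
isSymbolChar c
  | isAscii c = c `elem` "!#$%&*+./<=>?@\\^|-~:"
  | otherwise = isSymbol c || isPunctuation c

-- A maximal run of symbol characters opens a comment if it is two or
-- more ASCII dashes, or (with the flag on) exactly one em dash.
opensComment :: Bool -> String -> Bool
opensComment unicodeSyntax run =
  asciiDashes || (unicodeSyntax && run == [emDash])
  where
    asciiDashes = length run >= 2 && all (== '-') run

-- Drop a trailing line comment from one source line.
-- Simplification: string and character literals are not handled.
stripComment :: Bool -> String -> String
stripComment unicodeSyntax = go
  where
    go [] = []
    go s@(c : cs)
      | isSymbolChar c =
          let (run, rest) = span isSymbolChar s
          in if opensComment unicodeSyntax run then [] else run ++ go rest
      | otherwise = c : go cs
```

With the flag off, a lone em dash is kept as an ordinary symbol run, which would then presumably surface as an operator that is not in scope, matching the 'not in scope' hint described in comment:2.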

comment:4 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1

comment:5 Changed 21 months ago by simonpj

  • Difficulty set to Unknown
  • Status changed from patch to new

Dear porges

Sorry that we've been playing dead on this.

We don't have an opinion either way, but it's not entirely clear to us that everyone would welcome such a change; e.g. they might want to use em-dash in an operator.

Could you initiate a thread on glasgow-haskell-users to see if other Unicode-aware folk actively want the change? If so, we'll apply it. A final patch would be useful, and it should include documentation in section 7.3.1 of the user manual.

Thanks

Simon

comment:6 Changed 19 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:7 Changed 16 months ago by guest

I just want to speak out in support of this feature. I would prefer en-dashes over em-dashes though. (That would be consistent with the TeX convention of -- for en-dashes and --- for em-dashes.)
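
As a rough check on the operator concern from comment:5 and the en-dash suggestion above, the sketch below asks Data.Char how the two dashes are classified. Haskell 2010 defines uniSymbol as any Unicode symbol or punctuation, so a character that passes this test can, in principle, already appear in operator names. `isUniSymbol` and `dashReport` are illustrative names, and the check only approximates the Report's rule (it does not exclude the special characters).

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isPunctuation, isSymbol)

-- U+2013 EN DASH and U+2014 EM DASH, written as escapes.
enDash, emDash :: Char
enDash = '\x2013'
emDash = '\x2014'

-- Approximation of the Haskell 2010 "uniSymbol" rule:
-- any Unicode symbol or punctuation character.
isUniSymbol :: Char -> Bool
isUniSymbol c = isSymbol c || isPunctuation c

-- Name, Unicode general category, and whether the character could
-- already be part of an operator under the approximation above.
dashReport :: [(String, GeneralCategory, Bool)]
dashReport =
  [ (name, generalCategory c, isUniSymbol c)
  | (name, c) <- [("EN DASH", enDash), ("EM DASH", emDash)]
  ]
```

Both dashes fall in the dash-punctuation category, so either choice would take a character that operator definitions can currently use.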
