Opened 5 months ago

#8524 new bug

GHC is inconsistent with the Haskell Report on which Unicode characters are allowed in string and character literals

Reported by: oerjan
Owned by:
Priority: normal
Milestone:
Component: Compiler
Version: 7.6.3
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

GHC is inconsistent with the Haskell Report on which Unicode characters are allowed in string and character literals. (And I don't like either option: why leave any characters out of strings unnecessarily?)

Examples from ghci 7.6.3 (also tested in lambdabot on irc):

Prelude> "​" -- Unicode char \8203, Format class.

<interactive>:10:2:
    lexical error in string/character literal at character '\8203'
Prelude> " " -- Unicode char \8202, Space class.
"\8202"
Prelude> "t\ \est" -- Unicode char \8202 in a string gap.

<interactive>:14:4:
    lexical error in string/character literal at character '\8202'
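The two characters in the session above are invisible when typed literally, so it may help to check their classes with `Data.Char` from base and to spell them with numeric escapes, which the report's `escape` production covers. This is only a minimal sketch; the names `zeroWidthSpace`, `hairSpace`, `isFormatChar`, `isSpaceClassChar` and `escaped` are made up for illustration.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- U+200B ZERO WIDTH SPACE (decimal 8203), written as a numeric escape.
zeroWidthSpace :: Char
zeroWidthSpace = '\8203'

-- U+200A HAIR SPACE (decimal 8202), written as a numeric escape.
hairSpace :: Char
hairSpace = '\8202'

-- True for characters in the Unicode Format class (Cf).
isFormatChar :: Char -> Bool
isFormatChar c = generalCategory c == Format

-- True for characters in the Unicode Space class (Zs).
isSpaceClassChar :: Char -> Bool
isSpaceClassChar c = generalCategory c == Space

-- "t", hair space, "est": five characters, spelled entirely in ASCII.
escaped :: String
escaped = "t\8202est"
```

Loaded into ghci, `isFormatChar zeroWidthSpace` and `isSpaceClassChar hairSpace` should both be `True`, which matches the class names given in the comments of the session.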

My reading of http://www.haskell.org/onlinereport/haskell2010/haskellch2.html
(section 2.2 and 2.6):

  • The report BNF token "graphic", which can be used in literals, indirectly includes many Unicode classes, but uniWhite is not one of them. Thus the only Unicode whitespace allowed to represent itself in literals is ASCII space.
  • Unicode formatting characters are not mentioned in the BNF that I can see, so are not allowed in literals.
  • String gaps are made out of the report BNF token whitespace, which does include uniWhite.
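The three points above can be approximated as predicates over `Char`. This is a rough sketch of my reading, not a faithful lexer: `reportAllowsInString` and `reportAllowsInGap` are hypothetical names, and mapping the report's uniSmall, uniLarge, uniDigit, uniSymbol and uniWhite onto `Data.Char` functions is an assumption.

```haskell
import Data.Char
  ( GeneralCategory (DecimalNumber)
  , generalCategory
  , isLower
  , isPunctuation
  , isSpace
  , isSymbol
  , isUpper
  )

-- Approximation of the report's "graphic" token: small and large letters,
-- digits, and symbols or punctuation. Assumption: isLower, isUpper,
-- DecimalNumber and isSymbol/isPunctuation stand in for the report's
-- uniSmall, uniLarge, uniDigit and uniSymbol.
isGraphic :: Char -> Bool
isGraphic c =
  isLower c
    || isUpper c
    || generalCategory c == DecimalNumber
    || isSymbol c
    || isPunctuation c

-- May the character represent itself inside a string literal?
-- Graphic characters other than '"' and '\\', plus the ASCII space.
reportAllowsInString :: Char -> Bool
reportAllowsInString c = c == ' ' || (isGraphic c && c /= '"' && c /= '\\')

-- May the character appear inside a string gap?
-- Assumption: isSpace stands in for the report's whitechar (incl. uniWhite).
reportAllowsInGap :: Char -> Bool
reportAllowsInGap = isSpace
```

Under this approximation `'\8202'` is rejected inside a string but accepted in a gap, and `'\8203'` is rejected in both.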

Who wants what:

                              GHC  Report  Me
Format in string              No   No      Yes
Space/uniWhite in string      Yes  No      Yes
Space/uniWhite in string gap  No   Yes     Dunno

In short, GHC's behavior is buggy and/or annoying in two opposite ways:

  • It rejects some Unicode characters (Format-class ones such as \8203) in string and character literals, presumably because the report says so.
  • It allows some characters the report says it shouldn't (Unicode Space characters inside a string), and refuses some characters the report says it should allow (Unicode Space characters inside a string gap).
