Opened 4 years ago

Closed 4 years ago

#7671 closed bug (fixed)

No break spaces

Reported by: zenzike Owned by:
Priority: normal Milestone: 7.8.1
Component: Compiler (Parser) Version: 7.6.2
Keywords: alex lexer Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: T7671
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


I thought I was going mad when the following code wasn't compiling:

{-# LANGUAGE UnicodeSyntax #-}
{-# LANGUAGE RankNTypes #-}

type F f = forall x . f x

GHC was producing the following error message:

    Illegal symbol '.' in type
    Perhaps you intended -XRankNTypes or similar flag
    to enable explicit-forall syntax: forall <tvs>. <type>

It turns out that I had somehow inserted a Unicode no-break space (U+00A0) in the pragma enabling RankNTypes, just after the first #. Baffling.

This raises the question: should GHC treat this Unicode space as an ordinary space when the UnicodeSyntax extension is enabled? If not, should there have been a warning that I had inserted this character in a language pragma?

Change History (3)

comment:1 Changed 4 years ago by igloo

Component: Compiler → Compiler (Parser)
difficulty: Unknown
Keywords: alex lexer added
Milestone: 7.8.1
Test Case: T7671

Thanks for the report. This is meant to work, but it's broken.

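It looks like what's going wrong is that known_pragma is assuming that len is a number of bytes, but it's actually a number of characters. U+00A0 takes two bytes in UTF-8, so the byte count is one more than the character count, and known_pragma therefore sees -#\160LAN[...] when it expects {-#\160LAN[...].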
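
The mismatch can be reproduced outside the lexer. The following is only a sketch, not the code in Lexer.x: it assumes the buffer holds UTF-8 and that the lexeme is recovered by stepping back from its end, and the names encodeChar, utf8 and misread are hypothetical helpers made up for the illustration.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: UTF-8 encode a single character.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi s   = fromIntegral (n `shiftR` s)
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Hypothetical helper: UTF-8 encode a whole string.
utf8 :: String -> [Word8]
utf8 = concatMap encodeChar

-- Model of the bug (an assumption, not the real lexer code): step back
-- from the end of the lexeme by its length in characters, while
-- treating that length as a number of bytes.
misread :: String -> [Word8]
misread lexeme = drop (length bytes - length lexeme) bytes
  where
    bytes = utf8 lexeme

main :: IO ()
main = do
  let lexeme = "{-#\160LANGUAGE"
  -- Characters vs. bytes: (12,13), because U+00A0 encodes as 0xC2 0xA0.
  print (length lexeme, length (utf8 lexeme))
  -- True: the misread lexeme has lost its leading '{'.
  print (misread lexeme == utf8 "-#\160LANGUAGE")
```

For a pure-ASCII pragma the two lengths agree and misread returns the whole lexeme, which is why the problem only shows up once a multi-byte character is present.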

comment:2 Changed 4 years ago by ian@…

commit c68aac1f2e59d0844a285b757777b950da91a8be

Author: Ian Lynagh <>
Date:   Tue Feb 26 01:27:43 2013 +0000

    Fix parsing of pragmas containing unicode characters; fixes #7671

 compiler/parser/Lexer.x |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

comment:3 Changed 4 years ago by igloo

Resolution: fixed
Status: new → closed

