Opened 3 years ago

Closed 3 years ago

#9987 closed feature request (duplicate)

GHC refuses to compile a file that starts with a Byte Order Mark (BOM)

Reported by: Henk-Jan
Owned by:
Priority: low
Milestone:
Component: Compiler
Version: 7.10.1-rc1
Keywords:
Cc:
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program
Test Case:
Blocked By:
Blocking:
Related Tickets: #6016
Differential Rev(s):
Wiki Page:

Description

Trying to compile a file that starts with a Byte Order Mark (BOM) results in a message like:

Camels.hs:1:1: lexical error at character '\65279'

No compilation is done. Note that, if a file is saved as UTF-8, Notepad adds this BOM to the beginning of the file.
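
The character in the error message, '\65279', is U+FEFF, which UTF-8 encodes as the three bytes EF BB BF. As a possible workaround until the compiler accepts such files, the BOM can be stripped before compiling. The following is a minimal Python sketch, not anything GHC itself does; the function name `strip_utf8_bom` is hypothetical.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present.

    Hypothetical helper for illustration only; not part of GHC.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# U+FEFF is the code point that GHC reports as '\65279'.
assert ord("\ufeff") == 65279
assert "\ufeff".encode("utf-8") == codecs.BOM_UTF8
```

Reading the source file in binary mode, passing its contents through `strip_utf8_bom`, and writing the result back leaves a file without the BOM; a file that has no BOM is returned unchanged.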

Change History (6)

comment:1 Changed 3 years ago by hvr

Type: bug → feature request

This is definitely not a bug on GHC's part, but rather on Notepad's.

BOMs cause many problems when used in UTF-8 and are highly discouraged, so it should come as no surprise that GHC complains about it.

comment:2 Changed 3 years ago by hvr

Priority: high → low

comment:3 Changed 3 years ago by Henk-Jan

When I remove the BOM by saving the file in ANSI encoding (using Notepad), I get the following message from GHC:

Camels.hs:152:56:
    lexical error in string/character literal (UTF-8 decoding error)

This is because of an o-umlaut in the comments. The file can be found at:

https://raw.githubusercontent.com/wxHaskell/wxHaskell/master/samples/contrib/Camels.hs

(Geany states that the file is encoded in CP1252 and displays it correctly.)

comment:4 Changed 3 years ago by hvr

Currently, GHC's lexer assumes its input to be ASCII or UTF-8 (for which a BOM is rather pointless, as a UTF-8 stream doesn't allow for different byte orders).

The CP1252 encoding (the same goes for ISO-8859-1, btw), however, is only compatible with UTF-8 for the lowest 128 code points, i.e. the ASCII range. Anything above that, such as the o-umlaut, is encoded differently and triggers the UTF-8 decoding error.

I believe the usual recommendation is to use Notepad++, which can write UTF-8 without that gratuitous BOM.
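
To illustrate the mismatch: in CP1252 the o-umlaut is the single byte 0xF6, which is not a valid UTF-8 sequence, whereas UTF-8 encodes the same character as the two bytes C3 B6. The Python sketch below shows this and re-encodes CP1252 bytes as UTF-8. It assumes the input really is CP1252, as Geany reports for this file, and the helper name `cp1252_to_utf8` is hypothetical.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Re-encode CP1252 bytes as UTF-8.

    Hypothetical helper; assumes the input really is CP1252.
    """
    return data.decode("cp1252").encode("utf-8")


# The o-umlaut (U+00F6) is a single byte in CP1252.
umlaut_cp1252 = "\u00f6".encode("cp1252")

# That byte on its own cannot be decoded as UTF-8.
try:
    umlaut_cp1252.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

Bytes in the ASCII range pass through `cp1252_to_utf8` unchanged, which is why only the line containing the umlaut is reported.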

comment:5 Changed 3 years ago by hvr

Milestone:

comment:6 Changed 3 years ago by thomie

Resolution: duplicate
Status: new → closed

I am bringing you good news from #6016. A fix for BOMs in Haskell source files will be in 7.10.
