Opened 4 years ago

Closed 2 months ago

Last modified 2 months ago

#6037 closed bug (fixed)

Compile-time crash with sources with non-representable unicode characters

Reported by: akio
Owned by: snoyberg
Priority: normal
Milestone: 7.10.3
Component: Compiler
Version: 7.4.1
Keywords:
Cc: pho@…, dagitj@…,…, RyanGlScott
Operating System: Linux
Architecture: Unknown/Multiple
Type of failure: Compile-time crash
Test Case: T6037
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:


The file unicode.hs (shown after the output) causes GHC to crash when compiled in the "C" locale:

$ LC_ALL=C ghc unicode.hs
[1 of 1] Compiling Foo              ( unicode.hs, unicode.o )

    Warning: Pattern match(es) are overlapped
             In an equation for `<stderr>: hPutChar: invalid argument (invalid character)


module Foo where
δ x = 3
δ x = 4
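
The same error can be provoked outside the compiler, which may help when experimenting with a fix. The following is a minimal sketch, not GHC's own code: it assumes a platform where `mkTextEncoding` accepts the name "ASCII" (for example Linux with iconv), and the helper name `writeWith` and the file path argument are hypothetical.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- | Hypothetical helper: write a string to a file through a handle that uses
-- the named encoding, returning any IOError instead of letting it propagate.
writeWith :: String -> FilePath -> String -> IO (Either IOException ())
writeWith encName path str =
  try $ withFile path WriteMode $ \h -> do
    enc <- mkTextEncoding encName  -- no suffix: unencodable characters raise an error
    hSetEncoding h enc
    hPutStr h str
```

Under these assumptions, `writeWith "ASCII" "out.txt" "δ"` is expected to return a `Left` carrying an invalid-character error, while a pure-ASCII string should return `Right ()`.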

Change History (16)

comment:1 Changed 4 years ago by simonmar

  • difficulty set to Unknown
  • Milestone set to 7.6.1

I suppose we should be using an encoding that does character translation rather than failing for a Unicode character that is not supported by the current locale.

comment:2 Changed 4 years ago by PHO

  • Cc pho@… added

comment:3 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:4 Changed 3 years ago by igloo

  • Milestone changed from 7.6.2 to 7.8.1
  • Test Case set to T6037

I think that we want to do something along the lines of:

enc <- mkIconvEncoding TransliterateCodingFailure localeEncodingName
hSetEncoding stdout enc
hSetEncoding stderr enc

but that means using internal modules, doing different things on different platforms, etc.

Instead: I think we should be able to just do:

hSetEncodingFailureMode stdout TransliterateCodingFailure
hSetEncodingFailureMode stderr TransliterateCodingFailure

I've added a test.
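
For readers who want to try the idea with public `System.IO` functions only, one possible sketch is to rebuild the handle's encoding with the `//TRANSLIT` suffix that `mkTextEncoding` understands. This is an illustration under assumptions, not necessarily the change that was committed: the helper name `setTranslit` is hypothetical, and it relies on `show` of a `TextEncoding` yielding the encoding name.

```haskell
import System.IO

-- | Hypothetical helper: re-set a handle's encoding so that characters the
-- encoding cannot represent are replaced instead of raising an IOError.
setTranslit :: Handle -> IO ()
setTranslit h = do
  menc <- hGetEncoding h
  case fmap show menc of
    -- Only add the suffix when the name carries no failure-mode suffix yet.
    Just name | '/' `notElem` name -> do
      enc <- mkTextEncoding (name ++ "//TRANSLIT")
      hSetEncoding h enc
    -- Binary handles (no encoding) and already-suffixed encodings are left alone.
    _ -> return ()
```

Calling `setTranslit stdout` and `setTranslit stderr` early in `main` would apply this to both output streams while keeping whatever encoding the locale selected.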

comment:5 Changed 3 years ago by dagit

  • Cc dagitj@… added

comment:6 Changed 3 years ago by liyang

  • Cc… added

comment:7 Changed 19 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.10.1

Moving to 7.10.1.

comment:8 Changed 11 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:9 Changed 6 months ago by RyanGlScott

  • Cc RyanGlScott added

comment:10 Changed 3 months ago by snoyberg

  • Owner set to snoyberg

comment:11 Changed 3 months ago by snoyberg

I've sent a patch which turns on transliteration as mentioned above. I'm not intimately familiar with the Handle encoding API, so there may be a better way to do this.

comment:12 Changed 3 months ago by Ben Gamari <bgamari.foss@…>

In 22aca536/ghc:

Transliterate unknown characters at output

This prevents the compiler from crashing when, for example, a warning
contains a non-Latin identifier and the LANG variable is set to C.
Fixes #6037.

Test Plan:
Create a Haskell source file containing an identifier with non-Latin
characters and no type signature. Compile with `LANG=C ghc -Wall
foo.hs`, and it should fail. With this patch, it will succeed.

Reviewers: austin, rwbarton, bgamari

Subscribers: thomie

Differential Revision:

GHC Trac Issues: #6037, #10762

comment:13 Changed 3 months ago by bgamari

  • Milestone changed from 7.12.1 to 7.10.3
  • Status changed from new to merge

Seems like this should go into a potential 7.10.3 if one happens.

comment:14 Changed 3 months ago by Thomas Miedema <thomasmiedema@…>

In c8d438f/ghc:

Testsuite: mark T6037 expect_fail on Windows (#6037)

comment:15 Changed 2 months ago by bgamari

  • Resolution set to fixed
  • Status changed from merge to closed

This was merged to ghc-7.10 as bbd6730f64a47d6fd4c831b78a3bbcd7a929ce4a.

comment:16 Changed 2 months ago by bgamari

The testsuite update was merged to ghc-7.10 as a399888.
