Opened 3 years ago

Last modified 3 weeks ago

#6037 merge bug

Compile-time crash with sources with non-representable unicode characters

Reported by: akio
Owned by: snoyberg
Priority: normal
Milestone: 7.10.3
Component: Compiler
Version: 7.4.1
Keywords:
Cc: pho@…, dagitj@…, hackage.haskell.org@…, RyanGlScott
Operating System: Linux
Architecture: Unknown/Multiple
Type of failure: Compile-time crash
Test Case: T6037
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

The following file causes GHC to crash when compiled in the "C" locale.

$ LC_ALL=C ghc unicode.hs
[1 of 1] Compiling Foo              ( unicode.hs, unicode.o )

unicode.hs:2:1:
    Warning: Pattern match(es) are overlapped
             In an equation for `<stderr>: hPutChar: invalid argument (invalid character)

unicode.hs:

module Foo where
δ x = 3
δ x = 4

Change History (13)

comment:1 Changed 3 years ago by simonmar

  • difficulty set to Unknown
  • Milestone set to 7.6.1

I suppose we should be using an encoding that does character translation rather than failing for a Unicode character that is not supported by the current locale.

comment:2 Changed 3 years ago by PHO

  • Cc pho@… added

comment:3 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:4 Changed 3 years ago by igloo

  • Milestone changed from 7.6.2 to 7.8.1
  • Test Case set to T6037

I think that we want to do something along the lines of:

enc <- mkIconvEncoding TransliterateCodingFailure localeEncodingName
hSetEncoding stdout enc
hSetEncoding stderr enc

but that means using internal modules, doing different things on different platforms, etc.

Instead, I think we should be able to just do:

hSetEncodingFailureMode stdout TransliterateCodingFailure
hSetEncodingFailureMode stderr TransliterateCodingFailure

I've added a test.
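
The `hSetEncodingFailureMode` call above is written as a proposal. The following is a minimal sketch of the same idea using only `System.IO`, under the assumption that no such function is available there. It relies on the `//TRANSLIT` suffix that `mkTextEncoding` accepts; the helper name `enableTransliteration` is hypothetical, and this is not necessarily how the eventual patch does it.

```haskell
import System.IO

-- | Hypothetical helper: re-create the handle's current encoding with the
-- "//TRANSLIT" suffix, so that characters the encoding cannot represent
-- are replaced on output instead of raising an error.
enableTransliteration :: Handle -> IO ()
enableTransliteration h = do
  menc <- hGetEncoding h
  case fmap show menc of
    -- 'show' on a TextEncoding yields its name, e.g. "UTF-8".
    -- Skip names that already carry a suffix such as "//IGNORE".
    Just name | '/' `notElem` name -> do
      enc <- mkTextEncoding (name ++ "//TRANSLIT")
      hSetEncoding h enc
    -- Binary-mode handle (Nothing) or already-suffixed name: leave as is.
    _ -> return ()
```

Calling `enableTransliteration stdout` and `enableTransliteration stderr` early in `main` would cover both handles mentioned in this comment.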

comment:5 Changed 3 years ago by dagit

  • Cc dagitj@… added

comment:6 Changed 2 years ago by liyang

  • Cc hackage.haskell.org@… added

comment:7 Changed 17 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.10.1

Moving to 7.10.1.

comment:8 Changed 9 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:9 Changed 3 months ago by RyanGlScott

  • Cc RyanGlScott added

comment:10 Changed 3 weeks ago by snoyberg

  • Owner set to snoyberg

comment:11 Changed 3 weeks ago by snoyberg

I've sent https://phabricator.haskell.org/D1153, which turns on transliteration as mentioned above. I'm not intimately familiar with the Handle encoding API, so there may be a better way to do this.

comment:12 Changed 3 weeks ago by Ben Gamari <bgamari.foss@…>

In 22aca536/ghc:

Transliterate unknown characters at output

This prevents the compiler from crashing when, for example, a warning
contains a non-Latin identifier and the LANG variable is set to C.
Fixes #6037.

Test Plan:
Create a Haskell source file containing an identifier with non-Latin
characters and no type signature. Compile with `LANG=C ghc -Wall
foo.hs`, and it should fail. With this patch, it will succeed.

Reviewers: austin, rwbarton, bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1153

GHC Trac Issues: #6037, #10762

comment:13 Changed 3 weeks ago by bgamari

  • Milestone changed from 7.12.1 to 7.10.3
  • Status changed from new to merge

Seems like this should go into a potential 7.10.3 if one happens.
