Opened 2 years ago

Last modified 5 months ago

#7593 new bug

Unable to print exceptions of unicode identifiers

Reported by: dagit Owned by:
Priority: normal Milestone: 7.12.1
Component: Compiler Version: 7.6.1
Keywords: Cc:
Operating System: Windows Architecture: Unknown/Multiple
Type of failure: Incorrect warning at compile-time Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

I suspect this is Windows-specific, but I'm not certain (I couldn't reproduce it with an older ghc on OS X).

Here is an example of the problem:

$ ghcii.sh
GHCi, version 7.6.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude λ> ⇒

<interactive>:2:1: parse error on input `*** Exception: <stderr>: hPutChar: invalid argument (invalid character)

On other platforms the message looks more like: parse error on input `=>'

(note the automatic translation from ⇒ to =>; I think that's a separate bug related to the unicode symbols extension)

On Windows my ghc is version 7.6.1. On OS X (where I could NOT reproduce the exception), my ghc is 7.4.1. I don't know if the GHC version matters. I strongly suspect this is a Windows unicode issue.

GHC gives similar behavior to ghci, for example on this input:

⇒ = 1

ghc tries to give an error message but throws an exception printing ⇒.

Change History (8)

comment:1 Changed 2 years ago by dagit

Changing the codepage to 65001 before compiling the file gets rid of the exception, but the error message still looks a bit off:

Foo.hs:1:1: parse error on input `⇒��'

comment:2 Changed 2 years ago by igloo

  • difficulty set to Unknown
  • Milestone set to 7.8.1

Thanks for the report. We'll take a look.

comment:3 Changed 20 months ago by nomeata

Just FTR: It works with 7.6.1 on Linux, so the Windows attribute seems to be correct.

comment:4 Changed 16 months ago by huzhe

I believe this is the same issue as this:

Prelude> putStrLn "я"
*** Exception: <stdout>: hPutChar: invalid argument (invalid character)
Prelude>

This doesn't happen with Latin-1 characters like "è" on my machine (codepage dependency?)

comment:5 Changed 13 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.10.1

Moving to 7.10.1

comment:6 Changed 10 months ago by RyanGlScott

This issue affects both Win32 consoles (e.g., cmd.exe) and mintty consoles (e.g., Cygwin and MSYS), but in different ways.

In mintty consoles, the issue is easily fixed by setting the encoding of the file handle to utf8:

> putStrLn "→"
*** Exception: <stdout>: hPutChar: invalid argument (invalid character)
> import System.IO
> hSetEncoding stdout utf8
> putStrLn "→"
→
> getChar >>= putChar
→
Γ> getChar >>= putChar
å> getChar >>= putChar
Æ> getChar >>= putChar
> hSetEncoding stdin utf8
> getChar >>= putChar
→
→> 
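
The same workaround can be put at the top of a compiled program instead of being typed into GHCi. The following is a minimal sketch, assuming a mintty-style terminal that already understands UTF-8; the helper name setStdHandlesUtf8 is hypothetical, and stderr is included because the exception in the original report was raised on that handle.

```haskell
import System.IO

-- Hypothetical helper (not part of base): switch all three standard
-- handles to UTF-8 so that characters outside the console codepage
-- no longer make hPutChar fail with "invalid character".
setStdHandlesUtf8 :: IO ()
setStdHandlesUtf8 = mapM_ (`hSetEncoding` utf8) [stdin, stdout, stderr]

main :: IO ()
main = do
  setStdHandlesUtf8
  putStrLn "→"          -- the character that raised the exception above
  hPutStrLn stderr "⇒"  -- stderr is covered as well
```

This only changes how GHC's I/O layer encodes and decodes characters; whether they display correctly still depends on the terminal, which is why it is not enough on Win32 consoles.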

Fixing the issue on Win32 consoles is not as easy. One approach is to use the FFI to call the native API calls for Unicode output (WriteConsoleW) and input (ReadConsoleInputW). This is the approach that haskeline takes, and it seems to work well:

> import System.Console.Haskeline
> import Data.Maybe
> runInputT defaultSettings (getInputChar "") >>= runInputT defaultSettings . outputStrLn . (:[]) . fromJust
→
→

comment:7 Changed 5 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:8 Changed 5 months ago by jeremy-list

The workaround I've been using is to add "//TRANSLIT" to the encoding of stdin, stdout, and stderr. I believe this should be the default rather than something I have to specify in my program.
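
For anyone who wants to try this workaround, here is a rough sketch of one way to do it, not necessarily the exact code used above. It relies on mkTextEncoding from System.IO, which accepts an encoding name with a "//TRANSLIT" suffix so that unrepresentable characters are replaced instead of raising an exception. The helper names translitName and withTranslit are hypothetical.

```haskell
import System.IO

-- Hypothetical helper: drop any existing "//..." suffix from an
-- encoding name and append "//TRANSLIT".
translitName :: String -> String
translitName name = takeWhile (/= '/') name ++ "//TRANSLIT"

-- Hypothetical helper: keep the handle's current encoding, but make
-- it substitute a replacement for characters it cannot represent
-- instead of throwing.
withTranslit :: Handle -> IO ()
withTranslit h = do
  menc <- hGetEncoding h
  case menc of
    Nothing  -> return ()  -- binary-mode handle, nothing to change
    Just enc -> do
      enc' <- mkTextEncoding (translitName (show enc))
      hSetEncoding h enc'

main :: IO ()
main = do
  mapM_ withTranslit [stdin, stdout, stderr]
  putStrLn "⇒ = 1"  -- no longer throws if ⇒ is not in the codepage
```

The sketch keeps whatever encoding the handle already had (the console codepage on Windows), so characters the codepage can represent are unaffected and only the unrepresentable ones are substituted.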

Last edited 5 months ago by jeremy-list