Changes between Initial Version and Version 3 of Ticket #1079


Timestamp:
Jul 3, 2007 4:19:16 PM (7 years ago)
Author:
Isaac Dupree
Comment:

This reminds me of a case like "\213\23\231" (= '\213' : '\23' : '\231' : [] according to the Report), where GHC treated several such escapes as a single Unicode character. We should probably say explicitly somewhere that a String is effectively UTF-32 (each Char in the list is one Unicode code point), and make that hold for all the standard functions.
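To illustrate the Report's reading of that literal, here is a minimal sketch that lists the code points it contains. The names `codePoints` and `example` are hypothetical and introduced only for this illustration.

```haskell
import Data.Char (ord)

-- Hypothetical helper: the code point of every Char in a String.
codePoints :: String -> [Int]
codePoints = map ord

-- Each decimal escape denotes exactly one Char under the Report's
-- lexical rules, so this literal is a three-element list.
example :: String
example = "\213\23\231"

-- codePoints example == [213, 23, 231]
```

If a compiler merged these escapes into one character, `length example` would come out smaller than 3.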

Even if we assume that standard I/O uses UTF-8 (it has to, for ASCII compatibility), if String is in practice also used for binary data (is it?), the only compatible way might be to bring in a new I/O library, as Bulat says. Personally, I would like the Prelude input and output functions to use UTF-8 as the external format.
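As a rough sketch of what "UTF-8 as the external format" would mean for a String of code points, the following encodes each Char into UTF-8 octets. The names `encodeCharUtf8` and `encodeStringUtf8` are hypothetical, not part of any library discussed in this ticket, and the sketch does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one code point as UTF-8 octets.
-- Assumption: surrogates (U+D800..U+DFFF) are not filtered out here.
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: length marker plus the top bits of the code point.
    lead marker s = fromIntegral (marker .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Hypothetical helper: encode a whole String.
encodeStringUtf8 :: String -> [Word8]
encodeStringUtf8 = concatMap encodeCharUtf8

-- encodeCharUtf8 '\x3042' == [0xE3, 0x81, 0x82]
```

ASCII characters map to a single identical octet, which is the compatibility property mentioned above.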

  • Ticket #1079

    • Property Cc Bulat.Ziganshin@… id@… added
    • Property Milestone changed from to 6.8
  • Ticket #1079 – Description

    initial  v3
    5        5   main = putStrLn "あ"
    6        6   }}}
    7            but we only get `B', the low 8 bits of the character `あ' (U+3042).  Because of this limitation, we cannot print any non-ASCII characters without conversion when writing Haskell code in UTF-8.  Although it is easy to write conversion functions for this purpose, such conversion should be supported by the compiler.
             7   but we only get `B`, the low 8 bits of the character `あ` (U+3042).  Because of this limitation, we cannot print any non-ASCII characters without conversion when writing Haskell code in UTF-8.  Although it is easy to write conversion functions for this purpose, such conversion should be supported by the compiler.
    8        8
    9        9   IMHO, the desired approach is similar to Hugs.  When printing non-ASCII characters, Hugs first converts the characters to UTF-8 octets and then prints them.  With a binary-mode Handle, however, it just prints the characters without conversion.  This behavior should be acceptable to many Haskell programmers.
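The truncation reported in the description can be reproduced in pure code. This is only a sketch of the arithmetic, not of GHC's actual I/O path; the name `truncateTo8Bits` is hypothetical.

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)

-- Hypothetical helper: keep only the low 8 bits of a code point,
-- mimicking the output behaviour the ticket describes.
truncateTo8Bits :: Char -> Char
truncateTo8Bits c = chr (ord c .&. 0xFF)

-- U+3042 is 0x3042; its low byte 0x42 is the ASCII letter 'B'.
-- truncateTo8Bits '\x3042' == 'B'
```

By contrast, the UTF-8 form of U+3042 is the three octets E3 81 82, which is what the Hugs-style approach would write.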