Opened 14 years ago

Closed 14 years ago

Last modified 47 years ago

#81 closed bug (Fixed)

Unicode bug in toUpper/toLower

Reported by: norpan
Owned by: nobody
Priority: normal
Milestone:
Component: Prelude
Version: 5.04
Keywords:
Cc:
Operating System:
Architecture:
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:


Since GHC now does full Unicode, toUpper ought to be
full Unicode also. According to the Haskell 98 Library
Report, section 9:

Function toUpper converts a letter to the corresponding
upper-case letter, leaving any other character
unchanged. Any Unicode letter which has an upper-case
equivalent is transformed.

I take as my example the character ÿ (which is the only
one I can write in ISO-8859-1, by the way).

toUpper 'ÿ' ought to be Unicode code point 0178 (hexadecimal),
but it is not:

toUpper 'ÿ' gives 'ÿ'
toLower (toEnum 0x178) gives toEnum 0x178

I understand that this may cause more trouble than it's
worth, but either the report needs to be rewritten or
the implementation changed.
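
The expectation above can be checked with a short program. This is only a sketch: the names smallYDiaeresis, capitalYDiaeresis, codePoint and checks are made up for illustration and do not come from the report or from GHC. It relies solely on Data.Char and Numeric from base, and uses numeric code points so that the result does not depend on the source file encoding.

```haskell
import Data.Char (chr, ord, toLower, toUpper)
import Numeric (showHex)

-- The two code points from the report (hypothetical names):
-- U+00FF is 'ÿ', U+0178 is its upper-case counterpart.
smallYDiaeresis, capitalYDiaeresis :: Char
smallYDiaeresis   = chr 0xFF
capitalYDiaeresis = chr 0x178

-- Render a character as its code point in hexadecimal.
codePoint :: Char -> String
codePoint c = "U+" ++ showHex (ord c) ""

-- Each entry pairs a description with whether the mapping behaves
-- the way the Haskell 98 Library Report wording implies.
checks :: [(String, Bool)]
checks =
  [ ("toUpper U+FF == U+178", toUpper smallYDiaeresis == capitalYDiaeresis)
  , ("toLower U+178 == U+FF", toLower capitalYDiaeresis == smallYDiaeresis)
  , ("toUpper leaves a digit unchanged", toUpper '1' == '1')
  ]

main :: IO ()
main = do
  putStrLn ("toUpper U+FF  = " ++ codePoint (toUpper smallYDiaeresis))
  putStrLn ("toLower U+178 = " ++ codePoint (toLower capitalYDiaeresis))
  mapM_ (\(name, ok) -> putStrLn (name ++ ": " ++ show ok)) checks
```

On an implementation showing the behaviour reported here, the first two checks would come out False; with full Unicode case mapping in place, all three should be True.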

Change History (3)

comment:1 Changed 14 years ago by simonmar

GHC doesn't support Unicode in any real sense.  The Char 
type is 32 bits, but the rest of the system really only works 
with the ISO 8859 character set.

Nonetheless, I'll leave the bug report here as a reminder :-)

comment:2 Changed 14 years ago by simonmar

This is now fixed in the HEAD.

comment:3 Changed 14 years ago by simonmar

Status: assigned → closed