Opened 12 years ago

Closed 11 years ago

Last modified 44 years ago

#81 closed bug (Fixed)

Unicode bug in toUpper/toLower

Reported by: norpan
Owned by: nobody
Priority: normal
Milestone:
Component: Prelude
Version: 5.04
Keywords:
Cc:
Operating System:
Architecture:
Type of failure:
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

Since GHC now supports full Unicode, toUpper and toLower ought
to handle full Unicode as well. According to the Haskell 98
Library Report, section 9:

Function toUpper converts a letter to the corresponding
upper-case letter, leaving any other character
unchanged. Any Unicode letter which has an upper-case
equivalent is transformed.

I take as my example the character ÿ (U+00FF), which is the
only one I can write in ISO-8859-1, by the way.

toUpper 'ÿ' ought to be Ÿ (Unicode 0178, hexadecimal), but it
is not:

toUpper 'ÿ' gives 'ÿ'
toLower (toEnum 0x178) gives toEnum 0x178

I understand that this may cause more trouble than it's
worth, but either the report needs to be rewritten or
the implementation changed.
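
The expected behaviour can be written down as a small test case. This is a minimal sketch, assuming a GHC whose Data.Char implements the Unicode case mappings described by the Report; the names smallY, capitalY and checks are illustrative and not part of the original report.

```haskell
module Main (main) where

import Data.Char (chr, toLower, toUpper)

-- U+00FF LATIN SMALL LETTER Y WITH DIAERESIS, the character from the report.
smallY :: Char
smallY = chr 0xFF

-- U+0178 LATIN CAPITAL LETTER Y WITH DIAERESIS, its upper-case equivalent.
capitalY :: Char
capitalY = chr 0x178

-- Each check pairs a description with whether the Report's behaviour holds.
checks :: [(String, Bool)]
checks =
  [ ("toUpper U+00FF == U+0178", toUpper smallY == capitalY)
  , ("toLower U+0178 == U+00FF", toLower capitalY == smallY)
  , ("toUpper leaves a non-letter unchanged", toUpper '1' == '1')
  ]

-- Print one line per check so a failing mapping is easy to spot.
main :: IO ()
main = mapM_ report checks
  where
    report (label, ok) =
      putStrLn (label ++ ": " ++ (if ok then "ok" else "FAILED"))
```

On an implementation with the behaviour reported in this ticket, the first two checks would be expected to fail, since both functions return their argument unchanged for these characters.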

Change History (3)

comment:1 Changed 12 years ago by simonmar

GHC doesn't support Unicode in any real sense.  The Char 
type is 32 bits, but the rest of the system really only works 
with the ISO 8859 character set.

Nonetheless, I'll leave the bug report here as a reminder :-)

comment:2 Changed 11 years ago by simonmar

This is now fixed in the HEAD.

comment:3 Changed 11 years ago by simonmar

  • Status changed from assigned to closed