On the other hand, there is GHC.Unicode.isPrint, the predicate for
printable Unicode characters, which calls the foreign function
u_iswprint to obtain this information.
One kind of application that this change would break is one where a customized Show instance of a type is controlled by other fields of that type. For example, the following code simulates a press code that respects the privacy of people under the age of 20.
data Sex = Male | Female
data Person = Person {name :: String, age :: Int, sex :: Sex}
instance Show Person where
  show (Person _ a Male)   | a < 20 = "A boy (" ++ show a ++ ")"
  show (Person _ a Female) | a < 20 = "A girl (" ++ show a ++ ")"
  show (Person n a _)      = n

assert $ show (Person "村主崇行" 19 Male) == "A boy (19)"
assert $ show (Person "村主崇行" 20 Male) == "\26449\20027\23815\34892"
I very much look forward to learning about other drawbacks of this change.
Absolutely any code in the entire world that relies on the current behavior will break. The current behavior is expressed in the reference implementation in the Haskell 2010 report. Frankly, changing it is not an option. You can write your own function to unescape valid Unicode. You can also write your own UShow class if you like with a method for showing various things using Unicode generally. You can then try to convince other developers to depend on your package and write instances of your class.
16.6 String representations
showLitChar :: Char -> ShowS
Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:
    showLitChar '\n' s = "\\n" ++ s
You can put something like this in your .ghci file:
:seti -XScopedTypeVariables
:{
let myShow :: Show a => a -> String
    myShow x = go (show x) where
      go :: String -> String
      go [] = []
      go s@(x:xs) = case x of
          '\"' -> '\"' : str ++ "\"" ++ go rest
          '\'' -> '\'' : char : '\'' : go rest'
          _    -> x : go xs
        where
          (str :: String, rest):_ = reads s
          (char :: Char, rest'):_ = reads s
:}
:{
let myPrint :: Show a => a -> IO ()
    myPrint = putStrLn . myShow
:}
:set -interactive-print=myPrint
Dear thomie, thank you for your comment. Yes, -interactive-print is a great feature! I regret that I did not find out that it has been available for years.
Several customized show functions have also been proposed, like myShow here. However, when I used them in some detail, I found that printing in Unicode has many corner cases that are more difficult than they seem. As far as I have searched, I cannot find a Unicode-printing function that satisfies read . unicode_show == id for sufficiently many types. For example, see
https://gist.github.com/nushio3/4a10f3c0092295696daf
(+1) to the suggestion to change the default interactive printer to display Unicode characters nicely.
The algorithm in unicode-show might be suitable for the purpose, although opinions will differ on what the "nice way to print Unicode" is.
By the way, if we update the default interactive printer, will we break the doctests that show values containing Unicode characters, forcing their authors to update the expected results from the interpreter?
I would love for something like ticket:11529#comment:115501 to become the default in ghci. It could even be simpler/stupider and just replace any sequence like \12345 with the corresponding Unicode character wherever it appears. I mean, when would you ever have such a string in the output of show, short of a weird custom Show instance? And it would be more robust to other weird custom Show instances that use quotes in an unbalanced fashion.
I don't think we should replace \n or \ESC or especially \\ though. Just printable Unicode characters outside the ASCII range, probably. And we could decline to do the replacement if the replacement character can't be encoded in the user's locale.
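The "simpler/stupider" replacement described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the name unescapeUnicode is hypothetical, the locale check is left out, and the sketch additionally skips over \\ pairs and drops a \& gap after a replaced escape, which the comment does not mention.

```haskell
import Data.Char (chr, isDigit, isPrint)

-- | Hypothetical helper: rewrite decimal escapes such as \26449 in the
-- output of 'show' into the character they denote, but only when that
-- character is printable and outside the ASCII range. Escapes such as
-- \n, \ESC, \DEL and \\ are left untouched.
unescapeUnicode :: String -> String
unescapeUnicode [] = []
-- An escaped backslash: copy both characters so that the digits
-- following it are not mistaken for a numeric escape.
unescapeUnicode ('\\' : '\\' : rest) = '\\' : '\\' : unescapeUnicode rest
unescapeUnicode ('\\' : rest)
  | not (null digits)
  , n <= 0x10FFFF
  , let c = chr (fromInteger n)
  , c > '\DEL'
  , isPrint c
  = c : unescapeUnicode (dropGap rest')
  where
    (digits, rest') = span isDigit rest
    n = read digits :: Integer
    -- 'show' emits \& after a numeric escape that is followed by a
    -- digit; once the escape is gone the gap is no longer needed.
    dropGap ('\\' : '&' : xs) = xs
    dropGap xs = xs
-- Anything else (including escapes we decline to replace) is copied.
unescapeUnicode (c : rest) = c : unescapeUnicode rest
```

It could then be plugged into ghci in the same way as myPrint, for example with putStrLn . unescapeUnicode . show as the interactive printer.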
One drawback is that the user's font might not contain the Unicode characters in question, like mine does not contain \12345. So there should probably be an option to disable these replacements.
Absolutely any code in the entire world that relies on the current behavior will break. The current behavior is expressed in the reference implementation in the Haskell 2010 report. Frankly, changing it is not an option. You can write your own function to unescape valid Unicode. You can also write your own UShow class if you like with a method for showing various things using Unicode generally. You can then try to convince other developers to depend on your package and write instances of your class.
I disagree. I think the current implementation is actually wrong and does not adhere to the standard. The standard states in 16.6 that showLitChar be defined as follows:
Convert a character to a string using only printable characters, using Haskell source-language escape conventions.
However, the current implementation of showLitChar fails to use isPrint; instead it uses a naive condition, c > '\DEL', to determine printability. This is wrong.
The solution is simple: replace the condition c > '\DEL' with not (isPrint c) in the definition of showLitChar.
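To make the proposal concrete, here is a minimal sketch written outside of base rather than as a patch to it. The names showLitCharU and showStringU are hypothetical. The sketch also keeps the c > '\DEL' test next to isPrint, so that ASCII escapes such as \n retain their named forms; that combination is my assumption, since the comment only names isPrint.

```haskell
import Data.Char (isPrint, showLitChar)

-- | Hypothetical variant of 'showLitChar': printable characters above
-- '\DEL' are emitted as they are; everything else is delegated to the
-- existing 'showLitChar', so \n, \DEL, \\ and non-printable code points
-- are escaped exactly as before.
showLitCharU :: Char -> ShowS
showLitCharU c
  | c > '\DEL' && isPrint c = showChar c
  | otherwise               = showLitChar c

-- | Hypothetical counterpart of 'show' for String built on the variant
-- above. A double quote inside the string still has to be escaped.
showStringU :: String -> String
showStringU s = '"' : foldr step "\"" s
  where
    step '"' = showString "\\\""
    step c   = showLitCharU c
```

For example, showStringU "村主崇行\n" keeps the four ideographs and still escapes the newline as \n, and the result can be fed back to read.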
isPrint does not answer the question "can this character be displayed by the current user given their current locale?". That would require it to be in IO, and would limit the ability to use it in other contexts.
isPrint answers the question "is the Unicode codepoint contained in the given Char considered printable by the version of the Unicode standard to which the runtime conforms?".
It is not the correct question to ask here.
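As a small illustration of the distinction drawn here: isPrint is a pure function of the code point, so its answer cannot depend on the user's locale or font. The binding name printableExamples below is hypothetical and only serves the example.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isPrint)

-- | For a few sample characters, pair each one with the verdict of
-- 'isPrint' and its Unicode general category. No IO is involved, so
-- the result is the same on every terminal.
printableExamples :: [(Char, Bool, GeneralCategory)]
printableExamples =
  [ (c, isPrint c, generalCategory c)
  | c <- ['\26449', '\n', '\DEL', 'a']
  ]
```

Here the ideograph '\26449' and 'a' are reported as printable, while '\n' and '\DEL' are not, regardless of whether the terminal can actually render the ideograph.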
It is, however, what the standard prescribes. IMHO it is also the right thing to do as it leads to less unexpected behaviour than the current implementation.