Do not make CAFs from literal strings
Currently (as I discovered in #15038), we get the following code for GHC.Exception.Base.patError:
lvl2_r3y3 :: [Char]
[GblId]
lvl2_r3y3 = unpackCString# lvl1_r3y2
-- RHS size: {terms: 7, types: 6, coercions: 2, joins: 0/0}
patError :: forall a. Addr# -> a
[GblId, Arity=1, Str=<B,U>x, Unf=OtherCon []]
patError
  = \ (@ a_a2kh) (s_a1Pi :: Addr#) ->
      raise#
        @ SomeException
        @ 'LiftedRep
        @ a_a2kh
        (Control.Exception.Base.$fExceptionPatternMatchFail_$ctoException
           ((untangle s_a1Pi lvl2_r3y3)
            `cast` (Sym (Control.Exception.Base.N:PatternMatchFail[0])
                    :: (String :: *) ~R# (PatternMatchFail :: *))))
That stupid lvl2_r3y3 :: String is a CAF, and hence patError has CAF-refs, and hence so does any function that calls patError, and any function that calls them.
That's bad! Lots more CAF entries in SRTs, lots more work traversing those SRTs in the garbage collector. And for what? To share the work of unpacking a C string! This is nuts.
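To make the knock-on effect concrete, here is a toy model of how "has CAF-refs" spreads through a call graph. It is only a sketch and not how GHC actually builds SRTs: the Binding type, the hasCafRefs function and the callers f, g and h are all hypothetical, invented for illustration.

```haskell
import Data.List (sort)

-- Toy model of a top-level binding: its name, whether it is itself a
-- CAF, and the top-level names it refers to directly.
-- (Hypothetical types; not GHC's real representation.)
data Binding = Binding
  { bName  :: String
  , bIsCaf :: Bool
  , bRefs  :: [String]
  }

-- A binding "has CAF-refs" if it is a CAF or refers, directly or
-- transitively, to one. Computed as a simple fixed point.
hasCafRefs :: [Binding] -> [String]
hasCafRefs bs = go [bName b | b <- bs, bIsCaf b]
  where
    go known =
      let new = [ bName b
                | b <- bs
                , bName b `notElem` known
                , any (`elem` known) (bRefs b) ]
      in if null new then sort known else go (known ++ new)

-- lvl2 is the floated-out string; f, g and h are made-up callers.
program :: [Binding]
program =
  [ Binding "lvl2"     True  []
  , Binding "patError" False ["lvl2"]
  , Binding "f"        False ["patError"]
  , Binding "g"        False ["f"]
  , Binding "h"        False []
  ]

-- The same program if lvl2 were not counted as a CAF.
programNoCaf :: [Binding]
programNoCaf = [ b { bIsCaf = False } | b <- program ]

main :: IO ()
main = do
  print (hasCafRefs program)       -- ["f","g","lvl2","patError"]
  print (hasCafRefs programNoCaf)  -- []
```

In this model a single CAF taints patError and everything upstream of it, while the unrelated h stays clean. If lvl2 is not counted as a CAF, nothing is tainted.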
What to do?
- Somehow refrain from floating unpackCString# lit to top level, even if you could otherwise do so. But that seems very ad hoc, and it makes the function bigger and less inlinable.
- Treat a top-level definition
  x :: [Char]
  x = unpackCString# y
  as NOT a CAF, and make it single-entry so that the thunk is not updated. Then every use of x will unpack the string afresh, which is probably a good idea anyhow. I like this more. It would be implemented somewhere in the code generator.
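The difference between sharing the unpacked string and unpacking it afresh can be sketched at the source level. This is only an approximation: the real change would live in the code generator, and the optimiser is free to transform either definition. The names sharedMsg and unpackFresh and the literal are hypothetical, chosen for illustration.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Addr#, unpackCString#)

-- Same shape as lvl2_r3y3 in the Core above: a top-level binding whose
-- right-hand side is unpackCString# applied to a literal. The unpacked
-- list is computed once and then shared.
sharedMsg :: String
sharedMsg = unpackCString# "Non-exhaustive patterns"#

-- Hypothetical stand-in for "unpack afresh on every use": the string is
-- a function of the literal's address, so each call runs the unpacking
-- again instead of reading an updated thunk.
unpackFresh :: Addr# -> String
unpackFresh lit = unpackCString# lit
{-# NOINLINE unpackFresh #-}

main :: IO ()
main = do
  putStrLn sharedMsg
  putStrLn (unpackFresh "Non-exhaustive patterns"#)
  -- Both routes denote the same string; only the sharing differs.
  print (sharedMsg == unpackFresh "Non-exhaustive patterns"#)
```

Both versions produce the same value. What the second proposal would trade away is only the one-time sharing of the unpacked list.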