Opened 5 years ago

Last modified 2 months ago

#7307 new bug

Share top-level code for strings

Reported by: simonpj Owned by: parcs
Priority: normal Milestone:
Component: Compiler Version: 7.6.1
Keywords: strings Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

A string constant in GHC turns into

foo :: String
foo = unpackCString# "the-string'

This is a top-level thunk, and it expands into rather a lot of code like this

.text
	.align 4,0x90
	.long	0
	.long	22
.globl _Foo_zdfTypeableTzuds1_info
_Foo_zdfTypeableTzuds1_info:
.LcvI:
	movl %esi,%eax
	leal -12(%ebp),%ecx
	cmpl 84(%ebx),%ecx
	jb .LcvQ
	addl $8,%edi
	cmpl 92(%ebx),%edi
	ja .LcvS
	movl $_stg_CAF_BLACKHOLE_info,-4(%edi)
	movl 100(%ebx),%ecx
	movl %ecx,0(%edi)
	leal -4(%edi),%ecx
	pushl %ecx
	pushl %eax
	pushl %ebx
	movl %eax,76(%esp)
	call _newCAF
	addl $12,%esp
	testl %eax,%eax
	je .LcvL
	movl $_stg_bh_upd_frame_info,-8(%ebp)
	leal -4(%edi),%eax
	movl %eax,-4(%ebp)
	movl $_cvJ_str,-12(%ebp)
	addl $-12,%ebp
	jmp _ghczmprim_GHCziCString_unpackCStringzh_info
.LcvL:
	movl 64(%esp),%eax
	jmp *(%eax)
.LcvS:
	movl $8,116(%ebx)
.LcvQ:
	movl %eax,%esi
	jmp *-12(%ebx)

That's rather a lot of goop for one thunk! Of course we can share this, by making a 2-word thunk like this:

------------------------------
| TopUnpack_info  |   -------|-----> "the-string"#
------------------------------

where TopUnpack_info is a shared RTS info-table and code that embodies the code fragment above.

This would save useless code bloat for every constant string. (This came up when looking at the code generated by deriving(Typeable).)

Attachments (1)

0001-Share-top-level-code-for-strings.patch (11.4 KB) - added by parcs 4 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 5 years ago by igloo

Milestone: 7.8.1

comment:2 Changed 4 years ago by parcs

Owner: set to parcs

I will try to do this.

Changed 4 years ago by parcs

comment:3 Changed 4 years ago by parcs

Here's an experimental patch that achieves what's described in the ticket. It creates a new closure type, StgTopUnpack along with a new info table stg_TOP_UNPACK_info. All top-level strings are emitted as 4-word thunks (all thunks need to have the same closure representation), with the address of the static string being stored in the 2nd word.

Miraculously, the code works. Binary sizes drop a further ~(2-3)% (on top of size reduction from #8590) and GHC can still bootstrap itself. But there are a few issues that I am having trouble with:

  1. I am not sure what closure flags TOP_UNPACK should have.
  2. I do not think entry code for TOP_UNPACK is correct. For starters, I mixed high-level Cmm (argument lists) with low-level Cmm (use of Sp). But the whole thing is likely wrong.

Could somebody take a quick look?

comment:4 Changed 4 years ago by carter

Very cool! So this would also be possibly part of a first step towards generally "statically constructing/evaluating" small constant data structures at compile time rather than computing them? I'm not familiar with that piece of GHC as yet, but i'll take a wee look at the patch.

comment:5 in reply to:  4 Changed 4 years ago by parcs

Replying to carter:

Very cool! So this would also be possibly part of a first step towards generally "statically constructing/evaluating" small constant data structures at compile time rather than computing them? I'm not familiar with that piece of GHC as yet, but i'll take a wee look at the patch.

The String is still created at runtime. But the code that creates a heap object from the static object is no longer duplicated for each string literal in the module.

(Hmm, I wonder whether this technique could be generalized for all CAFs, not just top-level strings...)

comment:6 Changed 4 years ago by ezyang

Yeah, this code is buggy. Here are the problems I can find from a quick pass:

  • TOP_UNPACK is declared to have a thunk-representation. In that case, it needs to use a StgThunkHeader, so that the extra padding required to allow for lock-less thunk update in multithreaded code.
  • Similarly, it should be given the THU and SRT flags. Since these thunks are emitted into the data section (i.e. are static data), they also need the static flag (like THUNK_STATIC). So really TOP_UNPACK is a bad name.
  • I am not sure this is a good idea, but it seems like this mechanism could be generalized to variadic THUNK_STATICs, which are essentially THUNK_STATIC but have some payload attached to the end. You don't even have to hard-code the infotable anymore.
  • The GC code is wrong; TOP_UNPACK should be handled like a THUNK_STATIC.
  • I know for a fact you're missing this case from a number of other giant switch statements in the RTS. Do a grep.
  • The generated info table never pushes an update frame to adjust itself. So what happens when a reference to a TOP_UNPACK is inside generated code? I think you will end up repeatedly unpacking the string, though I am not 100% sure here. Whoops, I was looking at an old version of newCAF; I think this is right.
  • As for whether or not your C-- is right, you should double-check it with the old C-- we were generating for handling unpacking per thunk.
Last edited 4 years ago by ezyang (previous) (diff)

comment:7 Changed 4 years ago by parcs

Hah, that's a lot of bugs for such a small patch.

Your first point seems to contradict the documentation at https://ghc.haskell.org/trac/ghc/browser/ghc/includes/rts/storage/Closures.h#L30, which says that the usage of StgThunkHeader doesn't apply to THUNK_STATICs. Could you clarify this?

It seems to me that I could have just given the TopUnpack infotable the THUNK_STATIC type instead of creating a new closure type for it. Would you be inclined to agree?

Thanks for reviewing.

Last edited 4 years ago by parcs (previous) (diff)

comment:8 Changed 4 years ago by ezyang

You may be right; static thunks might be blackholed anyway so the update can be done atomically. But I'd have to check: note the comment also states that static thunks have no payload; this is not true for TopUnpack.

I am not sure if adjusting THUNK_STATIC to now take a payload will cause any problems. Probably it's best to check all cases in the RTS that handle this closure type and see if they have any assumptions about payload sie.

comment:9 Changed 3 years ago by thoughtpolice

Milestone: 7.8.37.10.1

Moving to 7.10.1

comment:10 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:11 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:12 Changed 21 months ago by thomie

Type of failure: None/UnknownRuntime performance bug

comment:13 Changed 21 months ago by thomie

Milestone: 8.0.1

comment:14 Changed 2 months ago by bgamari

Description: modified (diff)
Keywords: strings added
Note: See TracTickets for help on using tickets.