Opened 11 months ago

Last modified 10 months ago

#8996 new feature request

mark more things as const in C codegen

Reported by: slyfox Owned by:
Priority: normal Milestone:
Component: Compiler (CodeGen) Version: 7.8.2
Keywords: Cc: simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

Recently i've tried to build ghc-7.8.2 on ia64 and
it overflows so called "short data" section.
(it's where global 'static' variables live).

I've looked at low-hanging fruits of squashing such global "variables".
One source of them seems to be a string literal.

Consider one-line module:

    module B (v) where v = "hello"

in -fvia-C (or unreg) mode it generates code like

    static char gibberish_str[] = "hello";


It uselessly eats data section.
The patch switches generator to emit:

    static const char gibberish_str[] = "hello";

Some notes:

  1. as far as I see native codegens already put ot to .rodata,

thus it should be safe.

  1. I likely didn't cover more cases, like
    • RelocatableReadOnlyData
    • and mysterious pprTop (CmmData _section (Statics lbl [CmmUninitialised size]))

Size of unreg-stage2 on amd64 before and after the patch:

$ size inplace/lib/bin/ghc-stage2 (unpatched)
   text    data     bss     dec     hex filename
81986382        20776344          44096 102806822       620b526 inplace/lib/bin/ghc-stage2
$ size inplace/lib/bin/ghc-stage2 (patched)
   text    data     bss     dec     hex filename
83648494        19062936          44096 102755526       61fecc6 inplace/lib/bin/ghc-stage2

Text section increased for 1.6MBs (consts moved here, likely .rodata actually),
data section decreased for 1.7MBs.

I think with minor fiddling with linker flags we can merge equal constants later on.

Thanks!

Attachments (2)

ghc-7.8.2-cgen-constify.patch (1.1 KB) - added by slyfox 11 months ago.
ghc-7.8.2-cgen-constify.patch
ghc-HEAD-constify-WIP.patch (7.7 KB) - added by slyfox 11 months ago.
ghc-HEAD-constify-WIP.patch

Download all attachments as: .zip

Change History (7)

Changed 11 months ago by slyfox

ghc-7.8.2-cgen-constify.patch

comment:1 Changed 11 months ago by ezyang

I think the intent is good. However, I'd probably go about doing it slightly differently: we want this to apply to all read-only data (so ReadOnlyData, RelocatableReadOnlyData and ReadOnlyData16), so it sounds like what you should do is go through all the PprC branches looking for places where a 'const' could be inserted, and then check if the relevant section is readonly.

comment:2 Changed 11 months ago by slyfox

I think something, like use cases of

pprLocalness :: CLabel -> SDoc
pprLocalness lbl | not $ externallyVisibleCLabel lbl = ptext (sLit "static ")
                 | otherwise = empty

might be very close to 'const' case.

Changed 11 months ago by slyfox

ghc-HEAD-constify-WIP.patch

comment:3 Changed 11 months ago by slyfox

  • difficulty changed from Easy (less than 1 hour) to Unknown

This patch attempts to mark everything ghc knows as RODATA,
what raises a couple of questions:

Patch can be tested as './configure --enable-unregisterised' in amd64.

Example of build failure:

  HC [stage 1] libraries/ghc-prim/dist-install/build/GHC/Types.o

In file included from /tmp/ghc26762_0/ghc26762_1.hc:3:0: 


/tmp/ghc26762_0/ghc26762_1.hc:295:6:
     error: conflicting types for 'ghczmprim_GHCziTypes_False_closure'
     CEI_(ghczmprim_GHCziTypes_False_closure);
          ^

/home/slyfox/dev/git/ghc/includes/Stg.h:213:52:
     note: in definition of macro 'CEI_'
     #define CEI_(X)         extern const StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
                                                        ^

/tmp/ghc26762_0/ghc26762_1.hc:27:9:
     note: previous definition of 'ghczmprim_GHCziTypes_False_closure' was here
     StgWord ghczmprim_GHCziTypes_False_closure[] = {
             ^

GHC thinks that extern points to rodata, but actually place or to .data (rwdata).
I think it's due to rts/StgMiscClosures.cmm:

/* put these in the *data* section, since the garbage collector relies
 * on the fact that static closures live in the data section.
 */

#if !(defined(COMPILING_WINDOWS_DLL))
section "data" {
 stg_CHARLIKE_closure:
    CHARLIKE_HDR(0)
    CHARLIKE_HDR(1)
    CHARLIKE_HDR(2)
    CHARLIKE_HDR(3)
...

Is it an absolute need to use mutable data section for it?
Could GC be tuned to scan for certain rodata block?

Or if it's not an option,

fix ghc to know that those closure externs
are really in "data" section.

Thanks!

comment:4 Changed 11 months ago by ezyang

The comment dates back to GHC 4.01, so it might be wrong; I can't think of any reason for the GC to be doing any scribbling here (Chars and Ints are NOCAF, so we don't need to link them together on the static_objects list; ditto for all of the other similar miscellaneous closures.). Maybe Simon knows better.

In any case, it will be a moot point when is #8199 fixed, after which "static closures" absolutely must be writable since we need to update them to indirect onto blocks we've allocated.

comment:5 Changed 10 months ago by simonmar

I think the comment dates back to when we used to figure out what was static data by knowing the boundaries of sections. We haven't done that for a long time, so the comment is out of date. Let's kill it!

Note: See TracTickets for help on using tickets.