Opened 3 years ago

Closed 5 months ago

Last modified 5 months ago

#8996 closed feature request (fixed)

mark more things as const in C codegen

Reported by: slyfox Owned by:
Priority: normal Milestone: 8.4.1
Component: Compiler (CodeGen) Version: 7.8.2
Keywords: Cc: simonmar
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D3481
Wiki Page:

Description

Recently i've tried to build ghc-7.8.2 on ia64 and it overflows so called "short data" section. (it's where global 'static' variables live).

I've looked at low-hanging fruits of squashing such global "variables". One source of them seems to be a string literal.

Consider one-line module:

    module B (v) where v = "hello"

in -fvia-C (or unreg) mode it generates code like

    static char gibberish_str[] = "hello";

It uselessly eats data section. The patch switches generator to emit:

    static const char gibberish_str[] = "hello";

Some notes:

  1. as far as I see native codegens already put ot to .rodata,

thus it should be safe.

  1. I likely didn't cover more cases, like
    • RelocatableReadOnlyData
    • and mysterious pprTop (CmmData _section (Statics lbl [CmmUninitialised size]))

Size of unreg-stage2 on amd64 before and after the patch:

$ size inplace/lib/bin/ghc-stage2 (unpatched)
   text    data     bss     dec     hex filename
81986382        20776344          44096 102806822       620b526 inplace/lib/bin/ghc-stage2
$ size inplace/lib/bin/ghc-stage2 (patched)
   text    data     bss     dec     hex filename
83648494        19062936          44096 102755526       61fecc6 inplace/lib/bin/ghc-stage2

Text section increased for 1.6MBs (consts moved here, likely .rodata actually), data section decreased for 1.7MBs.

I think with minor fiddling with linker flags we can merge equal constants later on.

Thanks!

Attachments (2)

ghc-7.8.2-cgen-constify.patch (1.1 KB) - added by slyfox 3 years ago.
ghc-7.8.2-cgen-constify.patch
ghc-HEAD-constify-WIP.patch (7.7 KB) - added by slyfox 3 years ago.
ghc-HEAD-constify-WIP.patch

Download all attachments as: .zip

Change History (11)

Changed 3 years ago by slyfox

ghc-7.8.2-cgen-constify.patch

comment:1 Changed 3 years ago by ezyang

I think the intent is good. However, I'd probably go about doing it slightly differently: we want this to apply to all read-only data (so ReadOnlyData, RelocatableReadOnlyData and ReadOnlyData16), so it sounds like what you should do is go through all the PprC branches looking for places where a 'const' could be inserted, and then check if the relevant section is readonly.

comment:2 Changed 3 years ago by slyfox

I think something, like use cases of

pprLocalness :: CLabel -> SDoc
pprLocalness lbl | not $ externallyVisibleCLabel lbl = ptext (sLit "static ")
                 | otherwise = empty

might be very close to 'const' case.

Changed 3 years ago by slyfox

Attachment: ghc-HEAD-constify-WIP.patch added

ghc-HEAD-constify-WIP.patch

comment:3 Changed 3 years ago by slyfox

difficulty: Easy (less than 1 hour)Unknown

This patch attempts to mark everything ghc knows as RODATA, what raises a couple of questions:

Patch can be tested as './configure --enable-unregisterised' in amd64.

Example of build failure:

  HC [stage 1] libraries/ghc-prim/dist-install/build/GHC/Types.o

In file included from /tmp/ghc26762_0/ghc26762_1.hc:3:0: 


/tmp/ghc26762_0/ghc26762_1.hc:295:6:
     error: conflicting types for 'ghczmprim_GHCziTypes_False_closure'
     CEI_(ghczmprim_GHCziTypes_False_closure);
          ^

/home/slyfox/dev/git/ghc/includes/Stg.h:213:52:
     note: in definition of macro 'CEI_'
     #define CEI_(X)         extern const StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
                                                        ^

/tmp/ghc26762_0/ghc26762_1.hc:27:9:
     note: previous definition of 'ghczmprim_GHCziTypes_False_closure' was here
     StgWord ghczmprim_GHCziTypes_False_closure[] = {
             ^

GHC thinks that extern points to rodata, but actually place or to .data (rwdata). I think it's due to rts/StgMiscClosures.cmm:

/* put these in the *data* section, since the garbage collector relies
 * on the fact that static closures live in the data section.
 */

#if !(defined(COMPILING_WINDOWS_DLL))
section "data" {
 stg_CHARLIKE_closure:
    CHARLIKE_HDR(0)
    CHARLIKE_HDR(1)
    CHARLIKE_HDR(2)
    CHARLIKE_HDR(3)
...

Is it an absolute need to use mutable data section for it? Could GC be tuned to scan for certain rodata block?

Or if it's not an option,

fix ghc to know that those closure externs are really in "data" section.

Thanks!

comment:4 Changed 3 years ago by ezyang

The comment dates back to GHC 4.01, so it might be wrong; I can't think of any reason for the GC to be doing any scribbling here (Chars and Ints are NOCAF, so we don't need to link them together on the static_objects list; ditto for all of the other similar miscellaneous closures.). Maybe Simon knows better.

In any case, it will be a moot point when is #8199 fixed, after which "static closures" absolutely must be writable since we need to update them to indirect onto blocks we've allocated.

comment:5 Changed 3 years ago by simonmar

I think the comment dates back to when we used to figure out what was static data by knowing the boundaries of sections. We haven't done that for a long time, so the comment is out of date. Let's kill it!

comment:6 Changed 5 months ago by slyfox

Differential Rev(s): Phab:D3481

Three years later I now have a vague idea how things fit together now :)

Sent https://phabricator.haskell.org/D3481 for review.

Last edited 5 months ago by slyfox (previous) (diff)

comment:7 Changed 5 months ago by Ben Gamari <ben@…>

In b68697e5/ghc:

compiler/cmm/PprC.hs: constify labels in .rodata

Consider one-line module
    module B (v) where v = "hello"
in -fvia-C mode it generates code like
    static char gibberish_str[] = "hello";

It resides in data section (precious resource on ia64!).
The patch switches genrator to emit:
    static const char gibberish_str[] = "hello";

Other types if symbols that gained 'const' qualifier are:

- info tables (from haskell and CMM)
- static reference tables (from haskell and CMM)

Cleanups along the way:

- fixed info tables defined in .cmm to reside in .rodata
- split out closure declaration into 'IC_' / 'EC_'
- added label declaration (based on label type) right before
  each label definition (based on section type) so that C
  compiler could check if declaration and definition matches
  at definition site.

Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>

Test Plan: ran testsuite on unregisterised x86_64 compiler

Reviewers: simonmar, ezyang, austin, bgamari, erikd

Reviewed By: bgamari, erikd

Subscribers: rwbarton, thomie

GHC Trac Issues: #8996

Differential Revision: https://phabricator.haskell.org/D3481

comment:8 Changed 5 months ago by bgamari

Milestone: 8.4.1
Resolution: fixed
Status: newclosed

comment:9 Changed 5 months ago by bgamari

trofi, feel free to reopen if you know of anything else that hasn't yet been marked as const.

Note: See TracTickets for help on using tickets.