Changes between Version 19 and Version 20 of Commentary/Rts/HeapObjects


Timestamp:
Oct 20, 2006 9:00:10 AM (9 years ago)
Author:
simonmar

  • Commentary/Rts/HeapObjects

    v19 v20  
    1 
    2 [[PageOutline]]
    3 
    4 = GHC Commentary: The Layout of Heap Objects =
    5 
    6 == Terminology ==
    7 
    8  * A ''lifted'' type is one that contains bottom (_|_); conversely, an ''unlifted'' type does not contain _|_.
    9    For example, {{{Array}}} is lifted, but {{{ByteArray#}}} is unlifted.
    10 
    11  * A ''boxed'' type is represented by a pointer to an object in the heap; an ''unboxed'' type is represented by a value.
    12    For example, {{{Int}}} is boxed, but {{{Int#}}} is unboxed.
    13 
    14 The representation of _|_ must be a pointer: it is an object that when evaluated throws an exception or enters an infinite loop.  Therefore, only boxed types may be lifted.
    15 
    16 There are boxed unlifted types: eg. {{{ByteArray#}}}.  If you have a value of type {{{ByteArray#}}}, it definitely points to a heap object with type {{{ARR_WORDS}}} (see below), rather than an unevaluated thunk.
    17 
    18 Unboxed tuples {{{(#...#)}}} are both unlifted and unboxed.  They are represented by multiple values passed in registers or on the stack, according to the [wiki:Commentary/Rts/HaskellExecution return convention].
    19 
    20 Unlifted types cannot currently be used to represent terminating functions: an unlifted type on the right of an arrow is implicitly lifted to include `_|_`.
    21 
    22 -----------
    23 
    24 == Heap Objects ==
    25 
    26 All heap objects have the same basic layout, embodied by the type {{{StgClosure}}} in [http://darcs.haskell.org/ghc/includes/Closures.h Closures.h].  The diagram below shows the layout of a heap object:
    27 
    28 [[Image(heap-object.png)]]
    29 
    30 A heap object always begins with a ''header'', defined by {{{StgHeader}}} in [http://darcs.haskell.org/ghc/includes/Closures.h Closures.h]:
    31 
    32 {{{
    33 typedef struct {
    34     const struct _StgInfoTable* info;
    35 #ifdef PROFILING
    36     StgProfHeader         prof;
    37 #endif
    38 #ifdef GRAN
    39     StgGranHeader         gran;
    40 #endif
    41 } StgHeader;
    42 }}}
    43 
    44 The most important part of the header is the ''info pointer'', which points to the info table for the closure.  In the default build, this is all the header contains, so a header is normally just one word.  In other builds, the header may contain extra information: eg. in a profiling build it also contains information about who built the closure.
    45 
    46 Most of the runtime is insensitive to the size of {{{StgHeader}}};
    47 that is, we are careful not to hardcode the offset to the payload
    48 anywhere; instead we use C struct indexing or {{{sizeof(StgHeader)}}}.
    49 This makes it easy to extend {{{StgHeader}}} with new fields if we
    50 need to.
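
As a rough illustration of the point above, the sketch below finds the payload in two ways that never mention a literal offset.  The types {{{Header}}} and {{{Closure}}} and the flag {{{SKETCH_PROFILING}}} are hypothetical, simplified stand-ins, not the real definitions from Closures.h:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the RTS types. */
typedef uintptr_t Word;

typedef struct {
    const void *info;        /* the info pointer */
#ifdef SKETCH_PROFILING      /* hypothetical flag standing in for PROFILING */
    Word prof[2];            /* extra header words in a profiling-style build */
#endif
} Header;

typedef struct {
    Header header;
    Word   payload[];        /* the payload starts right after the header */
} Closure;

/* C struct indexing: the compiler works out the offset. */
static Word *payload_by_field(Closure *c) {
    return c->payload;
}

/* Address arithmetic based on sizeof(Header), not on a literal such as 1. */
static Word *payload_by_sizeof(Closure *c) {
    return (Word *)((char *)c + sizeof(Header));
}
```

Both functions keep returning the right address if {{{Header}}} grows, which is the property the runtime relies on.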
    51 
    52 The compiler also needs to know the layout of heap objects, and the way this information is plumbed into the compiler from the C headers in the runtime is described here: [wiki:Commentary/Compiler/CodeGen#Storagemanagerrepresentations].
    53 
    54 -----------
    55 
    56 == Info Tables ==
    57 
    58 The info table contains all the information that the runtime needs to know about the closure.  The layout of info tables is defined by {{{StgInfoTable}}} in [[GhcFile(includes/InfoTables.h)]].  The basic info table layout looks like this:
    59 
    60 [[Image(basic-itbl.png)]]
    61 
    62 Where:
    63  
    64  * The ''closure type'' is a constant describing the kind of closure this is (function, thunk, constructor etc.).  All
    65    the closure types are defined in [[GhcFile(includes/ClosureTypes.h)]], and many of them have corresponding C struct
    66    definitions in [[GhcFile(includes/Closures.h)]].
    67 
    68  * The ''SRT bitmap'' field is used to support [wiki:Commentary/Rts/CAFs garbage collection of CAFs].
    69 
    70  * The ''layout'' field describes the layout of the payload for the garbage collector, and is described in more
    71    detail in [[ref(Types of Payload Layout)]] below.
    72 
    73  * The ''entry code'' for the closure is usually the code that will ''evaluate'' the closure.  There is one exception:
    74    for functions, the entry code will apply the function to the arguments given in registers or on the stack, according
    75    to the calling convention.  The entry code assumes all the arguments are present - to apply a function to fewer arguments
    76    or to apply an unknown function, the [wiki:Commentary/Rts/HaskellExecution/EvalApply generic apply functions] must
    77    be used.
    78 
    79 Some types of object add more fields to the end of the info table, notably functions, return addresses, and thunks.
    80 
    81 Space in info tables is at a premium: adding a word to the standard info table structure increases binary sizes by 5-10%.
    82 
    83 === {{{TABLES_NEXT_TO_CODE}}} ===
    84 
    85 Note that the info table is followed immediately by the entry code, rather than the code being at the end of an indirect pointer.  This both reduces the size of the info table and eliminates one indirection when jumping to the entry code; however, arranging to generate code like this presents some difficulties when compiling via C, see [wiki:Commentary/EvilMangler]. 
    86 
    87 GHC can generate code that uses the indirect pointer instead; defining {{{TABLES_NEXT_TO_CODE}}} turns on the optimised layout.  Generally {{{TABLES_NEXT_TO_CODE}}} is turned off when compiling unregisterised.
    88 
    89 When {{{TABLES_NEXT_TO_CODE}}} is off, info tables get another field, {{{entry}}}, which points to the entry code.  In a generated object file, each symbol {{{X_info}}} representing an info table will have an associated symbol {{{X_entry}}} pointing to the entry code (when {{{TABLES_NEXT_TO_CODE}}} is on, the entry symbol is omitted to keep the size of symbol tables down).
    90 
    91 -----------
    92 
    93 == Types of Payload Layout ==
    94 
    95 The GC needs to know two things about the payload of a heap object: how many words it contains, and which of those words are pointers.  There are two basic kinds of layout for the payload: ''pointers-first'' and ''bitmap''.  Which of these kinds of layout is being used is a property of the ''closure type'', so the GC first checks the closure type to determine how to interpret the layout field of the info table.
    96 
    97 === Pointers-first layout ===
    98 
    99 The payload consists of zero or more pointers followed by zero or more non-pointers.
    100 This is the most common layout: constructors, functions and thunks use this layout.  The layout field contains
    101 two half-word-sized fields:
    102 
    103   * Number of pointers
    104   * Number of non-pointers
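
To make the pointers-first rule concrete, here is a minimal sketch of how a collector might walk such a payload.  The names {{{Layout}}}, {{{Visit}}}, {{{count_slot}}} and {{{walk_pointers_first}}} are invented for illustration, and the half-word fields are assumed to be 32 bits wide (i.e. a 64-bit word):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Word;

/* Hypothetical layout field: two half-word counts (64-bit word assumed). */
typedef struct {
    uint32_t ptrs;     /* number of pointers */
    uint32_t nptrs;    /* number of non-pointers */
} Layout;

/* Callback applied to each pointer word of the payload. */
typedef void (*Visit)(Word *slot, void *ctx);

/* Example visitor: counts the slots it is shown. */
static void count_slot(Word *slot, void *ctx) {
    (void)slot;
    ++*(size_t *)ctx;
}

/* Visit the leading pointer words only; the trailing non-pointers are
 * skipped.  Returns the payload size in words. */
static size_t walk_pointers_first(Word *payload, Layout l,
                                  Visit visit, void *ctx) {
    for (size_t i = 0; i < l.ptrs; i++) {
        visit(&payload[i], ctx);
    }
    return (size_t)l.ptrs + (size_t)l.nptrs;
}
```

The two counts answer both questions the GC asks: their sum is the payload size, and the first count says which words are pointers.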
    105 
    106 === Bitmap layout ===
    107 
    108 The payload consists of a mixture of pointers and non-pointers, described by a bitmap.  There are two kinds of bitmap:
    109 
    110 '''Small bitmaps.''' A small bitmap fits into a single word (the layout word of the info table), and looks like this:
    111 
    112 || Size (bits 0-4) || Bitmap (bits 5-31) ||
    113 
    114 (for a 64-bit word size, the size is given 6 bits instead of 5). 
    115 
    116 The size field gives the size of the payload in words, and each bit of the bitmap describes the corresponding word of payload: the bit is 0 if the word contains a pointer to a live object, and 1 if it is a non-pointer.
    117 
    118 The macros {{{MK_BITMAP}}}, {{{BITMAP_SIZE}}}, and {{{BITMAP_BITS}}} in [[GhcFile(includes/InfoTables.h)]] provide ways to conveniently operate on small bitmaps.
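
The packing described above can be sketched as follows for the 32-bit case (size in bits 0-4, bitmap in bits 5-31).  The function names and the {{{SKETCH_}}} constants are hypothetical; they only mirror the roles of the real macros and are not their definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32-bit layout: size in bits 0-4, bitmap in bits 5-31. */
#define SKETCH_SIZE_BITS 5u
#define SKETCH_SIZE_MASK ((1u << SKETCH_SIZE_BITS) - 1u)
#define SKETCH_MAX_SIZE  (32u - SKETCH_SIZE_BITS)   /* 27 payload words */

/* Pack a payload size and its bitmap bits into one layout word. */
static uint32_t mk_small_bitmap(uint32_t size, uint32_t bits) {
    assert(size <= SKETCH_MAX_SIZE);
    return (bits << SKETCH_SIZE_BITS) | size;
}

/* Recover the payload size (in words) from a layout word. */
static uint32_t small_bitmap_size(uint32_t bitmap) {
    return bitmap & SKETCH_SIZE_MASK;
}

/* Recover the bitmap bits; bit i describes payload word i. */
static uint32_t small_bitmap_bits(uint32_t bitmap) {
    return bitmap >> SKETCH_SIZE_BITS;
}
```

For a 64-bit word the same scheme would use 6 size bits, as noted above.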
    119 
    120 '''Large bitmaps.''' If the size of the stack frame is larger than the 27 words that a small bitmap can describe, then the fallback mechanism is the large bitmap.  A large bitmap is a separate structure, containing a single word size and a multi-word bitmap: see {{{StgLargeBitmap}}} in [[GhcFile(includes/InfoTables.h)]].
    121 
    122 
    123 -----------
    124 
    125 == Dynamic vs. Static objects ==
    126 
    127 Objects fall into two categories:
    128 
    129  * ''dynamic'' objects reside in the heap, and may be moved by the garbage collector.
    130 
    131  * ''static'' objects reside in the compiled object code.  They are never moved, because pointers to such objects are
    132    scattered through the object code, and only the linker knows where.
    133 
    134 To find out whether a particular object is dynamic or static, use the {{{HEAP_ALLOCED()}}} macro, from [[GhcFile(rts/MBlock.h)]].  This macro works by consulting a bitmap (or structured bitmap) that tells for each [wiki:Commentary/Rts/Storage#Structureofblocks megablock] of memory whether it is part of the dynamic heap or not.
    135 
    136 === Dynamic objects ===
    137 
    138 Dynamic objects have a minimum size, because every object must be big
    139 enough to be overwritten by a
    140 forwarding pointer ([[ref(Forwarding Pointers)]]) during GC.
    141 The minimum size of the payload is given by {{{MIN_PAYLOAD_SIZE}}} in [[GhcFile(includes/Constants.h)]].
    142 
    143 === Static objects ===
    144 
    145 All static objects have closure types ending in {{{_STATIC}}}, eg. {{{CONSTR_STATIC}}} for static data constructors.
    146 
    147 Static objects have an additional field, called the ''static link
    148 field''.  The static link field is used by the GC to link all the
    149 static objects in a list, and so that it can tell whether it has
    150 visited a particular static object or not - the GC needs to traverse
    151 all the static objects in order to [wiki:Commentary/Rts/CAFs garbage collect CAFs].
    152 
    153 The static link field resides after the normal payload, so that the
    154 static variant of an object has compatible layout with the dynamic
    155 variant.  To access the static link field of a closure, use the
    156 {{{STATIC_LINK()}}} macro from [[GhcFile(includes/ClosureMacros.h)]].
    157 
    158 -----------
    159 
    160 == Types of object ==
    161 
    162 === Data Constructors ===
    163 
    164 All data constructors have pointers-first layout:
    165 
    166 || Header || Pointers... || Non-pointers... ||
    167 
    168 Data constructor closure types:
    169 
    170  * {{{CONSTR}}}: a vanilla, dynamically allocated constructor
    171  * {{{CONSTR_p_n}}}: a constructor whose layout is encoded in the closure type (eg. {{{CONSTR_1_0}}} has one pointer
    172    and zero non-pointers).  Having these closure types speeds up GC a little for common layouts.
    173  * {{{CONSTR_INTLIKE}}}, {{{CONSTR_CHARLIKE}}}: special closure types corresponding to types like {{{Int}}} and
    174    {{{Char}}}.  The RTS includes some static instances of these types so that instead of allocating a new {{{Char}}}
    175    on the heap, we can use the static RTS instance instead and save some heap space.  See   
    176    [[GhcFile(rts/StgMiscClosures.cmm)]].
    177  * {{{CONSTR_STATIC}}}: a statically allocated constructor.
    178 
    179 The entry code for a constructor returns immediately to the topmost stack frame, because the data constructor is already in WHNF.  The return convention may be vectored or non-vectored, depending on the type (see [wiki:Commentary/Rts/HaskellExecution#ReturnConvention]).
    180 
    181 Symbols related to a data constructor X:
    182 
    183  * X_{{{con_info}}}: info table for a dynamic instance of X
    184  * X_{{{static_info}}}: info table for a static instance of X
    185  * X_{{{info}}}: the ''wrapper'' for X (a function, equivalent to the
    186    curried function {{{X}}} in Haskell, see
    187    [wiki:Commentary/Compiler/EntityTypes]). 
    188  * X_{{{closure}}}: static closure for X's wrapper
    189 
    190 === Function Closures ===
    191 
    192 A function closure represents a Haskell function.  For example:
    193 {{{
    194   f = \x -> let g = \y -> x + y
    195             in g x
    196 }}}
    197 Here, {{{f}}} would be represented by a static function closure (see below), and {{{g}}} a dynamic function closure.  Every function in the Haskell program generates a new info table and entry code, and top-level functions additionally generate a static closure.
    198  
    199 All function closures have pointers-first layout:
    200 
    201 || Header || Pointers... || Non-pointers... ||
    202 
    203 The payload of the function closure contains the free variables of the function: in the example above, a closure for {{{g}}} would have a payload containing a pointer to {{{x}}}.
    204 
    205 Function closure types:
    206 
    207  * {{{FUN}}}: a vanilla, dynamically allocated function
    208  * {{{FUN_p_n}}}: same, specialised for layout (see constructors above)
    209  * {{{FUN_STATIC}}}: a static (top-level) function closure
    210 
    211 Symbols related to a function {{{f}}}:
    212 
    213  * {{{f_info}}}: f's info table and code
    214  * {{{f_closure}}}: f's static closure, if f is a top-level function.
    215    The static closure has no payload, because there are no free
    216    variables of a top-level function.  It does have a static link
    217    field, though.
    218 
    219 === Thunks ===
    220 
    221 A thunk represents an expression that is not obviously in head normal
    222 form.  For example, consider the following top-level definitions:
    223 {{{
    224   range = between 1 10
    225   f = \x -> let ys = take x range
    226             in sum ys
    227 }}}
    228 Here the right-hand sides of {{{range}}} and {{{ys}}} are both thunks;
    229 the former is static while the latter is dynamic.
    230 
    231 Thunks have pointers-first layout:
    232 
    233 || Header || Pointers... || Non-pointers... ||
    234 
    235 As for function closures, the payload contains the free variables of
    236 the expression.  A thunk differs from a function closure in that it
    237 can be [wiki:Commentary/Rts/HaskellExecution#Updates updated].
    238 
    239 There are several forms of thunk:
    240 
    241  * {{{THUNK}}}, {{{THUNK_p_n}}}: vanilla, dynamically allocated
    242    thunks.  Dynamic thunks are overwritten with normal indirections
    243    {{{IND}}}, or old generation indirections {{{IND_OLDGEN}}} when
    244    evaluated.
    245 
    246  * {{{THUNK_STATIC}}}: a static thunk is also known as a ''constant
    247    applicative form'', or ''CAF''.  Static thunks are overwritten with
    248    static indirections ({{{IND_STATIC}}}).
    249 
    250 The only label associated with a thunk is its info table:
    251 
    252  * {{{f_info}}} is f's info table.
    253 
    254 === Selector thunks ===
    255 
    256 {{{THUNK_SELECTOR}}} is a (dynamically allocated) thunk whose entry
    257 code performs a simple selection operation from a data constructor
    258 drawn from a single-constructor type.  For example, the thunk
    259 {{{
    260 x = case y of (a,b) -> a
    261 }}}
    262 is a selector thunk.  A selector thunk is laid out like this:
    263 
    264 || Header || Selectee pointer ||
    265 
    266 The layout word contains the byte offset of the desired word in the
    267 selectee.  Note that this is different from all other thunks.
    268 
    269 The garbage collector "peeks" at the selectee's tag (in its info
    270 table).  If it is evaluated, then it goes ahead and does the
    271 selection, and then behaves just as if the selector thunk was an
    272 indirection to the selected field.  If it is not evaluated, it
    273 treats the selector thunk like any other thunk of that shape.
    274 
    275 This technique comes from Phil Wadler's paper [http://homepages.inf.ed.ac.uk/wadler/topics/garbage-collection.html Fixing some space leaks with a garbage collector], and was later described by Christina von Dorrien, who called it "Stingy Evaluation".
    276 
    277 There is a fixed set of pre-compiled selector thunks built into the
    278 RTS, representing offsets from 0 to {{{MAX_SPEC_SELECTOR_THUNK}}},
    279 see [[GhcFile(rts/StgMiscThunks.cmm)]].
    280 The info tables are labelled {{{__sel_n_upd_info}}} where {{{n}}} is the
    281 offset.  Non-updating versions are also built in, with info tables
    282 labelled {{{_sel_n_noupd_info}}}.
    283 
    284 These thunks exist in order to prevent a space leak.  For example, if y is a thunk that has been evaluated, and y is unreachable, but x is reachable, the risk is that x keeps both the a and b components of y live.  By making the selector thunk a special case, we make it possible to reclaim the memory associated with b.  (The situation is further complicated when selector thunks point to other selector thunks; the garbage collector sees all, knows all.)
    285 
    286 === Partial applications ===
    287 
    288 Partial applications are tricky beasts.
    289 
    290 A partial application, closure type {{{PAP}}}, represents a function
    291 applied to too few arguments.  Partial applications are only built by
    292 the [wiki:Commentary/Rts/HaskellExecution/EvalApply generic apply]
    293 functions in AutoApply.cmm.
    294 
    295 || Header || Arity || No. of words || Function closure || Payload... ||
    296 
    297 Where:
    298 
    299  * ''Arity'' is the arity of the PAP.  For example, a function with
    300    arity 3 applied to 1 argument would leave a PAP with arity 2.
    301 
    302  * ''No. of words'' refers to the size of the payload in words.
    303 
    304  * ''Function closure'' is the function to which the arguments are
    305    applied.  Note that this is always a pointer to one of the
    306    {{{FUN}}} family, never a {{{PAP}}}.  If a {{{PAP}}} is applied
    307    to more arguments to give a new {{{PAP}}}, the arguments from
    308    the original {{{PAP}}} are copied to the new one.
    309 
    310  * The payload is the sequence of arguments already applied to
    311    this function.  The pointerhood of these words is described
    312    by the function's bitmap (see {{{scavenge_PAP_payload()}}} in
    313    [[GhcFile(rts/GC.c)]] for an example of traversing a PAP).
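
The bookkeeping in the fields above can be sketched like this.  The struct {{{Pap}}} and the functions {{{pap_new}}} and {{{pap_extend}}} are hypothetical; the header is omitted, {{{malloc}}} stands in for heap allocation, and every argument is assumed to occupy exactly one word:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t Word;

/* Hypothetical, simplified PAP (header omitted). */
typedef struct {
    uint32_t    arity;      /* arguments still missing */
    uint32_t    n_args;     /* payload size in words */
    const void *fun;        /* the underlying function, never another PAP */
    Word        payload[];  /* arguments applied so far */
} Pap;

/* Apply a function of arity fun_arity to n < fun_arity arguments. */
static Pap *pap_new(const void *fun, uint32_t fun_arity,
                    const Word *args, uint32_t n) {
    assert(n < fun_arity);
    Pap *p = malloc(sizeof(Pap) + n * sizeof(Word));
    if (p == NULL) return NULL;
    p->arity  = fun_arity - n;
    p->n_args = n;
    p->fun    = fun;
    memcpy(p->payload, args, n * sizeof(Word));
    return p;
}

/* Apply a PAP to n more arguments, still too few to saturate it.
 * The old arguments are copied, so the result points at the function. */
static Pap *pap_extend(const Pap *old, const Word *args, uint32_t n) {
    assert(n < old->arity);
    Pap *p = malloc(sizeof(Pap) + (old->n_args + n) * sizeof(Word));
    if (p == NULL) return NULL;
    p->arity  = old->arity - n;
    p->n_args = old->n_args + n;
    p->fun    = old->fun;
    memcpy(p->payload, old->payload, old->n_args * sizeof(Word));
    memcpy(p->payload + old->n_args, args, n * sizeof(Word));
    return p;
}
```

A function of arity 3 applied to one argument gives a PAP of arity 2; extending that by one more argument gives a fresh PAP of arity 1 holding both arguments.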
    314 
    315 There is just one standard form of PAP. There is just one info table
    316 too, called {{{stg_PAP_info}}}.  A PAP should never be entered, so its
    317 entry code causes a failure.  PAPs are applied by the generic apply
    318 functions in {{{AutoApply.cmm}}}.
    319 
    320 === Generic application ===
    321 
    322 An {{{AP}}} object is very similar to a {{{PAP}}}, and has identical layout:
    323 
    324 || Header || Arity || No. of words || Function closure || Payload... ||
    325 
    326 The difference is that an {{{AP}}} is not necessarily in WHNF.  It is
    327 a thunk that represents the application of the specified function to
    328 the given arguments.
    329 
    330 The arity field is always zero (it wouldn't help to omit this field,
    331 because it is only half a word anyway).
    332 
    333 {{{AP}}} closures are used mostly by the byte-code interpreter, so that it only needs a single form of thunk object.  Interpreted thunks are always represented by the application of a {{{BCO}}} to its free variables.
    334 
    335 === Stack application ===
    336 
    337 An {{{AP_STACK}}} is a special kind of object:
    338 
    339 || Header || Size || Closure || Payload... ||
    340 
    341 It represents computation of a thunk that was suspended midway through evaluation.  In order to continue the computation, copy the payload onto the stack (the payload was originally the stack of the suspended computation), and enter the closure.
    342 
    343 Since the payload is a chunk of stack, the GC can use its normal stack-walking code to traverse it.
    344 
    345 {{{AP_STACK}}} closures are built by {{{raiseAsync()}}} in [[GhcFile(rts/RaiseAsync.c)]] when an [wiki:Commentary/Rts/AsyncExceptions asynchronous exception] is raised.
    346 
    347 === Indirections ===
    348 
    349 Indirection closures just point to other closures. They are introduced
    350 when a thunk is updated to point to its value.  The entry code for all
    351 indirections simply enters the closure it points to.
    352 
    353 The basic layout of an indirection is simply
    354 
    355 || Header || Target closure ||
    356 
    357 There are several variants of indirection:
    358 
    359  * {{{IND}}}: the vanilla, dynamically-allocated indirection.
    360    It is removed by the garbage collector.  An {{{IND}}} only exists in the youngest generation. 
    361    The update code ({{{stg_upd_frame_info}}} and friends) checks whether the updatee is in the youngest
    362    generation before deciding which kind of indirection to use.
    363  * {{{IND_OLDGEN}}}: an old generation indirection.  Same layout as {{{IND}}}.  This used to have
    364    different layout when the old-generation mutable list was threaded through the objects, but now
    365    {{{IND_OLDGEN}}} is exactly the same as {{{IND}}} (and there's no good reason to have it at all, I think).
    366  * {{{IND_PERM}}}: sometimes we don't want an indirection to be removed by the GC, so we use {{{IND_PERM}}} instead.
    367    The profiler is one user of this closure type; cost-center semantics requires that we keep track of the
    368    cost center in an indirection, so we can't eliminate the indirection.
    369  * {{{IND_OLDGEN_PERM}}}: same as above, but for the old generation.
    370  * {{{IND_STATIC}}}: a static indirection, arises when we update a {{{THUNK_STATIC}}}.  A new {{{IND_STATIC}}}
    371    is placed on the mutable list when it is created (see {{{newCaf()}}} in [[GhcFile(rts/Storage.c)]]).
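
What "just point to other closures" means for a consumer can be sketched as below.  {{{Node}}}, {{{Kind}}} and {{{follow_indirections}}} are hypothetical names; a single {{{K_IND}}} tag stands in for the whole family of indirection closure types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object model: either a value or an indirection. */
typedef enum { K_VALUE, K_IND } Kind;

typedef struct Node {
    Kind         kind;
    struct Node *target;   /* meaningful only when kind == K_IND */
    int          value;    /* meaningful only when kind == K_VALUE */
} Node;

/* Skip through any chain of indirections to the object it ends at. */
static Node *follow_indirections(Node *n) {
    while (n->kind == K_IND) {
        n = n->target;
    }
    return n;
}
```

A collector that removes indirections effectively replaces each reference to an indirection with the result of such a walk.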
    372 
    373 === Byte-code objects ===
    374 
    375 {{{BCO}}}
    376 
    377 === Black holes ===
    378 
    379 {{{BLACKHOLE}}}, {{{CAF_BLACKHOLE}}}
    380 
    381 === Arrays ===
    382 
    383 {{{ARR_WORDS}}}, {{{MUT_ARR_PTRS_CLEAN}}}, {{{MUT_ARR_PTRS_DIRTY}}}, {{{MUT_ARR_PTRS_FROZEN0}}},
    384 {{{MUT_ARR_PTRS_FROZEN}}}
    385 
    386 === MVars ===
    387 
    388 {{{MVar}}}
    389 
    390 === Weak pointers ===
    391 
    392 {{{Weak}}}
    393 
    394 === Stable Names ===
    395 
    396 {{{STABLE_NAME}}}
    397 
    398 === Thread State Objects ===
    399 
    400 Closure type {{{TSO}}} is a Thread State Object.  It represents the complete state of a thread, including its stack.
    401 
    402 TSOs are ordinary objects that live in the heap, so we can use the existing allocation and garbage collection machinery to manage them.  This gives us one important benefit: the garbage collector can detect when a blocked thread is unreachable, and hence can never become runnable again.  When this happens, we can notify the thread by sending it the {{{BlockedIndefinitely}}} exception.
    403 
    404 GHC keeps stacks contiguous; there are no "stack chunk" objects.  This is simpler, but means that when growing a stack we have to copy the old contents to a larger area (see {{{threadStackOverflow()}}} in [[GhcFile(rts/Schedule.c)]]).
    405 
    406 The TSO structure contains several fields.  For full details see [[GhcFile(includes/TSO.h)]].  Some of the more important fields are:
    407 
    408  * ''link'': field for linking TSOs together in a list.  For example, the threads blocked on an {{{MVar}}} are kept in
    409    a queue threaded through the link field of each TSO.
    410  * ''global_link'': links all TSOs together; the head of this list is {{{all_threads}}} in [[GhcFile(rts/Schedule.c)]].
    411  * ''what_next'': how to resume execution of this thread.  The valid values are:
    412    * {{{ThreadRunGhc}}}: continue by returning to the top stack frame.
    413    * {{{ThreadInterpret}}}: continue by interpreting the BCO on top of the stack.
    414    * {{{ThreadKilled}}}: this thread has received an exception which was not caught.
    415    * {{{ThreadRelocated}}}: this thread ran out of stack and has been relocated to a larger TSO; the link field points
    416      to its new location.
    417    * {{{ThreadComplete}}}: this thread has finished and can be garbage collected when it is unreachable.
    418  * ''why_blocked'': for a blocked thread, indicates why the thread is blocked.  See [[GhcFile(includes/Constants.h)]] for
    419    the list of possible values.
    420  * ''block_info'': for a blocked thread, gives more information about the reason for blockage, eg. when blocked on an
    421     MVar, block_info will point to the MVar.
    422  * ''bound'': pointer to a [wiki:Commentary/Rts/Scheduler#Task Task] if this thread is bound
    423  * ''cap'': the [wiki:Commentary/Rts/Scheduler#Capabilities Capability] on which this thread resides.
    424 
    425 === STM objects ===
    426 
    427 These object types are used by [wiki:Commentary/Rts/STM STM]: {{{TVAR_WAIT_QUEUE}}}, {{{TVAR}}}, {{{TREC_CHUNK}}}, {{{TREC_HEADER}}}.
    428 
    429 === Forwarding Pointers ===
    430 
    431 The {{{EVACUATED}}} object only appears temporarily during GC.  An object which has been copied into to-space (''evacuated'') is replaced by an {{{EVACUATED}}} object:
    432 
    433 || Header || Forwarding pointer ||
    434 
    435 which points to the new location of the object.
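
A minimal sketch of the idea, under heavy simplification: {{{Obj}}}, {{{Tag}}} and this {{{evacuate}}} are hypothetical, a tag stands in for the header, and to-space is a plain array.  It is not the collector's real code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Word;

typedef enum { TAG_DATA, TAG_EVACUATED } Tag;

/* Hypothetical one-word-payload object.  The forwarding pointer reuses
 * the payload word, which is why objects need a minimum payload size. */
typedef struct Obj {
    Tag tag;                       /* stands in for the header */
    union {
        Word        payload;       /* valid while tag == TAG_DATA */
        struct Obj *forward;       /* valid once tag == TAG_EVACUATED */
    } u;
} Obj;

/* Copy o into to-space on the first visit; on later visits just
 * return the address recorded in the forwarding pointer. */
static Obj *evacuate(Obj *o, Obj *to_space, size_t *next) {
    if (o->tag == TAG_EVACUATED) {
        return o->u.forward;
    }
    Obj *copy = &to_space[(*next)++];
    *copy = *o;
    o->tag = TAG_EVACUATED;
    o->u.forward = copy;
    return copy;
}
```

Evacuating the same object twice yields the same to-space address and makes only one copy, so sharing is preserved.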
    436 
    437 == Objects for PAR, GRAN ==
    438 
    439 {{{BLOCKED_FETCH}}}, {{{FETCH_ME}}}, {{{FETCH_ME_BQ}}}, {{{RBH}}}, {{{REMOTE_REF}}}
     1[[redirect(wiki:Commentary/Rts/Storage/HeapObjects)]]