Changes between Version 5 and Version 6 of Commentary/Rts/HaskellExecution/PointerTagging


Ignore:
Timestamp:
Aug 1, 2007 9:43:18 AM (7 years ago)
Author:
simonmar
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Commentary/Rts/HaskellExecution/PointerTagging

    v5 v6  
    1818 * For a pointer to any other object, the tag bits are always zero. 
    1919 
    20 Pointer tagging is optional: if all the tags were zero, everything would still work.  The presence of tag bits enables certain optimisations, however: 
     20The presence of tag bits enables certain optimisations: 
    2121 
    2222 * In a case-expression, if the variable being scrutinised has non-zero tag bits, then we know 
     
    3636== Invariants == 
    3737 
    38 In the current implementation, all pointers are guaranteed to be tagged, except for pointers to static constructors or functions.  We cannot guarantee to correctly tag a reference to a static closure, because the compiler does not necessarily know what the tag value should be if the static closure resides in another module. 
    39  
    40 When optimisation is on, we do know the arities of external functions, and this information is indeed used to tag pointers to imported functions, but when optimisations is off we do not have this information.  For constructors, the interface doesn't contain information about the constructor tag, except that there may be an unfolding, but the unfolding is not necessarily reliable (the unfolding may be a constructor application, but in reality the closure may be a CAF, e.g. if any of the fields are references outside the current shared library). 
    41  
    42 Do we ever assume that a pointer is tagged?  Yes, in the following places: 
     38Pointer tagging is ''not'' optional, contrary to what the paper says.  We originally planned that it would be: if the GC threw away all the tags, then everything would continue to work albeit more slowly.  However, it turned out that in fact we really want to assume tag bits in some places: 
    4339 
    4440  * In the continuation of an algebraic case, R1 is assumed tagged 
    4541  * On entry to a non-top-level function, R1 is assumed tagged 
    4642 
    47 These assumptions make the code faster: when extracting values from the closure pointed to be R1, we just subtract the (known) tag from the offset. 
     43If we don't assume the value of the tag bits in these places, then extra code is needed to untag the pointer.  If we can assume the value of the tag bits, then we just take this into account when indexing off R1. 
    4844 
    49 So each place that enters one of these code fragments must ensure that it is correctly tagging R1.  Here are the cases for an algebraic case alternative: 
     45This means that everywhere that enters either a case continuation or a non-top-level function must ensure that R1 is correctly tagged.  For a case continuation, the possibilities are: 
    5046 
    51   * the scrutinee of the case jumps directly to the alternative, if R1 is already tagged 
     47  * the scrutinee of the case jumps directly to the alternative if R1 is already tagged. 
    5248  * the constructor entry code returns to an alternative.  This code adds the correct tag. 
    5349  * if the case alternative fails a heap or stack check, then the RTS will re-enter the alternative after 
     
    5551    virtue of going through the constructor entry code. 
    5652 
    57 Here are the cases for a function: 
     53For a non-top-level function, the cases are: 
    5854 
    59   * we can assume that pointers to non-top-level functions are always tagged, so entering directly 
    60     is safe. 
    6155  * unknown function application goes via `stg_ap_XXX` (see [wiki:Commentary/Rts/HaskellExecution/FunctionCalls#GenericApply Generic Apply]).   
    6256    The generic apply functions must therefore arrange to correctly tag R1 before entering the function. 
     57  * A known function can be entered directly, if the call is made with exactly the right number of arguments. 
     58  * If a function fails its heap check and returns to the runtime to garbage collect, on re-entry the closure 
     59    pointer must be still tagged. 
     60 
     61In the second case, calling a known non-top-level function must pass the function closure in R1, and this pointer ''must'' be correctly tagged.  The code generator does not arrange to tag the pointer before calling the function; it assumes the pointer is already tagged.  Since we arrange to tag the pointer when the closure is created, this assumption is normally safe.  However, if the pointer has to be saved on the stack, say across a call, then when the pointer is retrieved again we must either retag it, or be sure that it is still tagged.  Currently we do the latter, but this imposes an invariant on the garbage collector: all tags must be retained on non-top-level function pointers. 
     62 
     63Pointers to top-level functions are not necessarily tagged, because we don't always know the arity of a function that resides in another module.  When optimisation is on, we do know the arities of external functions, and this information is indeed used to tag pointers to imported functions, but when optimisations is off we do not have this information.  For constructors, the interface doesn't contain information about the constructor tag, except that there may be an unfolding, but the unfolding is not necessarily reliable (the unfolding may be a constructor application, but in reality the closure may be a CAF, e.g. if any of the fields are references outside the current shared library). 
    6364 
    6465== Compacting GC == 
    6566 
    66 Compacting GC also needs to tag pointers, because it needs to distinguish between a heap pointer and an info pointer quickly.  Unfortunately, this means we lose one tag value (not tag bit) in our space of tags.  Therefore we have 3 tag values (including 0) available on a 32-bit machine, and 7 on a 64-bit machine. 
     67Compacting GC also uses tag bits, because it needs to distinguish between a heap pointer and an info pointer quickly.  The compacting GC has complicated scheme to ensure that pointer tags are retained, see the comments in [[GhcFile(rts/sm/Compact.c)]]. 
    6768 
    6869== Dealing with tags in the code ==