Version 3 (modified by simonmar, 10 years ago) (diff)


Pointer Tagging

Paper: Faster laziness using dynamic pointer tagging

In GHC we "tag" pointers to heap objects with information about the object they point to. The tag goes in the low 2 bits of the pointer, which would normally be zero since heap objects are always word-aligned (3 bits on a 64-bit platform).

The way the tag bits are used depends on the type of object pointed to:

  • If the object is a constructor, the tag bits contain the constructor tag, if the number of constructors in the datatype is less than 4 (less than 8 on a 64-bit platform). If the number of constructors in the datatype is more than 4 (resp 8), then the tag bits have the value 1.
  • If the object is a function, the tag bits contain the arity of the function, if the arity fits in the tag bits.
  • For a pointer to any other object, the tag bits are always zero.

Pointer tagging is optional: if all the tags were zero, everything would still work. The presence of tag bits enables certain optimisations, however:

  • In a case-expression, if the variable being scrutinised has non-zero tag bits, then we know that it points directly to a constructor and we can avoid entering it to evaluate it. Furthermore, for datatypes with only a few constructors, the tag bits will tell us which constructor it is, eliminating a further memory load to extract the constructor tag from the info table.
  • In a generic apply, if the function being applied has a tag value that indicates it has exactly the right arity for the number of arguments being applied, we can jump directly to the function, instead of inspecting its info table first.

Pointer-tagging is a fairly significant optimisation: we measured 10-14% depending on platform. A large proportion of this comes from eliminating the indirect jumps in a case expression, which are hard to predict by branch-prediction. The paper has full results and analysis.

The garbage collector maintains tag bits on the pointers it traverses; additionally when it eliminates an indirection it takes the tag bits from the pointer inside the indirection.


In the current implementation, all pointers are guaranteed to be tagged, except for pointers to static constructors or functions. We cannot guarantee to correctly tag a reference to a static closure, because the compiler does not necessarily know what the tag value should be if the static closure resides in another module.

When optimisation is on, we do know the arities of external functions, and this information is indeed used to tag pointers to imported functions, but when optimisations is off we do not have this information. For constructors, the interface doesn't contain information about the constructor tag, except that there may be an unfolding, but the unfolding is not necessarily reliable (the unfolding may be a constructor application, but in reality the closure may be a CAF, e.g. if any of the fields are references outside the current shared library).

Do we ever assume that a pointer is tagged? Yes, in the following places:

  • In the continuation of an algebraic case, R1 is assumed tagged
  • On entry to a function, R1 is assumed tagged

These assumptions make the code faster: when extracting values from the closure pointed to be R1, we just subtract the (known) tag from the offset.

There is one unexpected consequence of the function entry code assuming that R1 is tagged: the function pointer in a RET_FUN stack frame must be tagged, because it is just loaded into R1 before jumping to the entry code for the function. This is the single exception to the general rule that tags are optional in pointers found in heap objects.

Compacting GC

Compacting GC also needs to tag pointers, because it needs to distinguish between a heap pointer and an info pointer quickly. Unfortunately, this means we lose one tag value (not tag bit) in our space of tags. Therefore we have 3 tag values (including 0) available on a 32-bit machine, and 7 on a 64-bit machine.

Dealing with tags in the code

Every time we dereference a pointer to a heap object, we must first zero the tag bits. In the RTS, this is done with the macro UNTAG_CLOSURE(); in .cmm code this is done with the UNTAG() macro. Surprisingly few places needed untagging to be added.