This is particularly useful when tracking down memory leaks caused by retaining (sub)bytestrings, which in turn keep larger bytestrings alive. At the moment, all the profile tells us is that this memory is "PINNED"; it gives no information about where the memory was allocated.

Simon and I discussed this a while ago. Here's his summary of where we ended up:

I think what we want to do is basically mark-sweep on the BF_PINNED blocks when profiling (only). The main question is how to represent the mark bit: I think just using the low-order bit of the info pointer should be fine, since that's what we use for ordinary forwarding pointers too. Specifically:

  • in evacuate_large, if the block is BF_PINNED, mark the object by setting the low-order bit of its info pointer. (PROFILING only)
  • after the GC is finished, sweep all the BF_PINNED blocks that we touched, which can be found by traversing the scavenged_large_objects list of each generation. For each BF_PINNED block, walk through the memory zeroing out any unmarked ARR_WORDS objects, and unmarking the marked objects.
  • in allocatePinned, zero out any slop caused by alignment constraints.

Then I believe the pinned memory can be traversed correctly by the heap profiler with no further changes.
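
To make the three steps concrete, here is a rough sketch of the mark and sweep passes on a toy model of a pinned block. This is not RTS code: the object layout (header word, payload size in words, payload), the TOY_ARR_WORDS_INFO value and the toy_* function names are all made up for illustration, and the real bdescr/StgArrBytes structures are not used. Zero words stand in for zeroed alignment slop and for objects already swept away, which is what lets the walker step over them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only, not the GHC RTS. A pinned block is an array of words
   holding objects laid out as [header, payload size in words, payload...]. */
typedef uintptr_t word;

#define TOY_ARR_WORDS_INFO ((word)0x1000) /* hypothetical info pointer, low bit clear */
#define MARK_BIT           ((word)1)      /* low-order bit of the header word */

/* Size of an object in words: header + size field + payload. */
size_t toy_obj_words(const word *obj)
{
    return 2 + (size_t)obj[1];
}

/* Step 1: mark a reachable object by setting the low bit of its header. */
void toy_mark(word *obj)
{
    obj[0] |= MARK_BIT;
}

/* Step 2: walk the used part of the block. Unmarked objects are zeroed,
   marked objects are unmarked. Returns the number of surviving objects. */
size_t toy_sweep(word *block, size_t used_words)
{
    size_t live = 0;
    size_t i = 0;

    while (i < used_words) {
        if (block[i] == 0) {
            /* Step 3 guarantees slop is zero, so it can be skipped a word
               at a time; previously swept objects look the same. */
            i++;
            continue;
        }

        size_t n = toy_obj_words(&block[i]);
        assert(i + n <= used_words);

        if (block[i] & MARK_BIT) {
            block[i] &= ~MARK_BIT;   /* survivor: restore the header */
            live++;
        } else {
            memset(&block[i], 0, n * sizeof(word)); /* dead: zero it out */
        }
        i += n;
    }
    return live;
}
```

In this model a heap walker that understands only "zero word" and "ARR_WORDS header" can traverse the block after the sweep, which is the property the profiler would rely on. Whether skipping zero words one at a time is acceptable in the real RTS is an open assumption of the sketch.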

We discussed this during this week's GHC meeting. Simon Marlow confirmed that Ian's suggestion would likely be feasible given a few days of effort. There is, however, the possibility that we would need to be more careful about zeroing out slots that we might accidentally dereference.

