Version 13 (modified by kirsten, 7 years ago)

Profiling

GHC includes at least two types of profiling: cost-centre profiling and ticky-ticky profiling.

Cost-centre profiling operates at something close to the source level, and ticky-ticky profiling operates at something much closer to the machine level. This means that the two types of profiling are useful for different tasks. Ticky-ticky profiling is mainly meant for compiler implementors, and cost-centre profiling for mortals. However, because cost-centre profiling operates at a high level, it can be difficult (if not impossible) to use it to profile optimized code. Personally, I (Kirsten) have had a lot of success using cost-centre profiling to find problems that were due to my own bad algorithms, but less success once I was fairly sure that I wasn't doing anything obviously stupid and was trying to figure out why my code didn't get optimized as well as it could have been.

You can't use cost-centre profiling and ticky-ticky profiling at the same time. In the past, this was because ticky-ticky profiling relied on a different closure layout; that is no longer the case. As things stand, you still can't combine them unless you modify the build system so that the RTS can be built with way=p and way=t at the same time. I haven't thought about whether it would make sense to use both at once.

Cost-centre profiling

(add more details)

Ticky-ticky profiling

Ticky-ticky should now be working in the HEAD, though not in any so-far-released version.

TODO update the GHC manual section on this.

Using ticky-ticky profiling

Ticky-ticky profiling is very simple (conceptually): instrument the C code generated by GHC with a lot of extra code that updates counters when various (supposedly) interesting things happen, and generate a report giving the values of the counters when your program terminates. GHC does this instrumentation for you when you compile your program with a special flag. Then, you use another flag to tell the RTS to generate the profiling report.
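As a rough illustration of the idea (not GHC's actual generated code), the instrumentation amounts to counter increments placed at interesting points, plus a routine that dumps the counters. In the sketch below, the names TICK, ctr_fun_entries, ctr_base_cases, fib and ticky_report are all hypothetical, and the report layout is invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counters; the names are illustrative, not GHC's. */
unsigned long ctr_fun_entries = 0;
unsigned long ctr_base_cases = 0;

/* A "tick" is just an increment that the compiler would insert. */
#define TICK(counter) ((counter)++)

/* An ordinary function with ticks added at two interesting points. */
unsigned long fib(unsigned n)
{
    TICK(ctr_fun_entries);          /* count every entry to the function */
    if (n < 2) {
        TICK(ctr_base_cases);       /* count entries that hit a base case */
        return n;
    }
    return fib(n - 1) + fib(n - 2);
}

/* Write the counter values to a stream and return how many were written.
   A real runtime would call something like this once, at shutdown. */
int ticky_report(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "%10lu FUN_ENTRIES\n", ctr_fun_entries);
    fprintf(out, "%10lu BASE_CASES\n", ctr_base_cases);
    return 2;
}
```

Calling fib(10) and then ticky_report(stdout) from a main function prints one line per counter. The point of the sketch is only that the profiled program pays for a handful of increments, and all the reporting work happens at the end.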

You might want to use ticky-ticky profiling for one of the following two reasons:

  • You are an implementor trying to understand the effect of an optimization in GHC more precisely.
  • You are a user trying to observe the behavior of your programs with optimization turned on. GHC doesn't do certain transformations in the presence of cost centres, so cost-centre profiling can be less than accurate if you're trying to understand what really happens when you're compiling with -O.

I won't necessarily try to argue that ticky-ticky is useful at all for the second group of people, but it's better than nothing, and perhaps the ticky-ticky data could be used to build a better profiler.

To use ticky-ticky, first you need to do:

make way=t

in the rts/ subdirectory in your GHC tree. This will build a version of the RTS library that has all the necessary instrumentation code.

Then, compile the code you want to profile with the -fticky-ticky flag.

Finally, run your executable with:

+RTS -rfoo.ticky -RTS

This generates a file called foo.ticky (or whatever name you give after -r) in the current directory. The file contains the ticky-ticky profiling data: the values of the various counters, plus some summary data.

If some of the counters are zero when they shouldn't be, that means they're not implemented yet (for example, probably nothing having to do with counting allocations is working). If you want them implemented, complain on the ghc-users mailing list. Counters for function entries, if nothing else, should be working.

TODO document what the counters mean.

Implementation notes

When compiling with -fticky-ticky on, the back-end generates calls to a bunch of C-- macros that update the ticky counters. The relevant compiler code is mostly in compiler/codeGen/CgTicky.hs.

Those macros are defined in includes/Cmm.h. Most of them (probably all of them, at the moment) just increment counters (variables in C) that are declared in includes/TickyCounters.h. The latter file is likely to get out of sync with the former, so it really should be automatically generated.
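One common C technique for keeping a set of declarations in sync with the code that uses them is an "X-macro": a single list that is expanded several times. The sketch below is an assumption about how that could look, not a description of what Cmm.h or TickyCounters.h actually do. The counter names (DEMO_ENTRY_ctr and so on), the TICK macro and print_counters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Single source of truth: every counter is listed exactly once.
   These names are made up for the example. */
#define DEMO_COUNTERS(X) \
    X(DEMO_ENTRY_ctr)    \
    X(DEMO_ALLOC_ctr)    \
    X(DEMO_UPDATE_ctr)

/* Expansion 1: declare one C variable per counter. */
#define DECLARE_COUNTER(name) unsigned long name = 0;
DEMO_COUNTERS(DECLARE_COUNTER)
#undef DECLARE_COUNTER

/* The tick macro just increments the named counter. */
#define TICK(name) ((name)++)

/* Expansion 2: print every counter with its name.
   Returns the number of counters printed. */
int print_counters(FILE *out)
{
    int printed = 0;
    assert(out != NULL);
#define PRINT_COUNTER(name) \
    fprintf(out, "%10lu %s\n", name, #name); printed++;
    DEMO_COUNTERS(PRINT_COUNTER)
#undef PRINT_COUNTER
    return printed;
}
```

With this layout, adding a counter means adding one line to DEMO_COUNTERS; the declaration and the report line follow automatically, so the two cannot drift apart.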

The code in the RTS that prints out the ticky report is in rts/Ticky.c.