Documentation for the new GHCi debugger
- User's Manual
- Implementation notes
These notes describe the breakpoint debugger which is being incorporated into GHCi. Note that there is a previous prototype debugger, and we share some of its code, specifically the term printer (see: GhciDebugger).
Setting break points
The general rule of thumb for breakpoints is that you can set a breakpoint on anything which is not a value (though there are some exceptions). For example, a literal character is a value, but a case expression is not.
We call the places where you can set breakpoints breakable expressions (even if some of them aren't strictly expressions).
You can set breakpoints on the following things: (XXX) Check this list carefully!
- Function applications. We allow breakpoints on partial applications, even though they are technically values. Also, if there is an application with more than one argument, we only allow breaks on the whole expression, not on the sub-applications within: e.g. for the expression map f list, we allow a break on the whole expression, but not on the sub-application of map f.
- Case expressions.
- Function declarations (all the equations of a function).
- Case alternatives.
- Do statements.
- Bodies of functions, pattern bindings, lambdas, guarded equations.
Conversely, you cannot set breakpoints on the following things, except if they occur as the outermost expression in the body of a declaration:
- Do blocks. XXX check this one
- List comprehensions. XXX check this one
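The rules above can be illustrated with a small module. This is only a sketch: the module and its function names (classify, describeAll, report) are hypothetical, and the comments simply restate the lists in this section. They have not been checked against the debugger itself, and the lists are still marked as needing review.

```haskell
-- A hypothetical example module, annotated according to the lists above.
-- The annotations restate the rules in this section; they are not output
-- from the debugger.

classify :: Int -> String
classify n =                      -- body of a function: breakable
  case compare n 0 of             -- case expression: breakable
    LT -> "negative"              -- case alternative: breakable
    EQ -> "zero"                  -- case alternative: breakable
    GT -> "positive"              -- case alternative: breakable

describeAll :: [Int] -> [String]
describeAll ns = map classify ns  -- whole application `map classify ns`: breakable
                                  -- sub-application `map classify`: not breakable

report :: [Int] -> IO ()
report ns = do                    -- do block, here the outermost expression of a body
  let labels = describeAll ns
  mapM_ putStrLn labels           -- do statement: breakable
  print (length labels)           -- do statement: breakable
```

Loading a file like this into GHCi and trying :break on each annotated line is a quick way to check the list against the actual behaviour.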
You can set a breakpoint in three ways:
- By line number.
- By line and column number.
- By function name (not implemented yet).
In each case you can specify the module in which you want to set the breakpoint. If the module name is omitted, the debugger will choose a suitable default module for you (XXX give a better explanation of what module is chosen by default).
The syntax for setting breakpoints by line number is:
:break OptionalModuleName 12
This will activate the breakpoint which corresponds to the leftmost outermost breakable expression which begins and ends on line 12 in the module called OptionalModuleName, if such an expression exists. XXX If no such expression exists then what happens? Currently the debugger will report an error message, but perhaps it would be nicer for it to probe a few lines ahead until it finds a breakable expression, and to give up after some threshold number of lines?
The syntax for setting breakpoints by line and column is:
:break OptionalModuleName 12 7
This will activate the breakpoint which corresponds to the smallest breakable expression which encloses the source location on line 12, column 7, if such an expression exists. If no such expression exists the debugger will report an error message and no breakpoints will be set.
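The two selection rules described above (by line, and by line and column) can be sketched as follows. This is not the debugger's implementation: the Span type and the function names are hypothetical, and the reading of "leftmost outermost" as "smallest start column, then largest end column" is an assumption. The sketch also assumes that breakable spans enclosing the same location are nested, as they would be in an expression tree.

```haskell
-- | A source location as (line, column), as in the (12,4) notation.
type Loc = (Int, Int)

-- | A hypothetical source span, from a start location to an end location.
data Span = Span { spanStart :: Loc, spanEnd :: Loc }
  deriving (Eq, Show)

-- | True if the span contains the location.
-- Tuples compare by line first and then by column.
encloses :: Span -> Loc -> Bool
encloses (Span s e) loc = s <= loc && loc <= e

-- | True if the first span lies inside the second.
within :: Span -> Span -> Bool
within (Span s1 e1) (Span s2 e2) = s2 <= s1 && e1 <= e2

-- | Break by line and column: the smallest span enclosing the location.
smallestEnclosing :: Loc -> [Span] -> Maybe Span
smallestEnclosing loc spans =
  case filter (`encloses` loc) spans of
    []     -> Nothing
    (c:cs) -> Just (foldl pick c cs)
  where
    pick best s = if s `within` best then s else best

-- | Break by line: the leftmost outermost span that begins and ends on the line.
leftmostOutermostOnLine :: Int -> [Span] -> Maybe Span
leftmostOutermostOnLine line spans =
  case [s | s@(Span (l1, _) (l2, _)) <- spans, l1 == line, l2 == line] of
    []     -> Nothing
    (c:cs) -> Just (foldl pick c cs)
  where
    -- Assumed ordering: smaller start column first, then larger end column.
    pick best s = if key s < key best then s else best
    key (Span (_, c1) (_, c2)) = (c1, negate c2)

-- | Made-up breakable spans for one module.
exampleSpans :: [Span]
exampleSpans =
  [ Span (12, 4) (12, 30)
  , Span (12, 4) (12, 8)
  , Span (12, 12) (12, 20)
  , Span (11, 1) (13, 5)
  ]
```

With these example spans, smallestEnclosing (12,7) exampleSpans selects the span (12,4)-(12,8), while leftmostOutermostOnLine 12 exampleSpans selects (12,4)-(12,30). Both functions return Nothing when no span qualifies, which corresponds to the case where the debugger reports an error.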
The syntax for setting breakpoints by function name is: (XXX not yet implemented)
:break OptionalModuleName functionName
This will activate the outermost breakpoint associated with the definition of the function called functionName. The breakpoint will cover all the equations of a multi-equation function. XXX What about local functions? XXX What about functions defined in type classes (default methods) and instance declarations?
Listing the active breakpoints
You can list the set of active breakpoints with the following command:
:show breaks
Each breakpoint is given a unique number, which can be used to identify the breakpoint should you wish to delete it (see the :delete command). Here is an example list of breakpoints:
0) Main (12,4)-(12,8)
1) Foo (13,9)-(13,13)
2) Bar (14,4)-(14,47)
Breakpoint 0 is set in the module Main on the breakable expression which spans the source locations (12,4) to (12,8). Similarly for breakpoints 1 and 2.
You can delete any active breakpoint with the :delete command. Breakpoints are referred to by their unique number, which is displayed by the :show breaks command (see above). You can refer to more than one breakpoint at a time, for example:
:delete 2 12
This will delete the breakpoints numbered 2 and 12. If you specify a breakpoint which does not exist, the debugger will simply ignore it.
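The listing and deletion behaviour described above can be modelled as a table keyed by breakpoint number. The sketch below is illustrative only: the Breakpoint record, the BreakTable type and both functions are hypothetical names, and the debugger's real data structures may look quite different.

```haskell
import qualified Data.Map as Map

type Loc = (Int, Int)

-- | A hypothetical record for one active breakpoint.
data Breakpoint = Breakpoint
  { bpModule :: String
  , bpStart  :: Loc
  , bpEnd    :: Loc
  } deriving (Eq, Show)

-- | Active breakpoints, keyed by their unique number.
type BreakTable = Map.Map Int Breakpoint

-- | One line per breakpoint, in the "N) Module (l,c)-(l,c)" layout.
showBreaks :: BreakTable -> [String]
showBreaks table =
  [ show n ++ ") " ++ bpModule bp ++ " "
      ++ show (bpStart bp) ++ "-" ++ show (bpEnd bp)
  | (n, bp) <- Map.toAscList table
  ]

-- | Delete the given breakpoint numbers.
-- Numbers that do not name a breakpoint are silently ignored.
deleteBreaks :: [Int] -> BreakTable -> BreakTable
deleteBreaks ns table = foldr Map.delete table ns

-- | The three breakpoints from the example listing.
exampleTable :: BreakTable
exampleTable = Map.fromList
  [ (0, Breakpoint "Main" (12, 4) (12, 8))
  , (1, Breakpoint "Foo" (13, 9) (13, 13))
  , (2, Breakpoint "Bar" (14, 4) (14, 47))
  ]
```

Here deleteBreaks [2, 12] exampleTable removes breakpoint 2 and leaves the table otherwise unchanged, because there is no breakpoint 12 to remove. Using Map.delete gives the "ignore unknown numbers" behaviour without any extra checks.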
You can also delete all the active breakpoints by giving the asterisk as an argument to :delete, like so:
:delete *
What happens when the debugger hits a breakpoint?
When an executing computation hits an active breakpoint, control is returned to the GHCi prompt. The debugger will print out a message indicating where the breakpoint occurred, and the names and types of the local variables which are in scope at that point. Here is an example:
Stopped at breakpoint in Main. Location: (6,6)-(6,20).
Locals: x :: Bool, f :: Bool -> Bool, xs :: [Bool], fx :: Bool, j :: Bool
*Main>
The string *Main> is GHCi's prompt marker. Note that it can change depending on what modules you have loaded.
All the normal GHCi commands work at the prompt, including the evaluation of arbitrary expressions. In addition to the normal prompt behaviour, the local variables of the breakpoint are also made available. For instance, in the above example the variable f is a function from booleans to booleans, and we can apply it to an argument in the usual way:
*Main> f False
True
The debugger also provides commands for inspecting the values of local variables without forcing their evaluation any further (see Inspecting values below).
You can continue execution of the current computation with the :continue and :step commands, explained below.
Inspecting values
It is important to note that, due to the non-strict semantics of Haskell (particularly lazy evaluation), the values of local variables at a breakpoint may only be partially evaluated. Therefore printing values may cause them to be further evaluated. This raises some interesting issues for the debugger because evaluating something could raise an exception, or it could cause another breakpoint to be fired, or it could cause non-termination. For these reasons we want to be able to print values in a way which preserves their current state of evaluation. The debugger provides the :print command for this purpose.
For example, suppose the local variable xs is bound to a list of booleans, but the list is completely unevaluated at a breakpoint. We can inspect its value without forcing any more evaluation like so:
*Main> :print xs
xs = (_t1::[Bool])
The debugger uses fresh variable names (starting with underscores) to display unevaluated expressions (often called thunks). Here _t1 is a thunk. A side effect of the :print command is that these fresh variables are made available to the command line, so we can refer to them in future commands.
Sometimes we want to evaluate thunks a little bit further. This is easy to do because they are bound to variable names. For example, we can evaluate the outermost data constructor of _t1 using seq like so:
*Main> seq _t1 ()
()
This forces the evaluation of the thunk bound to _t1 to Weak Head Normal Form (WHNF), and then returns (). The purpose of the expression is to force the evaluation of _t1. We don't actually care about the answer, so () makes a good dummy value.
If we print xs again we can see that it has been evaluated a little bit more:
*Main> :print xs
xs = [True | (_t2::[Bool])]
Here we discover that the value of xs is a list with True as its head and a thunk as its tail. The thunk is bound to the fresh variable _t2, which can be manipulated at the command line as usual.
Another way to force further evaluation of a thunk is to use it inside another expression. For instance, we could examine the spine of the list xs by computing its length:
*Main> length xs
3
*Main> :print xs
xs = [True,(_t3::Bool),(_t4::Bool)]
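The difference between forcing the outermost constructor and forcing the spine can also be seen in ordinary Haskell, without the debugger. In the sketch below the names (headOnly, spineOnly and so on) are made up for illustration, and undefined stands in for parts of a value that must not be evaluated.

```haskell
-- A list whose first cell exists but whose tail must never be evaluated.
headOnly :: [Bool]
headOnly = True : undefined

-- A list whose spine is fully defined but whose elements are not.
spineOnly :: [Bool]
spineOnly = [undefined, undefined, undefined]

-- seq evaluates its first argument to WHNF only, that is, as far as the
-- outermost (:) constructor. The undefined tail is never touched.
forceOutermost :: ()
forceOutermost = seq headOnly ()

-- length walks the spine of the list without evaluating any element.
spineLength :: Int
spineLength = length spineOnly
```

Evaluating forceOutermost yields () and spineLength yields 3, and neither raises an exception. This mirrors the session above, where seq exposed only the first cons cell of xs, and length exposed the whole spine while leaving the remaining elements as thunks.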
Continuing execution after a breakpoint
Known problems in the debugger
Wishlist of features (please add yours here)
- Replace Loc with a proper source span type
- Look at slow behaviour of :print command on long list of chars (I've asked Pepe about this).
- Investigate whether the compiler is eta contracting this def: "bar xs = print xs", this could be a problem if we want to print out "xs".
- Implement show command (to list currently set breakpoints)
- Fix the ghci help command
- Implement the delete command (to delete one or more breakpoints)
- Save/restore the link environment at break points. At a breakpoint we modify both the hsc_env of the current Session, and also the persistent linker state. Both of these are held under IORefs, so we have to be careful about what we do here. The "obvious" option is to save both of these states on the resume stack when we enter a break point and then restore them when we continue execution. I have to check with Simon if there are any difficult issues that need to be resolved here, like gracefully handling exceptions etc.
- Remove dependency on -fhpc flag, put debugging on by default and have a flag to turn it off
- Allow break points to be set by function name. Some questions: what about local functions? What about functions inside type class instances, and default methods of classes?
- Support Unicode in data constructor names inside info tables
- Fix the slow search of the ticktree for larger modules, perhaps by keeping the ticktree in the module info, rather than re-generating it each time.
- Use a primop for inspecting the STACK_AP, rather than a foreign C call
- timing and correctness tests
- Wolfgang's patch for PIC seems to break the strings in Info tables, so we need to fix that.
- stabilise the API
- user documentation
- fix the calculation of free variables at tick sites (currently done too late in the pipeline, gives some wrong results). Note a possible problem with letrecs, which means some local vars are missing in where clauses.
- perhaps there are some redundant ticks we can delete, such as ones which begin at the same start position?
- allow breakpoints to be enabled and disabled without deleting them, as in gdb
- extend breaks and step with counters, so that we stop after N hits, rather than immediately
- revert to adding tick information to the BCO directly, and remove the byte code instructions for breaks