wiki:NewGhciDebugger

Version 57 (modified by guest, 7 years ago) (diff)

--

Documentation for the new GHCi debugger

These notes detail the breakpoint debugger which is being incorportated into GHCi. Note that there was/is a previous prototype debugger, and we share some of its code (specifically the term printer) (see: GhciDebugger).


User's Manual

Command summary

Parameters to commands are indicated in angular brackets <...>, optional parameters are followed by a question mark. Ellipses ... indicate where a sequence of parameters is allowed. Everything else is literal text, as you would type it at the prompt.

Setting breakpoints:

   :break <module>? <line>
   :break <module>? <line> <column>
   :break <module>? <identifier>

Listing breakpoints:

   :show breaks   

Deleting breakpoints:

   :delete <break_number> ... <break_number>
   :delete *

Inspecting values:

   :print <expression>

Single stepping:

   :step
   :step <expression>

Continuting execution from a breakpoint:

   :continue

Starting the debugger

The debugger is integrated with GHCi, and it is on by default. The debugger slows program execution down by a factor of approximately XXX times. You can turn it off (to avoid the slowdown) using the -fno-debug command line argument, when you start GHCi.

Setting break points

The general rule of thumb for breakpoints is that you can set a breakpoint on any thing which is not a value (though there are some exceptions). For example, a literal character is a value, but a case expression is not.

We call the places where you can set breakpoints as breakable expressions (even if some of them aren't strictly expressions).

You can set breakpoints on the following things: (XXX) Check this list carefully!

  1. Function applications. We allow breakpoints on partial applications, even though they are technically values. Also, if there is an application with more than one argument, we only allow breaks on the whole expression, not on the sub-applications within: e.g. for the expression map f list, we allow a break on the whole expression, but not on the sub-application of map f.
  2. Case expressions.
  3. Function declarations (all the equations of a function).
  4. Case alternatives.
  5. Do statements.
  6. Guards.
  7. Bodies of functions, pattern bindings, lambdas, guarded equations.

Conversely, you cannot set breakpoints on the following things, except if they occur as the outermost expression in the body of a declaration:

  1. Literals.
  2. Variables.
  3. Do blocks. XXX check this one
  4. List comprehensions. XXX check this one

You can set a breakpoint in three ways:

  1. By line number.
  2. By line and column number.
  3. By function name (not implemented yet).

In each case you can specify in which module you want to set the breakpoint, however, if that is omitted, the debugger will choose a suitable default module for you (XXX give a better explanation of what module is chosen by default).

The syntax for setting breakpoints by line number is:

   :break <module>? <line>

This will activate the breakpoint which corresponds to the leftmost outermost breakable expression which begins and ends on the line indicated by the <line> parameter, if such an expression exists. XXX If no such expression exists then what happens? Currently the debugger will report an error message, but perhaps it is nicer for it to probe a few lines ahead until it finds a breakable expression, or give up after some threshold number of lines?

The syntax for setting breakpoints by line and column is:

   :break <module>? <line> <column>

This will activate the breakpoint which corresponds to the smallest breakable expression which encloses the source location (<line>, <column>), if such an expression exists. If no such expression exists the debugger will report an error message.

The syntax for setting breakpoints by function name is: (XXX not yet implemented)

   :break <module>? <identifier>

This will activate the outermost breakpoint associated with the definition of <identifier>. If <identifier> is defined by multiple equations, the breakpoint will cover them all. This means the computation will stop whenever any of those equations is evaluated. XXX What about local functions? XXX What about functions defined in type classes (default methods) and instance declarations?

Listing the active breakpoints

You can list the set of active breakpoints with the following command:

   :show breaks

Each breakpoint is given a unique number, which can be used to identify the breakpoint should you wish to delete it (see the :delete command). Here is an example list of breakpoints:

   0) Main (12,4)-(12,8)
   1) Foo (13,9)-(13,13)
   2) Bar (14,4)-(14,47)

Breakpoint 0 is set in the module Main on the breakable expression which spans between the source locations (12,4) to (12,8). Similarly for breakpoints 1 and 2.

Deleting breakpoints

You can delete any active breakpoint with the :delete command. Breakpoints are refered to by their unique number which is displayed by the :show breaks command (see above). You can refer to more than one breakpoint at a time, for example:

   :delete <break_number> ... <break_number>

This will delete all the breakpoints which are identified by the numbers <break_number> ... <break_number>. If you specify a breakpoint which does not exist, the debugger will simply ignore it.

You can also delete all the active breakpoints by giving the asterisk as an argument to delete, like so:

   :delete *

What happens when the debugger hits a breakpoint?

When an executing computation hits an active breakpoint, control is returned to the GHCi prompt. The debugger will print out a message indicating where the breakpoint occurred, and the names and types of the local variables which are in scope at that point. Here is an example:

   Stopped at breakpoint in Main. Location: (6,6)-(6,20).
   Locals: x :: Bool, f :: Bool -> Bool, xs :: [Bool], fx :: Bool, j :: Bool
   *Main>

The string "*Main>" is GHCi's prompt marker. Note that it can change depending on what modules you have loaded.

All the normal GHCi commands work at the prompt, including the evaluation of arbitrary expressions. In addition to the normal prompt behaviour, the local variables of the breakpoint are also made available. For instance, in the above example the variable f is a function from booleans to booleans, and we can apply it to an argument in the usual way:

   *Main> f False
   True

The debugger also provides commands for inspecting the values of local variables without forcing their evaluation any further (see Inspecting values below).

You can continue execution of the current computation with the :continue and :step commands, explained below.

Inspecting values

It is important to note that, due to the non-strict semantics of Haskell (particularly lazy evaluation), the values of local variables at a breakpoint may only be partially evaluated. Therefore printing values may cause them to be further evaluated. This raises some interesting issues for the debugger because evaluating something could raise an exception, or it could cause another breakpoint to be fired, or it could cause non-termination. For these reasons we want to be able to print values in a way which preserves their current state of evaluation. The debugger provides the :print command for this purpose.

For example, suppose the local variable xs is bound to a list of booleans, but the list is completely unevaluated at a breakpoint. We can inspect its value without forcing any more evaluation like so:

   *Main> :print xs
   xs = (_t1::[Bool])

The debugger uses fresh variable names (starting with underscores) to display unevaluated expressions (often called thunks). Here _t1 is a thunk. A side effect of the :print command is that these fresh variables are made available to the command line, so we can refer to them future commands.

Sometimes we want to evaluate thunks a little bit further. This is easy to do because they are bound to variable names. For example, we can evaluate the outermost data constructor of _t1 using seq like so:

   *Main> seq _t1 ()
   ()

This forces the evaluation of the thunk bound to _t1 to Weak Head Normal Form (WHNF), and then returns (). The purpose of the expression is to force the evaluation of _t1, we don't actually care about the answer, so () makes a good dummy value.

If we print xs again we can see that it has been evaluated a little bit more:

   *Main> :print xs
   xs = [True | (_t2::[Bool])]

Here we discover that the value of xs is a list with True as its head and a thunk as its tail. The thunk is bound to the fresh variable _t2, which can be manipulated at the command line as usual.

Another way to force further evaluation of a thunk is to use it inside another expression. For instance, we could examine the spine of the list xs by computing its length:

   *Main> length xs
   3
   *Main> :print xs
   xs = [True,(_t3::Bool),(_t4::Bool)]

Single stepping

When a computation has hit a breakpoint it is sometimes useful to continue execution up until the next breakable expression is evaluated, regardless of whether there is a breakpoint set at that location. This functionality is provided by the :step command:

   :step

The :step command accepts an optional argument expression. The expression is evaluated as usual, but the computation will stop at the first breakable expression which is encountered:

   :step <expression>

Continuing execution after a breakpoint

A computation which has stopped at a breakpoint can be resumed with the :continue command:

   :continue

Known problems in the debugger

Orphaned threads

Computations which fork concurrent threads can use breakpoints, but sometimes a thread gets blocked indefinitely. Consider this program:

   main = do
      forkIO foo
      bar

   foo = do something

   bar = do something else

Suppose we have a breakpoint set somewhere inside the computation done by foo, but there are no breakpoints in the computation done by bar. When we run this program in GHCi the following things happen:

  • foo gets forked and foo and bar begin their work
  • bar completes its job and we return to the GHCi prompt (uh oh!)
  • foo eventually hits a breakpoint and attempts to return to the command line, but it can't because we are already there (in the previous step).

Now the foo thread is blocked, so we can't witness the breakpoint.

Wrong variable names in patterns

Consider this program:

   foo (Just _  : xs) = [xs] 
   foo (Nothing : ys) = [ys]  {- set a breakpoint on this line -}

   main = print (foo [Nothing, Just ()])

If we hit a breakpoint in the second equation for foo we expect to see ys displayed as the local variables. Unfortunately, the debugger says that the locals are called xs:

   *Main> :break 2
   Breakpoint activated in Main. Location: (2,22)-(2,25).
   *Main> main
   Stopped at breakpoint in Main. Location: (2,22)-(2,25).
   Locals: xs :: [Maybe ()]

The problem is the way that the compiler turns pattern matches into case expressions. XXX This issue deserves some extra thought.


Wishlist of features (please add your's here)


Todo

Pending

  • Replace Loc with a proper source span type
  • Look at slow behaviour of :print command on long list of chars (I've asked Pepe about this).
  • Investigate whether the compiler is eta contracting this def: "bar xs = print xs", this could be a problem if we want to print out "xs".
  • Implement show command (to list currently set breakpoints)
  • Fix the ghci help command
  • Implement the delete command (to delete one or more breakpoints)
  • Save/restore the link environment at break points. At a breakpoint we modify both the hsc_env of the current Session, and

also the persistent linker state. Both of these are held under IORefs, so we have to be careful about what we do here. The "obvious" option is to save both of these states on the resume stack when we enter a break point and then restore them when we continue execution. I have to check with Simon if there are any difficult issues that need to be resolved here, like gracefully handling exceptions etc.

  • Remove dependency on -fhpc flag, put debugging on by default and have a flag to turn it off
  • Allow break points to be set by function name. Some questions: what about local functions? What about functions inside type class instances, and default methods of classes?
  • Support Unicode in data constructor names inside info tables
  • Fix the slow search of the ticktree for larger modules, perhaps by keeping the ticktree in the module info, rather than re-generating it each time.
  • Use a primop for inspecting the STACK_AP, rather than a foreign C call
  • timing and correctness tests
  • Wolfgang's patch for PIC seems to break the strings in Info tables, so we need to fix that.
  • stabilise the API
  • user documentation
  • fix the calculation of free variables at tick sites (currently done too late in the pipeline, gives some wrong results). Note a possible problem with letrecs, which means some locals vars are missing in where clause.

Partially done

Tentative

  • perhaps there are some redundant ticks we can delete, such as ones which begin at the same start position?
  • allow breakpoints to be enabled and disabled without deleting them, as in gdb
  • extend breaks and step with counters, so that we stop after N hits, rather than immediately
  • revert to adding tick information to the BCO directly, and remove the byte code instructions for breaks

Implementation notes