wiki:GhciDebugger

Version 4 (modified by mnislaih, 8 years ago) (diff)

--

Work in Progress

Report of the implementation of the GHCi debugger

During the Summer of 2006 I have been working on this project sponsorized by theGoogle SoC initiative. My mentors were Simon Marlow and David Himmelstrup (lemmih).

It has been a lot of fun, and I've learnt a huge amount of things, but the reader must be warned that I am still a beginner in many aspects, and that my knowledge of ghc is very shallow. So please, take my words with a bit of perspective.

The goals of the project were mainly three:

  • To produce a closure viewer, capable of showing intermediate computations without forcing them, and without depending on types.
  • To put the basic breakpoint primitive to use in a system of dynamic breakpoints for ghci.
  • To come up with a mechanism to show call stack traces at breakpoints

The closure viewer

The closure viewer functionality is provided by the following function at the GHC module:

obtainTerm :: Session -> Bool -> Id -> IO (Maybe Term)

The term datatype is defined at a module RtClosureInspect included in the ghci folder. This datatype represents a partially evaluated Haskell value as an annotated tree:

data Term = Term { ty        :: Type 
                 , dc        :: DataCon 
                 , val       :: HValue 
                 , subTerms  :: [Term] }

          | Prim { ty        :: Type
                 , value     :: String }

          | Suspension { ctype    :: ClosureType
                       , mb_ty    :: Maybe Type
                       , val      :: HValue
                       , bound_to :: Maybe Name   -- Does not belong here, but useful for printing
                       }

A few other comments on this module:

  • It is not meant to be included in the stage1 compiler
  • It is imported by GHC, so in order to avoid introducing more cyclical dependencies I've tried to keep all Session related stuff in the GHC module.

Implementation details

Quoting from Simon Marlow in the ghc-cvs list:

(..)being in GHCi, we have all the compiler's information about the code to hand - including full definitions of data types. So for a given constructor application in the heap, we can print a source-code representation of it

What the closure viewer does is to obtain the address in the heap of a Haskell value, find out the address of its info table, and trace back to the DataCon? corresponding to this info table. This is possible because the ghc runtime allocates a static info table for each and every datacon, so all we have to do is extend the linker with a dictionary relating the static info table addresses to a DataCon? name. Moreover, the ghci linker can dynamically link bytecodes containing additional data or newtype declarations. So the ghci linking code is extended in the same way. To sum up:

  • linker.c has a new hashtable for datacons.
  • ghci/Linker.hs has been extended in a similar way. The Persistent Link State datatype now includes a datacons environment. At linkExpr and dynLinkBCOs the environment is extended with _any_ new datacons witnessed.
    • Since this scheme makes no distinction between statically and dynamically loaded info tables a lot of redundancy goes into this environment, maybe it's worth to fix this.

Two new primitive ops have been created which allow to obtain the address of a closure info table and to obtain the closure payload (i.e. if it is a value, the arguments of the datacon).

infoPtr# :: a -> Addr#
closurePayload# :: a -> (# Array# b, ByteArr# #)

The use of these primitives is encapsulated in the RtClosureInspect module, which provides:

getClosureType  :: a -> IO ClosureType
getInfoTablePtr :: a -> Ptr StgInfoTable
getClosureData  :: a -> IO Closure

data Closure = Closure { tipe         :: ClosureType 
                       , infoTable    :: StgInfoTable
                       , ptrs         :: Array Int HValue
                       , nonPtrs      :: ByteArray# 
                       }

data ClosureType = Constr 
                 | Fun 
                 | Thunk Int 
                 | ThunkSelector
                 | Blackhole 
                 | AP 
                 | PAP 
                 | Indirection Int 
                 | Other Int
 deriving (Show, Eq)

The implementation of the datacon recovery stuff is scattered around:

Linker.recoverDataCon :: a -> TcM Name
 |- recoverDCInDynEnv :: a -> IO (Maybe Name)
 |- recoverDCInRTS    :: a -> TcM Name
    |- ObjLink.lookupDataCon :: Ptr StgInfoTable -> IO (Maybe String)

First we must make sure that we are dealing with a whnf value (i.e. a Constr), as opposed to a thunk, fun, indirection, etc. This information is retrieved from the very own info table (StgInfoTable? comes with a Storable instance, defined at ByteCodeItbls?). From here ahead I will use constr to refer to a whnf value.

Once we have the ability to recover the datacon of a constr and thus its (possibly polymorphic) type , we can construct its tree representation. The payload of a closure is an ordered set of pointers and non pointers (words). For a Constr closure, the non pointers correspond to primitive unboxed values, whereas the pointers are references to other closures. In the tree representation as a term of a constr, the subTerms are obtained from the constr payload.

Type reconstruction

obtainTerm recursively traverses all the closures that conform a term. Indirections are followed and suspensions are optionally forced. The only problem here is dealing with types. DataCons? can have polymorphic types, so the knowledge of the datacon only is not enough. There are two other sources of type information:

  1. The typechecker, via the Id argument to obtainTerm
  2. The concrete types of the subterms, if instantiated

The process followed to reconstruct the types of a value as much as possible is:

  1. obtain the subTerms of the value recursively calling obtainTerm with the available type info (dataCon plus typechecker), discovering new type info in the process.
  2. refine the type of the value. This is accomplished with a step of unification between (1) and (2) above, and matching the result with the type of the datacon, obtaining the tyvars, which are used to instantiate. This step obtains the most concrete type.
    • Note that the handling of tyvars is delicate. We want to ensure that the tyvars of every subterm type are independent.
  3. refine the type of the subterms (recursively) with the reconstructed type.

About handling suspensions in the interactive environment

The interactive ui uses GHC.obtainTerm to implement the :print and :sprint command. The difference is that :print, additionally, binds suspended values. Thus, suspensions inside semievaluated terms are bound by :print to _txx names in the interactive environment, available for the user.

This is done at InteractiveUI.pprintClosure. Whenever the suspensions are not sufficiently typed, tyvars are substituted with the type GHC.Base.Unknown, which has an associated Show instance that instructs the user to seq them to recover the type.

There are two quirks with the current solution:

  • It does not remember previous bindings. Two consecutive uses of :print will generate two separate bindings for the same thing, generating redundancy and potential confusion. But...
  • since type reconstruction (for polymorphic/untyped things) can eventually happen whenever the suspensions are forced, it is necessary to use :print again to obtain a properly typed binding
    • It is a future work to make ghci do this type reconstruction implicitly on the existing, polymorphic bindings. This would be nice for the _txx things, but even nicer for the local bindings happening at a breakpoint.

Dynamic Breakpoints

The approach followed here has been the well known 'do the simplest thing that could possibly work'. We instrument the code with 'auto' breakpoints at event sites. Current event sites are only binding introductions (at let, where, and top level).

The instrumentation is done at the renamer, because we need to know and have the local bindings at a site in order to create the breakpoint. The overhead is ...TODO

There are several quirks with the current solution:

  • Introduced breakpoints will show up at compile-time errors, confusing the user
  • it does not contemplate interrupting the execution at unexpected conditions (exceptions)

The breakpoints api at ghc-api

Once an 'auto' breakpoint, that is a breakpoint inserted by the renamer, is hit, an action is taken. There are hooks to customize this behaviour in the ghc-api. The GHC module provides:

data BkptHandler a = BkptHandler {
     handleBreakpoint  :: forall b. Session -> [(Id,HValue)] -> BkptLocation a ->  String -> b -> IO b
   , isAutoBkptEnabled :: Session -> BkptLocation a -> IO Bool
   }

to be finished

Modifications in the renamer

summarize the code instrumentation stuff

Passing the sitelist of a module around

summarize the modifications made to thread the site list of a module from the renamer to the ghc-api

The Opt_Debugging flag

This is activated in command-line via -fdebugging and can be disabled with -fno-debugging. When it is enabled:

  • Breakpoint instrumentation takes place in the renamer.
  • the :breakpoint command is available in ghci

-fno-debugging is different from -fignore-breakpoints in that user inserted breakpoints will still work.

Interrupting at exceptions

Ideally, a breakpoint that would witness an exception would stop the execution, no more questions. Sadly, it seems impossible to 'witness' an exception. Throw and catch are essentially primitives (throw#, throwio# and catch#), we could install an exception handler at every breakpoint site but that:

  • Would add more overhead
  • Would require serious instrumentation to embed everything in IO, and thus
  • Would alter the evaluation order

So it is not doable via this route.

We could try and use some tricks. For instance, in every 'throw' we spot, we insert a breakpoint based on the condition on this throw. In every 'assert' we do the same. But this would see only user exceptions, missing system exceptions (pattern match failures for instance), asynchronous exceptions and others. Which is not acceptable imho.

For now I am stuck :S

Pending work

The most important is call stack traces Put together all the small and big todos here