Changes between Version 4 and Version 5 of GhciDebugger


Timestamp:
Aug 28, 2006 6:18:49 PM (9 years ago)
Author:
mnislaih
Comment:

Version 0.2 (not finished yet)

  • GhciDebugger

    v4 v5  
    33During the Summer of 2006 I have been working on this project, sponsored by the [http://code.google.com/soc Google SoC] initiative. My mentors were Simon Marlow and David Himmelstrup (lemmih). 
    44 
    5 It has been a lot of fun, and I've learnt a huge amount of things, but the reader must be warned that I am still a beginner in many aspects, and that my knowledge of ghc is very shallow. So please, take my words with a bit of perspective. 
    6  
    7 The goals of the project were mainly three: 
    8  * To produce a closure viewer, capable of showing intermediate computations without forcing them, and without depending on types. 
     5It has been a lot of fun, and I've learnt a huge amount of things, but the reader must be warned that I am still a beginner in many aspects, and that my knowledge of ghc is very shallow. So please take my words with a bit of perspective. 
     6 
     7The contributions of the project have been mainly two: 
     8 * A closure viewer, capable of showing intermediate computations without forcing them, and without depending on types (and of course that excludes dependency on Show instances) 
    99 * To put the basic `breakpoint` primitive to use in a system of dynamic breakpoints for ghci. 
    10  * To come up with a mechanism to show call stack traces at breakpoints 
    1110 
    1211= The closure viewer = 
     
    1615}}} 
    1716 
    18 The term datatype is defined at a module `RtClosureInspect` included in the ghci folder. This datatype represents a partially evaluated Haskell value as an annotated tree: 
     17The `Term` datatype is defined in the module `RtClosureInspect`, in the ghci folder. This datatype represents a partially evaluated Haskell value as an annotated tree: 
    1918{{{ 
    2019data Term = Term { ty        :: Type  
     
    4140  (..)being in GHCi, we have all the compiler's information about the code to hand - including full definitions of data types.  So for a given constructor application in the heap, we can print a source-code representation of it 
    4241 
    43 What the closure viewer does is to obtain the address in the heap of a Haskell value, find out the address of its info table, and trace back to the DataCon corresponding to this info table. This is possible because the ghc runtime allocates a static info table for each and every datacon, so all we have to do is extend the linker with a dictionary relating the static info table addresses to a DataCon name. 
    44 Moreover, the ghci linker can dynamically link bytecodes containing additional `data` or `newtype` declarations. So the ghci linking code is extended in the same way. To sum up: 
     42=== DataCon recovery === 
     43The closure viewer obtains the heap address of a Haskell value, finds the address of its associated info table, and traces back to the DataCon corresponding to this info table. This is possible because the ghc runtime allocates a static info table for each and every datacon, so all we have to do is extend the linker with a dictionary relating the static info table addresses to DataCon names. 
     44Moreover, the ghci linker can load interpreted code containing new `data` or `newtype` declarations. So the dynamic linker code is extended in the same way. To sum up: 
    4545 * `linker.c` has a new hashtable for datacons. 
    4646 * `ghci/Linker.hs` has been extended in a similar way. The Persistent Link State datatype now includes a datacons environment. At `linkExpr` and `dynLinkBCOs` the environment is extended with _any_ new datacons witnessed. 
     
    8383    |- ObjLink.lookupDataCon :: Ptr StgInfoTable -> IO (Maybe String) 
    8484}}} 
    85 First we must make sure that we are dealing with a whnf value (i.e. a Constr), as opposed to a thunk, fun, indirection, etc. This information is retrieved from the very own info table (StgInfoTable comes with a Storable instance, defined at ByteCodeItbls). From here ahead I will use constr to refer to a whnf value. 
    86  
    87 Once we have the ability to recover the datacon of a constr and thus its (possibly polymorphic) type , we can construct its tree representation. The payload of a closure is an ordered set of pointers and non pointers (words). For a Constr closure, the non pointers correspond to primitive unboxed values, whereas the pointers are references to other closures. In the tree representation as a term of a constr, the subTerms are obtained from the constr payload. 
     85First we must make sure that we are dealing with a whnf value (i.e. a Constr), as opposed to a thunk, fun, indirection, etc. This information is retrieved from the closure's own info table (StgInfoTable comes with a Storable instance, defined in ByteCodeItbls). From here on I will simply use constr to refer to a Constr closure. 
     86 
     87Once we have the ability to recover the datacon of a constr, and thus its (possibly polymorphic) type, we can construct its tree representation. The payload of a closure is an ordered set of pointers and non-pointers (words). For a Constr closure, the non-pointers are primitive unboxed values and correspond to the leaves of the tree, while the pointers are references to other closures and give the so-called subTerms. 
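
The tree construction can be illustrated with a toy model. This is only a sketch under stated assumptions: `Addr`, `Closure`, `Heap`, `ToyTerm` and `toyObtainTerm` are hypothetical names invented for this example, the heap is simulated with an association list instead of being read from the runtime, and the ordering of subterms before leaves is a choice of the sketch.

{{{
-- Toy model of the heap; none of these names exist in GHC.
type Addr = Int

data Closure
  = Constr String [Addr] [Int]  -- datacon name, pointers, non-pointer words
  | Thunk                       -- unevaluated suspension
  | Ind Addr                    -- indirection to another closure

type Heap = [(Addr, Closure)]

-- Tree view of a partially evaluated value.
data ToyTerm
  = ToyTerm String [ToyTerm]    -- a constr with its children
  | Prim Int                    -- leaf: an unboxed word
  | Suspension                  -- a thunk, left unforced
  deriving (Eq, Show)

-- Walk the closures reachable from an address without forcing anything.
toyObtainTerm :: Heap -> Addr -> ToyTerm
toyObtainTerm heap addr =
  case lookup addr heap of
    Just (Constr dc ptrs nonPtrs) ->
      ToyTerm dc (map (toyObtainTerm heap) ptrs ++ map Prim nonPtrs)
    Just (Ind target) -> toyObtainTerm heap target  -- follow indirections
    Just Thunk        -> Suspension
    Nothing           -> Suspension                 -- unknown address: opaque

-- The list (1 : <thunk>), reached through an indirection.
exampleHeap :: Heap
exampleHeap =
  [ (0, Ind 1)
  , (1, Constr ":" [2, 3] [])
  , (2, Constr "I#" [] [1])
  , (3, Thunk)
  ]
}}}

Evaluating `toyObtainTerm exampleHeap 0` yields a `":"` node whose children are the boxed integer and a `Suspension`. The sketch does not guard against cyclic heaps, which a real traversal would have to consider.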
    8888 
    8989=== Type reconstruction === 
    90 `obtainTerm` recursively traverses all the closures that conform a term. Indirections are followed and suspensions are optionally forced. The only problem here is dealing with types. DataCons can have polymorphic types, so the knowledge of the datacon only is not enough. There are two other sources of type information: 
    91  1. The typechecker, via the `Id` argument to `obtainTerm` 
    92  2. The concrete types of the subterms, if instantiated 
     90`obtainTerm` recursively traverses all the closures that make up a term. Indirections are followed and suspensions are optionally forced. The only problem here is dealing with types. DataCons can have polymorphic types, which we would want to instantiate, so knowledge of the datacon alone is not enough. There are two other sources of type information: 
     91 1. The typechecker, via the `Id` argument to `obtainTerm`. 
     92 2. The concrete types of the subterms, if they are sufficiently evaluated. 
    9393 
    9494The process followed to reconstruct the types of a value as much as possible is: 
     
    9696 1. obtain the subTerms of the value recursively calling `obtainTerm` with the available type info (dataCon plus typechecker), discovering new type info in the process. 
    9797 2. refine the type of the value. This is accomplished with a step of unification between (1) and (2) above, and matching the result with the type of the datacon, obtaining the tyvars, which are used to instantiate. This step obtains the most concrete type.  
    98    * Note that the handling of tyvars is delicate. We want to ensure that the tyvars of every subterm type are independent. 
    99  3. refine the type of the subterms (recursively) with the reconstructed type.  
     98   * Note that tyvars need renaming to avoid collisions. 
     99 3. refine the type of the subterms (inductively) with the reconstructed type.  
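
The refinement in step 2 can be sketched on a toy type language. Everything here is an assumption made for illustration: `Ty`, `Subst`, `match`, `apply` and `refine` are hypothetical names, the real code works on GHC's own `Type`, and a simple one-way matching stands in for the unification step.

{{{
-- Toy type language; the real code works on GHC's Type, not on this.
data Ty
  = TyVar String
  | TyCon String [Ty]
  deriving (Eq, Show)

type Subst = [(String, Ty)]

-- Match a (possibly polymorphic) type against a discovered type,
-- extending the substitution; fails on a mismatch.
match :: Subst -> Ty -> Ty -> Maybe Subst
match s (TyVar v) t =
  case lookup v s of
    Nothing -> Just ((v, t) : s)
    Just t' -> if t' == t then Just s else Nothing
match s (TyCon c xs) (TyCon d ys)
  | c == d && length xs == length ys = matchAll s (zip xs ys)
match _ _ _ = Nothing

matchAll :: Subst -> [(Ty, Ty)] -> Maybe Subst
matchAll s []            = Just s
matchAll s ((p, t) : ps) = match s p t >>= \s' -> matchAll s' ps

-- Apply a substitution; unbound tyvars are left as they are.
apply :: Subst -> Ty -> Ty
apply s (TyVar v)    = maybe (TyVar v) id (lookup v s)
apply s (TyCon c ts) = TyCon c (map (apply s) ts)

-- Refine a datacon's result type from the types found for its fields.
refine :: [Ty] -> Ty -> [Ty] -> Maybe Ty
refine fieldTys resultTy subtermTys
  | length fieldTys == length subtermTys = do
      s <- matchAll [] (zip fieldTys subtermTys)
      return (apply s resultTy)
  | otherwise = Nothing
}}}

For a datacon like `Just`, with field type `a` and result type `Maybe a`, a subterm discovered to be an `Int` refines the result to `Maybe Int`. The sketch assumes the tyvars of the datacon and of the subterms never share names, which is the renaming concern noted above.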
    100100 
    101101 
     
    109109 * It does not remember previous bindings. Two consecutive uses of `:print` will generate two separate bindings for the same thing, generating redundancy and potential confusion. But... 
    110110 * since type reconstruction (for polymorphic/untyped things) can eventually happen whenever the suspensions are forced, it is necessary to use `:print` again to obtain a properly typed binding 
    111    * It is a future work to make ghci do this type reconstruction implicitly on the existing, polymorphic bindings. This would be ''nice'' for the _t,,xx,, things, but even nicer for the local bindings happening at a breakpoint. 
     111   * It is future work to make ghci do this type reconstruction implicitly on the existing, polymorphic bindings. This would be ''nice'' for the _t,,xx,, things, but even nicer for the local bindings in the context of a breakpoint. 
     112 
     113=== Pretty printing of terms === 
     114We want to customize the printing of some stuff, such as Integers, Floats, Doubles, Lists, Tuples, Arrays, and so on. 
     115 In the `RtClosureInspect` module there is some infrastructure to build a custom printer, together with a basic custom printer that covers the types enumerated above. 
     116 
     117In InteractiveUI.hs the function `pprintClosure` takes advantage of this and makes use of a custom printer that uses Show instances if available. 
     118 
     119=== Recovering non-pointers === 
     120This happens at `RtClosureInspect.extractUnboxed` and might potentially break in some architectures. 
     121 
     122= Breakpoints = 
     123 
     124== `breakpoint`  Implementation == 
     125When compiling to bytecodes, breakpoints are desugared to 'fake' jump functions, i.e. functions that are not defined anywhere; later, in the interactive environment, we link them to something:  
     126{{{ 
     127breakpoint => breakpointJump 
     128breakpointCond => breakpointCondJump 
     129breakpointAuto => breakpointAutoJump 
     130}}} 
     131The types would be: 
     132{{{ 
     133breakpointAutoJump, breakpointJump ::  
     134                    Int                         -- Address of a StablePtr containing the Ids 
     135                 -> [()]                        -- Local bindings list 
     136                 -> (String, String, Int)       -- Package, Module and site number 
     137                 -> String                      -- Location message (filename + srcSpan) 
     138                 -> b -> b                  
     139breakpointCondJump :: Int -> [()] -> (String,String,Int) -> String -> Bool -> b -> b 
     140}}} 
     141They get filled with the pointer to the ids in scope, their values, the site, a message, and the wrapped value in the desugarer. Everything served with the right amounts of unsafeCoerce sauce and TyApp dressing to make the generated Core lint. 
     142 
     143The site number is relevant only for 'auto' breakpoints, explained later. For the other two types of breakpoints its value should be 0. 
     144 
     145The desugarer monad has been extended with an OccEnv of Ids to track the bindings in scope. Of course this environment thing is probably too ad-hoc to use it for anything else. The monad also carries a mutable table of breakpoint sites for the current module. This is explained below. 
     146 
     147=== Default HValues for the Jump functions === 
     148The dynamic linker has been modified so that it won't panic if one of the jump functions fails to resolve. 
     149Now, if the dynamic linker fails to find a HValue for a Name, before looking for a static symbol it will ask  
     150{{{ 
     151DsBreakpoint.lookupBogusBreakpointVal :: Name -> Maybe HValue 
     152}}} 
     153which returns a "just return the wrapped thing" if it is one of the Jump names and Nothing otherwise. 
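
A minimal sketch of what such a default could look like, assuming toy stand-ins for the GHC types: `ToyName`, `ToyHValue`, `lookupToyBreakpointVal` and `runPlain` are hypothetical names invented here, and only the shape of the jump types is taken from the signatures listed earlier.

{{{
-- Hypothetical stand-in; the real function works on GHC's Name and HValue.
type ToyName = String

-- The "just return the wrapped thing" default, at the plain jump type.
defaultJump :: Int -> [()] -> (String, String, Int) -> String -> b -> b
defaultJump _ _ _ _ wrapped = wrapped

-- The conditional variant takes an extra Bool, which the default ignores.
defaultCondJump :: Int -> [()] -> (String, String, Int) -> String -> Bool -> b -> b
defaultCondJump _ _ _ _ _ wrapped = wrapped

-- A typed substitute for the untyped HValue of the real linker.
data ToyHValue b
  = PlainJump (Int -> [()] -> (String, String, Int) -> String -> b -> b)
  | CondJump  (Int -> [()] -> (String, String, Int) -> String -> Bool -> b -> b)

-- A default for the jump names, Nothing for everything else.
lookupToyBreakpointVal :: ToyName -> Maybe (ToyHValue b)
lookupToyBreakpointVal name
  | name `elem` ["breakpointJump", "breakpointAutoJump"] = Just (PlainJump defaultJump)
  | name == "breakpointCondJump"                         = Just (CondJump defaultCondJump)
  | otherwise                                            = Nothing

-- Resolve a name and, if it is a plain jump, run it on a wrapped value.
runPlain :: ToyName -> b -> Maybe b
runPlain name x =
  case lookupToyBreakpointVal name of
    Just (PlainJump f) -> Just (f 0 [] ("main", "Main", 0) "Main.hs:1:1" x)
    _                  -> Nothing
}}}

With this default in place a spliced breakpoint behaves like the identity on the wrapped value: `runPlain "breakpointJump" 42` gives back the wrapped `42`, while an unrelated name falls through to the normal symbol lookup (modelled here as `Nothing`).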
     154 
     155This is necessary because a TH function might contain a call to a breakpoint function. If the module it lives in is compiled to bytecodes, the breakpoints will be desugared to 'jumps'. Whenever this code is spliced, the linker will fail to find the jump functions unless there is a default. 
     156 
     157Why didn't I address the problem by forbidding breakpoints inside TH code? I couldn't find an easy solution for this, considering the user is free to put a manual breakpoint wherever. 
     158Why did I introduce the default as a special case in the linker? 
     159I considered other options: 
     160 * Running TH splices in an extended link env. This would probably scatter breakpoint related code deep in the typechecker, and is ugly. 
     161 * Making the 'jump' functions real, by giving them equations and types, maybe in the GHC.Exts module. This solution seemed fine but I wasn't sure of how this would interact with dynamic linking of 'jumps'.  
     162 
     163                                    
     164=== A note about bindings in scope in a breakpoint === 
     165While I was trying to get the generated core for a breakpoint to lint, I made the design decision of not making available the things bound in a recursive group in the breakpoint context. This includes lets, wheres, and mdo notation. The latter case however is not enforced: I haven't found the time to work it out yet. 
    112166 
    113167 
    114168= Dynamic Breakpoints = 
    115 The approach followed here has been the well known 'do the simplest thing that could possibly work'. We instrument the code with 'auto' breakpoints at event ''sites''. Current event sites are only binding introductions (at let, where, and top level). 
    116  
    117 The instrumentation is done at the renamer, because we need to know and have the local bindings at a site in order to create the breakpoint. 
    118 The overhead is ...''TODO'' 
    119  
    120 There are several quirks with the current solution: 
    121  * Introduced breakpoints will show up at compile-time errors, confusing the user 
    122  * it does not contemplate interrupting the execution at unexpected conditions (exceptions) 
    123  
     169The approach followed here has been the well known 'do the simplest thing that could possibly work'. We instrument the code with 'auto' breakpoints at event ''sites''. Currently event sites are code locations where names are bound, and statements: 
     170 * let declarations 
     171 * where declarations  
     172 * top level declarations  
     173 * case alternatives  
     174 * lambda abstractions 
     175 * do statements (any variant of them) 
     176 
     177The instrumentation is done at the desugarer too, which has been extended accordingly. We distinguish between 'auto' breakpoints, those introduced by the desugarer, and 'normal' breakpoints, created by the user by calling the `breakpoint` function directly. 
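
The idea of instrumenting binding sites can be sketched on a toy expression language. This is an illustration only: `Expr`, `AutoBreak` and `instrument` are hypothetical names, only lambda and (non-recursive) let sites are covered, and the real instrumentation operates on GHC's syntax tree inside the desugarer.

{{{
-- Toy expression language; not GHC's syntax tree.
data Expr
  = Var String
  | Lam String Expr
  | App Expr Expr
  | Let String Expr Expr           -- non-recursive let
  | AutoBreak Int [String] Expr    -- site number, names in scope, wrapped expr
  deriving (Eq, Show)

-- Insert an 'auto' breakpoint at every site where a name is bound,
-- numbering the sites in traversal order and tracking the names in scope.
instrument :: Expr -> Expr
instrument e = fst (go [] 0 e)
  where
    go :: [String] -> Int -> Expr -> (Expr, Int)
    go _ n (Var v) = (Var v, n)
    go scope n (App f x) =
      let (f', n1) = go scope n f
          (x', n2) = go scope n1 x
      in (App f' x', n2)
    go scope n (Lam v body) =
      let scope'      = v : scope
          (body', n1) = go scope' (n + 1) body
      in (Lam v (AutoBreak n scope' body'), n1)
    go scope n (Let v rhs body) =
      let (rhs', n1)  = go scope n rhs
          scope'      = v : scope
          (body', n2) = go scope' (n1 + 1) body
      in (Let v rhs' (AutoBreak n1 scope' body'), n2)
    go scope n (AutoBreak k vs inner) =
      let (inner', n1) = go scope n inner
      in (AutoBreak k vs inner', n1)
}}}

For `Lam "x" (Let "y" (Var "x") (Var "y"))` the sketch wraps the lambda body in site 0 with `x` in scope, and the let body in site 1 with `y` and `x` in scope. The collected site numbers play the role of the per-module site table mentioned earlier.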
     178 
     179== Overhead == 
     180The instrumentation scheme potentially introduces overhead at two stages: compile-time and run-time. Compile-time overhead is unnoticeable for general programs, although there are no benchmarks available to support this claim. Run-time overhead is much more noticeable. 
     181Run-time overhead has been measured informally to range between 9x and 25x, depending on the code of the program under consideration.  
     182 
     183With an always-on breakpoints scenario in mind, we do a number of things to mitigate this overhead in absence of enabled breakpoints. One of these is to allow a ghc-api client to disable auto breakpoints via the ghc-api functions: 
     184{{{  
     185enableAutoBreakpoints  :: Session -> IO () 
     186disableAutoBreakpoints :: Session -> IO () 
     187}}} 
     188 
     189GHCi would keep breakpoints disabled until the user defines the first breakpoint, and thus for normal use we could keep the -fdebugging flag enabled always. 
     190The problem is that, to make `disableAutoBreakpoints` (`enableAutoBreakpoints` resp.) effective at all, we need to implement it by relinking the `breakpointAutoJump` function to a new "do nothing" lambda (to the user-set bkptHandler resp.).  
     191This would imply a relink, which is quite annoying to a user of GHCi since any top level bindings are lost. This is why this functionality is only a proof of concept and is disabled for now. I wish I had a better understanding of how the dynamic linker and the top level environment in ghci work. 
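
The intended behaviour of the two functions can be modelled as follows. Note that this is not how the implementation works: the sketch swaps a mutable reference instead of relinking, and `ToySession`, `Handler`, `hitSite` and the `toy`-prefixed functions are hypothetical names that only mimic the ghc-api signatures above.

{{{
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical handler type: receives the location message of the site.
type Handler = String -> IO ()

-- Stand-in for the linked auto jump: a mutable slot holding the handler.
newtype ToySession = ToySession (IORef Handler)

-- The handler that "does nothing".
doNothing :: Handler
doNothing _ = return ()

-- Sessions start with auto breakpoints disabled.
newToySession :: IO ToySession
newToySession = ToySession <$> newIORef doNothing

-- Models relinking the auto jump to the user-set handler.
toyEnableAutoBreakpoints :: ToySession -> Handler -> IO ()
toyEnableAutoBreakpoints (ToySession slot) handler = writeIORef slot handler

-- Models relinking the auto jump to the do-nothing handler.
toyDisableAutoBreakpoints :: ToySession -> IO ()
toyDisableAutoBreakpoints (ToySession slot) = writeIORef slot doNothing

-- What an instrumented site would do when it is reached.
hitSite :: ToySession -> String -> IO ()
hitSite (ToySession slot) loc = readIORef slot >>= \h -> h loc
}}}

In this model a site reached while breakpoints are disabled costs one reference read and a call to `doNothing`, which is the kind of residual overhead the scheme is trying to keep small.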
     192 
     193We also try to do some simple breakpoint coalescing.  
     194 
     195=== Breakpoint coalescing === 
     196''.. implemented, to be documented..'' 
     197 
     198== Modifications in the renamer == 
     199This section is easy. There are NO modifications in the renamer, other than removing Lemmih's original code for the `breakpoint` function. All the stuff that we had originally placed here was moved to the desugarer in the final stage of the project. 
     200 
     201== Modifications to the desugarer == 
     202''summarize the code instrumentation stuff'' 
     203 
     204== Passing the sitelist of a module around == 
     205''summarize the modifications made to thread the site list of a module from the renamer to the ghc-api'' 
     206TcGblEnv is extended with a dictionary of sites and coordinates (TODO: switch the coordinate datatype to the ghc-standard SrcLoc) introduced in the module at the desugarer. 
     207 
     208 
     209== The `Opt_Debugging` flag == 
     210This is activated in command-line via `-fdebugging` and can be disabled with `-fno-debugging`. 
     211This flag simply enables breakpoint instrumentation in the desugarer. 
     212 
     213`-fno-debugging` is different from `-fignore-breakpoints` in that user inserted breakpoints will still work. 
     214 
     215== Interrupting at exceptions == 
     216Ideally, a breakpoint that would witness an exception would stop the execution, no more questions. Sadly, it seems impossible to 'witness' an exception. Throw and catch are essentially primitives (throw#, throwio# and catch#). We could install an exception handler at every breakpoint site, but that: 
     217 * Would add more overhead 
     218 * Would require serious instrumentation to embed everything in IO, and thus 
     219 * Would alter the evaluation order 
     220 
     221So it is not doable via this route. 
     222 
     223We could try and use some tricks. For instance, in every 'throw' we spot, we insert a breakpoint based on the condition on this throw. In every 'assert' we do the same. But this would see only user exceptions, missing system exceptions (pattern match failures for instance), asynchronous exceptions and others. Which is not acceptable imho.  
     224 
     225I don't know if a satisfactory solution is possible with the current scheme for dynamic breakpoints. 
    124226 
    125227== The breakpoints api at ghc-api == 
     
    133235'' to be finished'' 
    134236 
    135 == Modifications in the renamer == 
    136 ''summarize the code instrumentation stuff'' 
    137  
    138 == Passing the sitelist of a module around == 
    139 ''summarize the modifications made to thread the site list of a module from the renamer to the ghc-api'' 
    140  
    141 == The `Opt_Debugging` flag == 
    142 This is activated in command-line via `-fdebugging` and can be disabled with `-fno-debugging`. 
    143 When it is enabled: 
    144  * Breakpoint instrumentation takes place in the renamer. 
    145  * the `:breakpoint` command is available in ghci 
    146  
    147 `-fno-debugging` is different from `-fignore-breakpoints` in that user inserted breakpoints will still work. 
    148  
    149 == Interrupting at exceptions == 
    150 Ideally, a breakpoint that would witness an exception would stop the execution, no more questions. Sadly, it seems impossible to 'witness' an exception. Throw and catch are essentially primitives (throw#, throwio# and catch#), we could install an exception handler at every breakpoint site but that: 
    151  * Would add more overhead  
    152  * Would require serious instrumentation to embed everything in IO, and thus 
    153  * Would alter the evaluation order 
    154  
    155 So it is not doable via this route. 
    156  
    157 We could try and use some tricks. For instance, in every 'throw' we spot, we insert a breakpoint based on the condition on this throw. In every 'assert' we do the same. But this would see only user exceptions, missing system exceptions (pattern match failures for instance), asynchronous exceptions and others. Which is not acceptable imho. 
    158  
    159 For now I am stuck :S 
    160  
    161237 
    162238= Pending work = 
    163 The most important is call stack traces 
    164 ''Put together all the small and big todos here'' 
     239Call stack traces. 
     240Interruption at unexpected conditions (exceptions). 
     241 
     242''Put together all the small todos here''