Changes between Version 9 and Version 10 of Hoopl/Cleanup


Ignore:
Timestamp:
Aug 22, 2013 4:43:27 PM (20 months ago)
Author:
simonpj
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Hoopl/Cleanup

    v9 v10  
    55Me (Jan Stolarek, JS) and Simon PJ recently had some discussion about cleaning up Hoopl to make its interface more consistent and less confusing. Here are some of the key proposals and ideas: 
    66 
    7   * forward and backward analyses are not dual, but current Hoopl interface does not reflect that in any way. Here's a description of how these passes work currently and how we could modify them:[[BR]] 
    8      * all analysis and rewrite passes defined in `compiler/cmm/Hoopl/Dataflow.hs` take as a parameters a list of entry points to the graph and a list of known facts. These passes are called from the wrappers defined in `compiler/cmm/CmmUtils.hs`, which pass a single entry point to the graph as a list of entry points (`JustC [entry]`) and a list of facts given by the user. It is possible for a user to pass an empty list of known facts. 
    9      * we use `getFact` (`compiler/cmm/Hoopl/Dataflow.hs`) function to look up facts about analyzed blocks:[[BR]] 
     7== The API for forward and backward analysis == 
     8 
     9=== Observations about forward analysis === 
     10 
     11  * Forwards analysis starts with a list of entry labels (typically just one), and it makes sense to have an in-flowing fact for each such label.  Yes, it could default to bottom, but it's probably a user error not to supply such a fact. 
     12 
     13  * If forwards analysis ''does'' is given a fact for each entry label, ''Hoopl never needs to know the bottom of the lattice''; indeed there doesn't need to ''be'' a bottom.  Hoopl treats the block in dependency order, so it always has an in-flowing fact before it starts to analyse a block.  It needs still a join for the lattice, of course. 
     14 
     15  * For some analyses, it's quite clumsy to have a bottom element. Consider constant-propagation, where we want to transform 
    1016{{{ 
    11 -- Fact lookup: the fact `orelse` bottom 
    12 getFact  :: DataflowLattice f -> Label -> FactBase f -> f 
    13 getFact lat l fb = case lookupFact l fb of Just  f -> f 
    14                                            Nothing -> fact_bot lat 
     17  x := 3     ===>    x := 3 
     18  ....               ... 
     19  y = x+2            y = 3+2 
    1520}}} 
    16      * in forward analysis, if user passes in facts about all entry points to the graph, then we don't need `fact_bot` field of `DataflowLattice`, because we will always have some fact about analysed block. What a user needs to pass is `[(L,F)]`, where `L` is type of label of an entry point and `F` is a type of fact about that entry point. If user provides a list of facts about every entry label then we don't need a separate list of entry points to the graph, because we can recover that from a list of facts. 
    17      * in backward analysis we still need user to supply a list `[L]` of entry points to the graph. We use them to determine a group of live blocks and then analyze the graph starting from its exit points. For a graph closed on exit we cannot know anything about these exit points, so we don't supply any known facts. We have to supply a bottom element `F`, which will be default initial value for every exit point in backward analysis. Graphs open on exit are more trickier, because if the graph is open on exit then we should know something about that exit point. Moreover, we still need a bottom element, because graph may have more than one exit point. 
    18   * if the above changes were made, then perhaps we should disallow user from passing an empty list of entry labels for analysis? 
     21  (We might then do constant folding and dead code elim, but ignore that for now.)  If we need a bottom in the lattice, our facts look like 
     22{{{ 
     23  data CPFact = Bot | CP (Map LocaReg Const) 
     24}}} 
     25  When a variable is not in the domain of the map it means it maps to top (ie the variable can hold different values on different control-flow paths).  This is all fine, but the join operation needs to deal with `Bot` etc.  And what is frustrating is that `Bot` is never, ever used!  I don't want to define it and manipulate it when it is never used! 
     26 
     27Conclusion: for fwd analysis we don't need a bottom in the lattice, and it's a pain for (some) clients to supply one. 
     28 
     29=== Observations about backward analysis === 
     30 
     31 * Backwards analysis currently takes a list of entry points, so 
     32   that it find the reachable code and enumerate it in reverse 
     33   order.  But that's ''all'' the entry point list does.  It'd be just fine 
     34   to enumerate ''all the blocks in the graph'' in reverse order, and not supply 
     35   a list of entry points. 
     36 
     37 * Backwards analysis (for a closed-on-entry graph) takes a `(Fact x f)` argument, for  
     38   a graph where `x` describes its open/closed on exit status.  So if x=O we pass one fact;  
     39   and that is entirely reasonable becuse it is the fact flowing backwards into the exit. 
     40   But if x=C we pass a `FactBase`.  At first I thought that was stupid, but now I see  
     41   some sense in it: ''these are facts labels outside (downstream successors of) the graph being analysed''. 
     42   We'd better document this point. 
     43 
     44 * NB: returning to the first bullet, we can't just take code 
     45   reachable from downstream successors (ie behave dually to fwd 
     46   anal), because tail calls, returns, and infinite loops don't 
     47   have any such downstream successors, but we jolly well want to 
     48   analyse them. 
     49 
     50 * Backward analysis ''does'' need a bottom for the lattice, to initialise loops. Example: 
     51{{{ 
     52   L1: ...blah blah... 
     53       CondBranch e L1 L2 
     54 
     55   L2: blah blah 
     56}}} 
     57   When analysing L1 (backwards) we must join the facts flowing back from L2 
     58   (which we will have analysed first) and L1; and on the first iteration, we don't  
     59   have any fact from L1. 
     60 
     61   So for backwards analysis the client really must give us a bottom element. 
     62 
     63=== Conclusions === 
     64 
     65We could reflect these observations in the API for forwards and backward analysis, as follows. 
     66 
     67Current situation: 
     68{{{ 
     69data FwdPass m n f 
     70  = FwdPass { fp_lattice  :: DataflowLattice f 
     71            , fp_transfer :: FwdTransfer n f 
     72            , fp_rewrite  :: FwdRewrite m n f } 
     73 
     74analyzeAndRewriteFwd 
     75   :: forall m n f e x entries. (CheckpointMonad m, NonLocal n, LabelsPtr entries) 
     76   => FwdPass m n f 
     77   -> MaybeC e entries 
     78   -> Graph n e x -> Fact e f 
     79   -> m (Graph n e x, FactBase f, MaybeO x f) 
     80 
     81data BwdPass m n f 
     82  = BwdPass { bp_lattice  :: DataflowLattice f 
     83            , bp_transfer :: BwdTransfer n f 
     84            , bp_rewrite  :: BwdRewrite m n f } 
     85 
     86analyzeAndRewriteBwd 
     87   :: (CheckpointMonad m, NonLocal n, LabelsPtr entries) 
     88   => BwdPass m n f 
     89   -> MaybeC e entries -> Graph n e x -> Fact x f 
     90   -> m (Graph n e x, FactBase f, MaybeO e f) 
     91}}} 
     92Possible refactoring: 
     93{{{ 
     94data FwdPass m n f 
     95  = FwdPass { fp_join     :: JoinFun f   -- No "bottom" for fwd 
     96            , fp_transfer :: FwdTransfer n f 
     97            , fp_rewrite  :: FwdRewrite m n f } 
     98 
     99analyzeAndRewriteFwd 
     100   :: forall m n f e x entries. (CheckpointMonad m, NonLocal n) 
     101   => FwdPass m n f 
     102   -> Fact e f           -- Entry points plus a fact for each 
     103   -> Graph n e x  
     104   -> m (Graph n e x, FactBase f, MaybeO x f) 
     105 
     106data BwdPass m n f 
     107  = BwdPass { bp_bot      :: f        -- Need "bottom" for bwd 
     108            , bp_join     :: JoinFun f 
     109            , bp_transfer :: BwdTransfer n f 
     110            , bp_rewrite  :: BwdRewrite m n f } 
     111 
     112analyzeAndRewriteBwd 
     113   :: (CheckpointMonad m, NonLocal n) 
     114   => BwdPass m n f 
     115   -> Fact x f       -- Facts about successors 
     116   -> Graph n e x 
     117   -> m (Graph n e x, FactBase f, MaybeO e f) 
     118}}} 
     119The differences are not great. But the types say more precisely what is 
     120actually necessary and useful. 
     121 
     122 
     123== Smaller proposals ==  
     124 
     125  * (David Luposchainsky) Hoopl is the only library in GHC that defines its own `<*>` operation,  
     126  which will clash with the AMP. Hoopl's `<*>` is conceptually 
     127  just `mappend`, so if you're doing a large-scale refactoring of the 
     128  module maybe consider adding a suitable Monoid instance to replace `<*>` 
     129  with `<>` (or something) before it even becomes a problem. 
     130 
    19131  * Simon doesn't like the `joinInFacts` function, which is only called to possibly produce some debugging output from the join function. 
     132 
    20133  * Jan doesn't like mess in Hoopl repo. There are unused modules (`Compiler.Hoopl.OldDataflow`, `Compiler.Hoopl.DataflowFold`), older versions of some modules (in `prototypes/` directory) or private correspondence with paper reviewers and between authors. 
    21134 
    22135== A personal note by Jan Stolarek 
     136 
     137(I think this may be subsumed by the above.) 
    23138 
    24139On my first contact with Hoopl I was very confused by some of its behaviour. Here's [http://www.haskell.org/pipermail/ghc-devs/2013-July/001757.html a question I mailed to ghc-devs on 13th July 2013]: