Changes between Version 3 and Version 4 of Ghc/Hooks


Timestamp:
Sep 5, 2013 12:52:36 AM (21 months ago)
Author:
luite
Comment:

--

  • Ghc/Hooks

    v3 v4  
    11= GHC API Hooks = 
    22 
    3 This document describes a proposal for a **hooks** interface to customise certain phases of GHC's source transformation facilities.  A hook is simply a callback that is called at certain points in the program. 
     3This document describes a proposal for a **hooks** interface to customise certain phases of GHC's source transformation facilities.  A hook is a callback, supplied by a user of the GHC API, that overrides the built-in functionality when installed. 
    44 
    55== The Problem == 
    66 
    7 The GHC API can be used for many different purposes, but some of these purposes require deviating behaviour from what GHC would normally do.  For example, if we want to use GHC as a front-end to a Haskell compiler, we may want to replace the code generator with a custom backend (e.g., a byte code format or JavaScript code).  Similarly, some tools want to inject information into the compiler (e.g., during quasi-quoting). 
     7Much of GHC is exposed as a library, making it possible to use the compiler in various new ways, like extracting type information from Haskell code or generating code from one of the intermediate representations. Often, these use cases require some change from the default behaviour at certain points during compilation. For example, someone writing a custom code generator would want to reuse as much of the GHC pipeline as possible, to get dependency chasing and recompilation avoidance, but replace the part where the native code generator would normally be called. 
     8 
     9Unfortunately, the GHC API offers no configuration options for this, and it is hard to customize the compiler manually. For example, once {{{load LoadAllTargets}}} is called, GHC runs its own pipeline; replacing specific functionality would require the user to copy large parts of `GhcMake` and `DriverPipeline`. 
    810 
    911== Proposed Solution == 
     
    1113One option is to split GHC's stages into lots of small function calls and allow the user to wire these stages together as needed.  Unfortunately, this is very difficult to do and wouldn't necessarily be very flexible since there are a number of expected invariants not encoded in the types. 
    1214 
    13 Instead we identify "interesting" places inside the compiler and allow users of the GHC API to specify a call-back that gets invoked when execution reaches that place.  For example, instead of calling {{{typecheckRename}}} function directly, we first look up whether there is a hook specified for and if so call that function instead.  The hook may choose to perform its own renaming and type checking passes (unlikely) or call the GHC's {{{typecheckRename}}} function and inspect its output and generate additional files on disk. 
     15Instead we propose hooks: an extensible way to replace specific parts of the GHC pipeline with user-supplied callbacks. The goal is to add them at strategic points in the library, to cover most use cases without sprinkling hooks carelessly throughout the GHC code. The API should be considered in flux, since it could require some experimentation to find the best hook locations. 
     16 
     17An example that uses all currently implemented hooks, along with a note on who uses them, can be found here: 
     18 
     19[https://gist.github.com/luite/6444273 Hooks demonstration program] 
    1420 
    1521== Example == 
     22{{{ 
     23hooksExample :: [Located String] -> [FilePath] -> IO () 
     24hooksExample args targetFiles = 
     25    defaultErrorHandler defaultFatalMessager defaultFlushOut $ 
     26      runGhc (Just libdir) $ do 
     27        dflags0 <- getSessionDynFlags 
     28        (dflags1, _leftover, _warns) <- parseDynamicFlags dflags0 args 
     29        let ah hk h dfs = dfs { hooks = insertHook hk h (hooks dfs) } 
     30            dflags2 = dflags1 { hooks = insertHook RunQuasiQuoteHook myRunQuasiQuote (hooks dflags1) } 
     31        setSessionDynFlags dflags2 
     32        setTargets =<< mapM (\file -> guessTarget file Nothing) targetFiles 
     33        successFlag <- sourceErrorHandler (load LoadAllTargets) 
     34        when (failed successFlag) (throw $ ExitFailure 1) 
     35 
     36sourceErrorHandler m = handleSourceError (\e -> do 
     37  GHC.printException e 
     38  liftIO $ exitWith (ExitFailure 1)) m 
     39 
     40myRunQuasiQuote :: HsQuasiQuote Name -> RnM (HsQuasiQuote Name) 
     41myRunQuasiQuote q@(HsQuasiQuote name span quoted) = do 
     42  liftIO $ putStrLn ("myRunQuasiQuote: running quasiquoter on\n" ++ show quoted) 
     43  return q -- don't change the quote or quoter for the example 
     44}}} 
     45 
     46`myRunQuasiQuote` is called for every quasiquote. 
     47 
     48[https://gist.github.com/luite/6444273 Demonstration program that uses all hooks] 
     49 
     50== Design == 
     51 
     52Each hook has a potentially different type from all the other hooks. Additionally, we need to be able to communicate hooks to all the locations where they may be invoked. We can achieve this by storing the hooks in the {{{DynFlags}}}. 
     53 
     54Unfortunately, it is impossible to store them directly as a record, since that would lead to huge cyclic imports (the data types used by the hooks would depend on {{{DynFlags}}}, but {{{DynFlags}}} would depend on the modules defining the types in the hooks record). 
     55 
     56Instead we implement {{{Hooks}}} as a heterogeneous map. The public API allows one to insert and look up hooks safely; correctness of the types is guaranteed by the {{{Hook}}} type family. Internally, the {{{TypeRep}}} of {{{a}}} is used as the actual key for {{{Hook a}}}: 
     57 
     58{{{ 
     59newtype Hooks = Hooks (M.Map TypeRep Any) 
     60 
     61type family Hook a :: * 
     62 
     63insertHook :: Typeable a => a -> Hook a -> Hooks -> Hooks 
     64insertHook tag hook (Hooks m) = 
     65  Hooks (M.insert (typeOf tag) (unsafeCoerce hook) m) 
     66 
     67lookupHook :: Typeable a => a -> Hooks -> Maybe (Hook a) 
     68lookupHook tag (Hooks m) = 
     69  fmap unsafeCoerce (M.lookup (typeOf tag) m) 
     70}}} 
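
The following is a minimal, self-contained sketch of the design above that can be compiled outside of GHC's source tree. It assumes a recent GHC (where {{{Typeable}}} instances are generated automatically) and the {{{containers}}} package. {{{emptyHooks}}}, {{{GreetHook}}} and {{{LimitHook}}} are hypothetical names invented for illustration; they are not part of the GHC API.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, TypeRep, typeOf)
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Heterogeneous map: values are stored untyped, keyed by the TypeRep of the tag.
newtype Hooks = Hooks (M.Map TypeRep Any)

-- Hypothetical helper: a map with no hooks installed.
emptyHooks :: Hooks
emptyHooks = Hooks M.empty

-- Maps a key type to the type of the hook stored under it.
type family Hook a

insertHook :: Typeable a => a -> Hook a -> Hooks -> Hooks
insertHook tag hook (Hooks m) =
  Hooks (M.insert (typeOf tag) (unsafeCoerce hook) m)

-- The coercion back is safe only because insertHook is the sole way to
-- store a value, and it stores a 'Hook a' under the TypeRep of 'a'.
lookupHook :: Typeable a => a -> Hooks -> Maybe (Hook a)
lookupHook tag (Hooks m) =
  fmap unsafeCoerce (M.lookup (typeOf tag) m)

-- Two hypothetical keys with different hook types.
data GreetHook = GreetHook
type instance Hook GreetHook = String -> String

data LimitHook = LimitHook
type instance Hook LimitHook = Int

main :: IO ()
main = do
  let hs = insertHook GreetHook (\n -> "hello, " ++ n) emptyHooks
  -- GreetHook is installed, so the custom function runs.
  putStrLn (fromMaybe id (lookupHook GreetHook hs) "ghc")
  -- LimitHook is not installed, so the default is used.
  print (fromMaybe 80 (lookupHook LimitHook hs))
```

Because the only public entry points are {{{insertHook}}} and {{{lookupHook}}}, a value stored under a key can only be read back at the type that the {{{Hook}}} instance for that key prescribes.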
     71 
     72== Installing a hook == 
     73 
     74To use a hook, add it to the {{{DynFlags}}} for your session, using the hook's key: 
     75 
     76{{{ 
     77dflags1 = dflags0 { hooks = insertHook SomeHook myImplementation (hooks dflags0) } 
     78}}} 
     79 
     80Keep in mind that hooks are a low-level API: it is easy to break things. In some cases, inserting one hook may also require inserting another. For example, if you use `TcForeignsHook` to accept extra types for your foreign imports, you will also need `DsForeignsHook` to desugar them; otherwise GHC will not know what to do with them. 
     81 
     82== Making a new hook == 
     83 
     84Every hook requires a key: a data type with a {{{Typeable}}} instance, since we use its {{{TypeRep}}} as a globally unique key for the {{{Hooks}}} map. Additionally, a type family instance of {{{Hook}}} is required to map the key to the actual hook type. You might also want to export the original unhooked function, along with any extra types and functions that users of the hook will need. 
     85 
     86If you're in a monad with a {{{HasDynFlags}}} instance, you can use the {{{getHooked}}} function from {{{DynFlags}}}: 
     87 
     88{{{ 
     89data HscFrontendHook = HscFrontendHook deriving Typeable 
     90type instance Hook HscFrontendHook = ModSummary -> Hsc TcGblEnv 
     91 
     92genericHscFrontend :: ModSummary -> Hsc TcGblEnv 
     93genericHscFrontend mod_summary = 
     94  getHooked HscFrontendHook genericHscFrontend' >>= ($ mod_summary) 
     95 
     96-- original function 
     97genericHscFrontend' :: ModSummary -> Hsc TcGblEnv 
     98genericHscFrontend' mod_summary 
     99    | ExtCoreFile <- ms_hsc_src mod_summary = 
     100        panic "GHC does not currently support reading External Core files" 
     101    | otherwise = do 
     102         hscFileFrontEnd mod_summary 
     103}}} 
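
To make the "look up the hook, otherwise fall back to the original function" pattern concrete, here is a standalone sketch that does not depend on GHC. The {{{Env}}} monad is a hypothetical stand-in for a monad with a {{{HasDynFlags}}} instance, and the {{{getHooked}}} shown here is only an assumed shape inferred from how it is used above, not the definition in {{{DynFlags}}}. {{{RenderHook}}}, {{{render}}}, {{{render'}}} and {{{bracketed}}} are likewise invented for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.Map as M
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, TypeRep, typeOf)
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Same heterogeneous map as in the Design section.
newtype Hooks = Hooks (M.Map TypeRep Any)
type family Hook a

insertHook :: Typeable a => a -> Hook a -> Hooks -> Hooks
insertHook tag hook (Hooks m) =
  Hooks (M.insert (typeOf tag) (unsafeCoerce hook) m)

lookupHook :: Typeable a => a -> Hooks -> Maybe (Hook a)
lookupHook tag (Hooks m) = fmap unsafeCoerce (M.lookup (typeOf tag) m)

-- Hypothetical stand-in for a monad that can reach the session's hooks.
newtype Env a = Env { runEnv :: Hooks -> a }

instance Functor Env where
  fmap f (Env g) = Env (f . g)
instance Applicative Env where
  pure = Env . const
  Env f <*> Env g = Env (\h -> f h (g h))
instance Monad Env where
  Env g >>= k = Env (\h -> runEnv (k (g h)) h)

-- Assumed shape: return the installed hook, or the given default.
getHooked :: Typeable a => a -> Hook a -> Env (Hook a)
getHooked tag def = Env (fromMaybe def . lookupHook tag)

-- A hypothetical hook key and its hook type.
data RenderHook = RenderHook
type instance Hook RenderHook = Int -> Env String

-- Public entry point: dispatches to the hook if one is installed.
render :: Int -> Env String
render n = getHooked RenderHook render' >>= ($ n)

-- Original (unhooked) function, used as the default.
render' :: Int -> Env String
render' n = pure (show n)

-- A custom implementation a user might install.
bracketed :: Int -> Env String
bracketed n = pure ("<" ++ show n ++ ">")

main :: IO ()
main = do
  putStrLn (runEnv (render 7) (Hooks M.empty))                                -- prints "7"
  putStrLn (runEnv (render 7) (insertHook RenderHook bracketed (Hooks M.empty))) -- prints "<7>"
```

Keeping the original function under a primed name, as in the snippet above, lets a custom hook call it and post-process the result instead of reimplementing it.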
     104 
     105Otherwise, use the hooks field from `DynFlags` directly: 
    16106 
    17107{{{ 
    18108 
    19   withGhc libdir $ do 
    20     dflags0 <- getSessionDynFlags 
    21     let dflags1 = dflags0{ hooks = insertHook LocateLibHook myLocateLib 
    22                                  . insertHook LinkDynLibHook myLinkDynLibHook 
    23                                  $ hooks dflags0 } 
    24     setSessionDynFlags dflags1 
     109data HscCompileOneShotHook = 
     110  HscCompileOneShotHook deriving Typeable 
     111type instance Hook HscCompileOneShotHook = 
     112  HscEnv -> FilePath -> ModSummary -> SourceModified -> IO HscStatus 
    25113 
    26 myLocateLib :: DynFlags -> Bool -> [FilePath] -> String -> IO LibrarySpec 
    27 myLocateLib  
     114 hscCompileOneShot :: HscEnv 
     115                   -> FilePath 
     116                   -> ModSummary 
     117                   -> SourceModified 
     118                   -> IO HscStatus 
     119hscCompileOneShot env = 
     120  fromMaybe hscCompileOneShot' 
     121    (lookupHook HscCompileOneShotHook . hooks . hsc_dflags $ env) env 
    28122 
    29 myLinkDynLibHook :: DynFlags -> [FilePath] -> [PackageId] -> IO () 
    30 myLinkDynLibHook dflags paths ids = do  
     123-- original function 
     124hscCompileOneShot' :: HscEnv 
     125                   -> FilePath 
     126                   -> ModSummary 
     127                   -> SourceModified 
     128                   -> IO HscStatus 
     129hscCompileOneShot' hsc_env extCore_filename mod_summary src_changed 
     130   = ... 
    31131}}} 
    32132 
    33 The two functions will be called whenever GHC needs to locate or link a dynamically loaded library. 
     133== List all currently available hooks == 
    34134 
    35 == The Hook datatype == 
    36  
    37 Each hook has a potentially different type from all the other hooks. Additionally, we need to be able to communicate hooks to all the locations where they may be invoked. This is achieved by storing the list of hooks in the {{{DynFlags}}}.  This, however, means that hooks cannot be defined as an ADT, as that would lead to huge cyclic imports (the data types used by the hooks will depend on {{{DynFlags}}}, but the {{{DynFlags}}} will depend on the hook data type.  Instead we {{{Hooks}}} is an untyped key-value store.  The keys are single constructor types and the {{{Hooks}}} map is indexed by their {{{TypeRep}}}.  We recover the hook type via a type family: 
    38  
    39 {{{ 
    40 --- Implementation Sketch ----------------------- 
    41 data Hook = forall a. Hook TypeRep a 
    42  
    43 type Hooks = Map TypeRep Hook 
    44  
    45 type family HookType a :: * 
    46  
    47 insertHook :: forall a. Typeable a => a -> HookType a -> Hooks -> Hooks 
    48 insertHook key value hooks = 
    49   let hook = Hook (typeOf key) value in 
    50   Map.insert (typeOf key) hook hooks 
    51  
    52 lookupHook :: forall a. Typeable a => Hooks -> Maybe (HookType a) 
    53 lookupHook hooks = 
    54   let key = typeOf (undefined :: a) in 
    55   case Map.lookup key of 
    56      Nothing -> Nothing 
    57      Just (Hook _ h) -> Just (unsafeCoerce h :: HookType a)  -- the tricky bit 
    58 }}} 
    59  
    60 {{{ 
    61 -- Defining a hook type: 
    62 data LocateLibHook = LocateLibHook deriving Typeable 
    63 type instance HookType LocateLibHook = DynFlags -> Bool -> [FilePath] -> String -> IO LibrarySpec 
    64 }}} 
    65  
    66 == TODO: List all currently available hooks == 
    67  
     135TODO: add the list here. Until then, see the [https://gist.github.com/luite/6444273 demo program].