Changes between Version 10 and Version 11 of Design/BuildSystem


Timestamp:
Nov 19, 2008 3:26:22 PM (7 years ago)
Author:
igloo
Comment:

--

  • Design/BuildSystem

    v10 v11  
     1 
    12= Re-structuring the GHC build system = 
    23 
    3 This page describes our current plan (Aug 08) for making GHC's build system more malleable. 
    4 The basic plan is this: 
    5  
    6   * Use our own build system for the work of actually 
    7     building packages from source to .a/.so, including 
    8     preprocessing (hsc2hs et al.). 
    9  
    10   * Use Cabal to preprocess the .cabal file and generate 
    11     metadata in the form of Makefile bindings for our 
    12     build system to use. 
    13  
    14   * Use Cabal to generate the !InstalledPackageInfo and 
    15     do registration. 
    16  
    17   * Use Cabal for Haddocking, installing, and anything else 
    18     we need to do. 
    19  
    20 The advantages of this are: 
    21  
    22   * The critical parts of the build system are under our 
    23     control, and are easily modifiable.     
    24   
    25   * Modifying the build system does not require modifying Cabal. 
    26     We rely on a stable, slowly-varying version of Cabal, not on the 
    27     leading edge.  That takes pressure off the Cabal developers, 
    28     and means that GHC can use a version of Cabal that has  
    29     survived quite a bit of testing. 
    30  
    31   * Development is easier, because 'make' will preprocess files 
    32     too.  Right now if you modify a .y or .hsc file, you need 
    33     to tell Cabal to preprocess again before saying 'make' 
    34     (this is a regression from pre-Cabal). 
    35  
    36   * We can make improvements that would be hard in Cabal, such 
    37     as making libraries depend on each other. 
    38  
    39   * It ought to be easier to reinstate HC bootstrapping, 
    40     since we rely less on Cabal to get us to a .a file. 
    41  
    42   * Compared to the pre-Cabal build system, we're not 
    43     duplicating the package metadata or the code that processes it, 
    44     only the build rules. 
    45  
    46 == Detailed plan == 
    47  
    48  
    49   * Modify `cabal-bin.hs` with a new command to generate the 
    50     Makefile bindings for a package, into a file e.g.  
    51     `ghc-build.mk`.  That is, `cabal-bin.hs` imports the Cabal libraries 
    52     and uses them to support all of Cabal's existing features, plus 
    53     the new one.  
    54  
    55   * There is just one `cabal-bin` executable for 
    56     the whole GHC build tree.   
    57  
    58   * Question: should we rename `cabal-bin.hs` to `ghc-cabal.hs`?  (Simon, Manuel say: yes.) 
    59  
    60   * The version of Cabal used by `cabal-bin.hs` does not need to be an up-to-the-minute 
    61     bleeding-edge version.  It should be stable and vary slowly.  We suck a new 
    62     version of Cabal into the GHC build system manually, rather than mirroring the 
    63     Cabal HEAD. 
    64  
    65   * The makefile-generation stuff (which Duncan dislikes) can be removed from Cabal itself 
    66     by the Cabal crew at their leisure.  In effect, that code now lives in `cabal-bin.hs`. 
    67  
    68   * libraries/Makefile puts a GNUmakefile into each library 
    69     subdir, with identical contents, something like 
    70 {{{ 
    71 TOP=../.. 
    72 -include ghc-build.mk 
    73 include $(TOP)/cabal-package.mk 
    74 }}} 
    75     The boilerplate in `cabal-package.mk` is written by hand (not generated), and 
    76     contains all the make rules required to implement the desired make targets. 
    77  
    78   * In each subdir we support various make targets, e.g. 
    79     * `make configure`, configures the package and uses `cabal-bin` to generate `ghc-build.mk` 
    80     * `make all`, builds the .a, and registers (using `cabal-bin`).  Builds dependencies  
    81       automatically (or perhaps not: calculating dependencies 
    82       in GHC takes a while, and traditionally we've done this on demand only). 
    83     * `make install` uses `cabal-bin`. 
    84  
    85   * libraries/Makefile just invokes `make` in the subdirs in the  
    86     appropriate order. 
    87  
    88 == Improvements for later == 
    89  
    90   * We want dependencies from every object file on the .a files of the 
    91     packages that it depends on.  This way we can make it possible to 
    92     modify a library module and say 'make' and have everything rebuilt  
    93     that needs to be rebuilt (including the stage2 compiler).  Note that 
    94     we need to know about indirect as well as direct package dependencies. 
    95  
    96   * Build multiple libraries in parallel 
    97  
    98   * It should be possible to "make distclean" without configuring first. 
    99     Mostly this just means that we need to, for example, remove both 
    100     `prog` and `prog.exe` when cleaning, as we don't know if we are on 
    101     Windows or not. 
    102  
    103   * The "vanilla" way should actually have a name. Currently, there's no 
    104     nice way to /only/ build something the profiling way, as you can't 
    105     just not put "v" in WAYS. 
    106  
    107   * We could simplify a lot of stuff if all of the inplace installations 
    108     went into TOP/inplace, and that had the same layout as an installed tree. 
    109     The main thing holding us back is that both 
    110     stage1 and stage2 ghc currently want a binary called "ghc". 
     4This page describes our current plan for making GHC's build system more 
     5malleable.  
     6 
     7== Design goals == 
     8 
     9 * The build system should be easy to use not only for ''users'' (does 
     10   the right thing with the minimum of intervention) but also for 
     11   ''developers''.  This means: 
     12    * when something changes, it is easy to bring the build up to date 
     13      quickly to test the change, with the option of just updating the 
     14      part of the system being changed (library module, GHC module, 
     15      tool, etc.)  In all cases a simple 'make' should bring the part 
     16      of the system in the current directory up to date, or complain 
     17      if that can't be done for some reason. 
     18    * easy to build individual modules, and add extra flags 
     19      (e.g. -v, -ddump-simpl) 
     20    * easy to modify the build system.  If you modify some part of the 
     21      build system, it should take immediate effect: no having to 
     22      remove bits and rebuild them manually to get changes to take 
     23      effect. 
     24    * as little "state" in the build as possible: e.g. avoid stamp 
     25      files unless they are just join points for dependencies.  Stamp 
     26      files should be invisible to the developer if they are needed at 
     27      all. 
     28    * The build system should be tractable: if it doesn't do what you 
     29      expect, there should be a clear path to understanding why and 
     30      how to fix it.  Extra make rules are sometimes good for this: 
     31      e.g. we currently have 'make show VALUE=VAR' which is a godsend. 
     32 
     33 * The build should be as parallel as possible 
     34 
     35 * The build system should support bootstrapping from HC files, 
     36   something that hasn't worked since 6.6.1. 
     37 
     38 * The build should not emit any warnings unless something is actually 
     39   wrong (spurious warnings worry users). 
     40 
     41 * The build system should be well documented on the wiki.  We should 
     42   pay special attention to the documentation for building on Windows: 
     43   it's currently far too verbose, complex, and out-of-date.  Ideally 
     44   there should be a short section on how to prepare a Windows system 
     45   for building GHC (in the section on pre-requisites), and perhaps 
     46   short Windows-specific notes throughout the rest of the docs. 
     47   Putting it all in one place risks duplication and things getting 
     48   outdated. 
     49 
     50 * The build system should clearly report what it's doing (and 
     51   sometimes why), without being too verbose.  It should emit actual 
     52   command lines as much as possible, so that they can be inspected 
     53   and cut & pasted. 
     54 
     55 * We should express as many dependencies as possible in the build 
     56   system, but occasionally removing a few edges is prudent.  For 
     57   example, if we have rebuilt some libraries, we might not want that 
     58   to immediately invalidate the whole stage 2 GHC build, since the 
     59   binary still works (as long as we're using static libraries...). 
     60   However, if we need to rebuild ''any'' module in stage 2 and a 
     61   library has changed, it's probably a good idea to rebuild them all 
     62   at that point, because otherwise the resulting binary will likely 
     63   be broken.  GHC will probably say "compilation IS NOT required" for 
     64   most modules anyway. 
     65 
     66== Basic plan == 
     67 
     68 * Use our own build system for the work of actually building packages 
     69   from source to .a/.so, including preprocessing (hsc2hs et al.).  
     70 
     71 * Use Cabal to preprocess the .cabal file and generate metadata in the 
     72   form of Makefile bindings for our build system to use.  
     73 
     74 * Use Cabal to generate the !InstalledPackageInfo. 
     75 
     76 * We do registration and installation using Makefile rules. 
     77 
     78 * Use Cabal for Haddocking, and anything else we need to do.  
     79 
     80 The advantages of this are: 
     81 
     82 * The critical parts of the build system are under our control, and are 
     83   easily modifiable.  
     84 
     85 * Modifying the build system does not require modifying Cabal. We rely 
     86   on a stable, slowly-varying version of Cabal, not on the leading edge. 
     87   That takes pressure off the Cabal developers, and means that GHC can use 
     88   a version of Cabal that has survived quite a bit of testing.  
     89 
     90 * Development is easier, because 'make' will preprocess files too. Right 
     91   now if you modify a .y or .hsc file, you need to tell Cabal to 
     92   preprocess again before saying 'make' (this is a regression from 
     93   pre-Cabal).  
     94 
     95 * We can make improvements that would be hard in Cabal, such as making 
     96   libraries depend on each other.  
     97 
     98 * It ought to be easier to reinstate HC bootstrapping, since we rely 
     99   less on Cabal to get us to a .a file.  
     100 
     101 * Compared to the pre-Cabal build system, we're not duplicating the 
     102   package metadata or the code that processes it, only the build rules.  
     103 
     104== Detailed plan ==  
     105 
     106 * Rename cabal-bin.hs to ghc-cabal.hs, and move it into utils/ghc-cabal/ 
     107 
     108 * The version of Cabal used by ghc-cabal.hs does not need to be an 
     109   up-to-the-minute bleeding-edge version. It should be stable and vary 
     110   slowly. We suck a new version of Cabal into the GHC build system 
     111   manually, rather than mirroring the Cabal HEAD.  
     112 
     113 * Rather than installing things in-place all over the build tree, we 
     114   will have a single inplace directory at the root of the tree. The 
     115   structure inside this directory will match that of the normal install, 
     116   which will simplify various things. There are two slight wrinkles: 
     117   * The tree will not be complete; for example, the libraries will be 
     118     registered in-place in their dist directories 
     119   * Rather than inplace/bin/ghc, we will have inplace/bin/ghc-stage[123] 
     120   Tools like genprimopcode, genapply etc. will probably also go into 
     121   inplace/bin, in order to make the makefiles more consistent. 
     122 
     123 * The build order looks something like: 
     124     * With bootstrapping compiler: 
     125       * Build libraries/{filepath,Cabal} 
     126       * Build utils/ghc-cabal 
     127     * With bootstrapping compiler and ghc-cabal: 
     128       * Build utils/hsc2hs 
     129       * Build libraries/hpc 
     130       * Build compiler (stage 1) 
     131     * With stage 1: 
     132       * Build libraries/* 
     133       * Build utils/* (except haddock) 
     134       * Build compiler (stage 2) 
     135     * With stage 2: 
     136       * Build utils/haddock 
     137       * Build compiler (stage 3) 
     138     * With haddock: 
     139       * libraries/* 
     140       * compiler 
     141   Currently, with recursive make, this means we jump around between 
     142   Makefiles a lot, which isn't good for parallelism in the build. 
     143   Instead, we want to move all the logic and dependencies into the root 
     144   Makefile (or files that get included into it) so that make sees all of 
     145   it together. 
     146 
     147   One concern is that make may take a long time thinking if it can see 
     148   the rules for the whole system, even when only asked to build a single 
     149   file. We will have to see how well it performs in practice. 
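
The staged order above is in effect a topological ordering of a dependency graph, which is what make can compute for itself once it sees all the rules together. The following is a minimal sketch in Python, not part of the GHC build system: the target names come from the list above, while the edges in `DEPS` and the function `build_waves` are simplified assumptions for illustration. It groups targets into "waves" whose members could be built in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified edges: each key depends on the targets in its set.
DEPS = {
    "utils/ghc-cabal": {"libraries/filepath", "libraries/Cabal"},
    "utils/hsc2hs": {"utils/ghc-cabal"},
    "libraries/hpc": {"utils/ghc-cabal"},
    "compiler (stage 1)": {"utils/hsc2hs", "libraries/hpc"},
    "libraries/*": {"compiler (stage 1)"},
    "compiler (stage 2)": {"libraries/*"},
    "utils/haddock": {"compiler (stage 2)"},
}


def build_waves(deps):
    """Group targets into waves; all targets in one wave can build in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose deps are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(build_waves(DEPS), start=1):
        print(number, ", ".join(wave))
```

With recursive make, each wave is hidden inside a separate make invocation; with a single rule set, make can overlap work within a wave and across independent branches.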
     150 
     151 * But we still want "make" to work in subdirectories, so for example the 
     152   Makefile (actually GNUmakefile, to avoid colliding with Makefile in 
     153   libraries like Cabal) in libraries/base might look like 
     154 
     155{{{ 
     156.NOTPARALLEL: 
     157 
     158.PHONY: default 
     159default: dist/build/libbase.a 
     160    @: 
     161 
     162# Note that this rule also generates ghc.mk if it doesn't exist 
     163%: 
     164    $(MAKE) -C ../.. libraries/base/$@ 
     165}}} 
     166 
     167   (ghc.mk is discussed later). In actual fact, GNUmakefile will want to 
     168   be more complicated, to handle "make way=v", "make way=p", "make doc", 
     169   etc. Where possible, the make code will be "include"d in, rather than 
     170   generated, so as to make it easier to deal with. 
     171 
     172   We need the .NOTPARALLEL because otherwise, if you say "make foo bar", 
     173   the two recursive make calls might both make "quux" (a dependency of 
     174   both foo and bar) at the same time. The main Makefile will be able to do work in 
     175   parallel when building each of foo and bar, though. The common case, 
     176   where you only specify 0 or 1 targets, doesn't lose any parallelism. 
     177 
     178 * In e.g. utils/ghc-pkg, the default target will be 
     179{{{ 
     180default: 
     181    $(MAKE) -C ../.. inplace/bin/ghc-pkg 
     182}}} 
     183   and inplace/bin/ghc-pkg will in turn depend on something like 
     184   utils/ghc-pkg/dist-inplace/build/ghc-pkg. It would be possible to 
     185   put the binary in inplace/bin directly, but at the cost of diverging 
     186   from Cabal's filesystem layout. Also, it would be a little odd to 
     187   install from inplace/. On *nix machines we can make the inplace files 
     188   symlinks. 
     189 
     190 * It should be possible to "make distclean" without configuring first. 
     191   Distcleaning should be much simpler now: just remove 
     192   * all the generated makefiles 
     193   * all the dist directories 
     194   * the inplace directory 
     195   * files generated by configure 
     197   * and doubtless a few other bits and pieces 
     198 
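
The point of the list above is that every item can be found by name alone, without knowing anything that configure would have told us. A minimal sketch of that idea in Python, under stated assumptions: the file names in `GENERATED_FILES`, the `dist-` prefix rule and the function `distclean` are hypothetical, and the sketch does not attempt to cover files generated by configure.

```python
import os
import shutil
import tempfile

# Assumed names of generated per-directory makefiles (illustrative only).
GENERATED_FILES = ("ghc.mk", "GNUmakefile")


def distclean(root):
    """Remove inplace/, all dist directories and generated makefiles under root.

    Nothing here reads configuration, so it works on an unconfigured tree.
    Returns the sorted list of paths that were removed.
    """
    removed = []
    inplace = os.path.join(root, "inplace")
    if os.path.isdir(inplace):
        shutil.rmtree(inplace)
        removed.append(inplace)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            if name == "dist" or name.startswith("dist-"):
                target = os.path.join(dirpath, name)
                shutil.rmtree(target)
                dirnames.remove(name)  # do not descend into a deleted directory
                removed.append(target)
        for name in filenames:
            if name in GENERATED_FILES:
                target = os.path.join(dirpath, name)
                os.remove(target)
                removed.append(target)
    return sorted(removed)


if __name__ == "__main__":
    # Demonstrate on a throwaway tree rather than a real checkout.
    demo = tempfile.mkdtemp()
    os.makedirs(os.path.join(demo, "libraries", "foo", "dist", "build"))
    print(distclean(demo))
    shutil.rmtree(demo)
```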
     199 * There are a number of tools, for example ghc and ghc-pkg, which we 
     200   build with both the bootstrapping compiler (for use during the build) 
     201   and the in-tree compiler (to be installed). When you run "make" in the 
     202   tool's directory, only the earliest version will be built by default. 
     203 
     204 * The rules (in ghc.mk) for actually building the foo library (which 
     205   depends on the bar library) will look something like: 
     206 
     207{{{ 
     208all: libraries/foo/dist/build/foo.a 
     209all: libraries/foo/dist/doc/foo.haddock 
     210 
     211LIBRARY_foo_HS_FILES = libraries/foo/dist/build/Bar.hs \ 
     212                       libraries/foo/dist/build/Quux.lhs 
     213LIBRARY_foo_v_O_FILES = libraries/foo/dist/build/Bar.o \ 
     214                        libraries/foo/dist/build/Quux.o 
     215LIBRARY_foo_v_HI_FILES = libraries/foo/dist/build/Bar.hi \ 
     216                         libraries/foo/dist/build/Quux.hi 
     217LIBRARY_foo_v_A_FILE = libraries/foo/dist/build/foo.a 
     218 
     219libraries/foo/dist/build/Quux.o libraries/foo/dist/build/Quux.hi: \ 
     220    libraries/foo/dist/build/Bar.hi 
     221 
     222libraries/foo/dist/build/%.hs: libraries/foo/%.hs 
     223    cp $< $@ 
     224# and a duplicate rule for .lhs. We can use ln instead of cp on 
     225# *nix. 
     226 
     227libraries/foo/dist/build/%.hs: libraries/foo/%.y inplace/bin/happy 
     228    inplace/bin/happy ... 
     229# and alex, hsc2hs, etc 
     230 
     231libraries/foo/dist/build/%.o: libraries/foo/dist/build/%.hs \ 
     232                              $(LIBRARY_bar_v_HI_FILES) \ 
     233                              inplace/bin/ghc-stage1 
     234    inplace/bin/ghc-stage1 -c $< -o $@ # -hidir etc 
     235# and a duplicate rule for .lhs. This rule can probably be 
     236# generalised to handle building of all libraries .o files, 
     237# and perhaps even all .o files. 
     238 
     239.DELETE_ON_ERROR: $(LIBRARY_foo_v_A_FILE) 
     240 
     241$(LIBRARY_foo_v_A_FILE): inplace/bin/ghc-pkg \ 
     242                         libraries/foo/dist/inplace-config \ 
     243                         $(LIBRARY_foo_v_O_FILES) \ 
     244                         $(LIBRARY_bar_v_A_FILE) 
     245    ghc $(LIBRARY_foo_v_O_FILES) -package bar -o $@ 
     246    inplace/bin/ghc-cabal --in-dir libraries/foo register 
     247 
     248libraries/foo/dist/doc/foo.haddock: inplace/bin/haddock 
     249    inplace/bin/ghc-cabal --in-dir libraries/foo haddock 
     250}}} 
     251 
     252   and the rules for building tools and GHC will look similar. 
     253   There will also need to be rules for profiling libraries etc. 
     254   For the tools, there will be copies of the rules for the "in-place" 
     255   and "install" dist directories (or perhaps we have ghc-inplace.mk 
     256   and ghc-install.mk). 
     257 
     258 * The above makefile gets generated by something like 
     259 
     260{{{ 
     261include libraries/foo/ghc.mk 
     262 
     263libraries/%/ghc.mk: inplace/bin/ghc-cabal 
     264    inplace/bin/ghc-cabal --in-dir libraries/$* generate $@ 
     265}}} 
     266 
     267   in the top-level Makefile. 
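
To make the shape of the generated file concrete, here is a minimal sketch in Python of the variable bindings such a generate step could emit for one way. This is not how ghc-cabal is implemented (it would be Haskell code using the Cabal library); the function `make_bindings` and its arguments are hypothetical, and only the variable naming scheme is taken from the example above.

```python
def make_bindings(package, modules, way="v", root="libraries"):
    """Render Makefile variable bindings for one package built one way.

    `modules` maps a module name to its source extension ("hs" or "lhs").
    Hierarchical names such as "Data.Foo" become paths like "Data/Foo".
    """
    build = f"{root}/{package}/dist/build"
    names = sorted(modules)

    def path(module, extension):
        return f"{build}/{module.replace('.', '/')}.{extension}"

    def binding(variable, paths):
        # Backslash-newline continuation keeps long lists readable in make.
        return f"{variable} = " + " \\\n    ".join(paths)

    lines = [
        binding(f"LIBRARY_{package}_HS_FILES",
                [path(m, modules[m]) for m in names]),
        binding(f"LIBRARY_{package}_{way}_O_FILES",
                [path(m, "o") for m in names]),
        binding(f"LIBRARY_{package}_{way}_HI_FILES",
                [path(m, "hi") for m in names]),
        binding(f"LIBRARY_{package}_{way}_A_FILE",
                [f"{build}/{package}.a"]),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_bindings("foo", {"Bar": "hs", "Quux": "lhs"}), end="")
```

Calling it again with `way="p"` would produce the corresponding `LIBRARY_foo_p_...` bindings, which is one way the profiling rules mentioned above could reuse the same templates.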
     268 
     269 * We don't want to require that the libraries ship part of the GHC build 
     270   system in their tarballs, so instead we will generate the GNUmakefiles 
     271   during ./configure. 
     272 
     273 * The "Setup makefile" command can be removed from Cabal. 
     274 
     275 * ghc-pkg will be extended to be able to register libraries compiled 
     276   different ways separately. As well as making the dependencies workable 
     277   in the makefiles, this will also allow cabal-install to work better, 
     278   at the risk of allowing library variants to desync more easily. 
     279 
     280 * Does ghc-pkg currently cope with being called twice simultaneously? 
     281   We should move to using a directory of package.conf files too.
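
One plausible benefit of a directory of per-package files is that simultaneous registrations no longer read, modify and rewrite one shared file: each invocation writes only its own file, and can do so atomically via a rename. A minimal sketch in Python, not a description of ghc-pkg's actual behaviour; the `.conf` suffix, the directory layout and the function names are assumptions for illustration.

```python
import os
import tempfile


def register(db_dir, package_id, info_text):
    """Atomically write one package's description into the database directory."""
    os.makedirs(db_dir, exist_ok=True)
    final_path = os.path.join(db_dir, package_id + ".conf")
    # Write to a temporary file in the same directory, then rename it over
    # the destination, so readers never observe a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(info_text)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def registered_packages(db_dir):
    """List registered package ids, ignoring in-progress temporary files."""
    return sorted(name[:-len(".conf")]
                  for name in os.listdir(db_dir)
                  if name.endswith(".conf"))


if __name__ == "__main__":
    db = tempfile.mkdtemp()
    register(db, "foo-1.0", "name: foo\n")
    register(db, "bar-2.0", "name: bar\n")
    print(registered_packages(db))
```

Two registrations of different packages touch different files, so neither can lose the other's update; re-registering the same package simply replaces its file.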