Changes between Version 18 and Version 19 of LateDmd


Timestamp:
Sep 5, 2013 8:04:37 AM (7 months ago)
Author:
simonpj
Comment:

--

  • LateDmd

    v18 v19  
    11[[PageOutline]] 
    22 
     3= Late Demand analysis = 
     4 
     5 
    36Notes about running demand analysis a second time, late in the pipeline. 
     7 
     8== Status ==  
    49 
    510Commits 
     
    914  * 34728de0f059d8e076981448392203f2501aa120 - I updated the documentation and a source Note for -flate-dmd-anal and -ffun-to-thunk 
    1015 
    11 == Commit notes == 
    12  
    13 The -flate-dmd-anal flag runs the demand analysis a second time just before !CorePrep. It's not on by default yet, but we hope -O2 will eventually imply it, perhaps even for the GHC 7.8 release. 
    14  
    15 The bulk of this patch merely simplifies the treatment of wrappers in interface files. 
    16  
    17 == TODO == 
     16The -flate-dmd-anal flag runs the demand analysis a second time just before !CorePrep, with a subsequent run of the Simplifier.  Cf #7782. 
     17 
     18It's not on by default yet, but we hope -O2 will eventually imply it, perhaps even for the GHC 7.8 release. 
     19 
     20The bulk of this patch merely simplifies the treatment of wrappers in interface files; see "Removing the clever .hi files scheme" below. 
     21 
     22=== TODO === 
    1823 
    1924  * Ask the performance czars and community for help in determining if we should make -O2 imply -flate-dmd-anal. 
     
    2227    * To proceed: perhaps measure mode=slow on the !MacBook Pro. Also build the libraries with ticky on the big server to search for the hypothetical library function that is slowing down typecheck. 
    2328 
    24 == Relation to other tickets == 
     29=== Relation to other tickets === 
    2530 
    2631There are some tickets documenting runtime bugs that can be cleaned up by running the demand analyzer (followed by a simplifier run) a second time at the end of the pipeline: #4941, #5302, #6087. #6070 ? Others? 
    2732 
     33---------------------------------- 
    2834== Removing the clever .hi files scheme == 
    2935 
     
    126132If demand analysis removes all the value arguments from a function f in A.hs and B.hs uses that function, compilation of B.hs will crash. The problem is that the regeneration of the body of f in B will attempt to apply f to a `realWorld#` argument because there is no -ffun-to-thunk flag. However, f no longer accepts any arguments, since it was compiled with -ffun-to-thunk. Boom. 
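The mismatch described above can be modelled in ordinary Haskell. This is only a sketch under stated assumptions: `Void`, `workerFun`, `workerThunk`, `wrapperAssumingFun` and `wrapperForThunk` are hypothetical names invented for illustration, and `Void` merely stands in for the `realWorld#` argument. It is not the code GHC actually generates.

```haskell
module Main where

-- The original function: both value arguments are unused, so demand
-- analysis would consider them absent.
f :: Int -> Int -> Int
f _ _ = expensive

expensive :: Int
expensive = sum [1 .. 1000]

-- Hypothetical stand-in for the dummy argument (realWorld# in the text).
data Void = Void

-- Shape of the worker WITHOUT -ffun-to-thunk: one dummy argument is
-- kept so that the worker remains a function.
workerFun :: Void -> Int
workerFun Void = expensive

-- Shape of the worker WITH -ffun-to-thunk: no arguments at all,
-- so the worker is a thunk.
workerThunk :: Int
workerThunk = expensive

-- Wrapper that an importing module would regenerate when it assumes
-- the worker was built without -ffun-to-thunk.
wrapperAssumingFun :: Int -> Int -> Int
wrapperAssumingFun _ _ = workerFun Void

-- Wrapper that matches a worker built with -ffun-to-thunk.
wrapperForThunk :: Int -> Int -> Int
wrapperForThunk _ _ = workerThunk

main :: IO ()
main = do
  -- Each wrapper agrees with f as long as it is paired with the
  -- worker shape it was written for.
  print (wrapperAssumingFun 1 2 == f 1 2)
  print (wrapperForThunk 1 2 == f 1 2)
```

The failing combination corresponds to writing `workerThunk Void`: the wrapper supplies an argument to a worker that no longer takes one. In this model the type checker rejects that expression; in the scenario above the same disagreement only surfaces while compiling B.hs.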
    127133 
    128 == -flate-dmd-anal == 
    129  
    130 -flate-dmd-anal adds a second demand analysis with a subsequent invocation of the simplifier just before !CorePrep. Cf #7782 
    131  
    132 === Effect on .hi file size and .a file size === 
     134---------------------------------- 
     135== Effect on .hi file size and .a file size == 
    133136 
    134137The comparison in this section uses ef017944600cf4e153aad686a6a78bfb48dea67a as the base commit. After measuring, I rebased my patch onto 33c880b43ed72d77f6b1d95d5ccefbd376c78c78.
     
    191194||237088||libHSCabal-1.17.0.a || 
    192195 
    193 === New performance numbers === 
    194  
    195 The numbers in this section come from c080f727ba5f83921b842fcff71e9066adbdc250, building the libraries/nofib tests with various combinations of -fno-late-dmd-anal and -flate-dmd-anal. 
     196----------------------------------------- 
     197== Performance numbers == 
     198 
     199The numbers in this section come from c080f727ba5f83921b842fcff71e9066adbdc250, building the libraries/nofib tests with various combinations of `-fno-late-dmd-anal` and `-flate-dmd-anal`. 
    196200 
    197201I use these abbreviations in the following tables 
     
    231235}}} 
    232236 
    233 ==== 2.7GHz Core i7 !MacBook Pro, 16GB memory, 64-bit ====  
    234  
     237=== 64-bit !MacBook Pro ===  
     238 
     2392.7GHz Core i7 !MacBook Pro, 16GB memory, 64-bit. 
    235240One processor with two cores; each core has 25 KB L2 cache, with a (shared) 4MB L3 cache. 
    236241 
    237 ===== mode=norm !NoFibRuns=30 ===== 
     242==== mode=norm !NoFibRuns=30 ==== 
    238243 
    239244{{{ 
     
    303308}}} 
    304309 
    305 ==== two processors, each 2.40GHz Xeon E5620, 12MB cache, 48GB memory, 64-bit ====  
    306  
    307 cf [http://ark.intel.com/products/47925], both processors have four cores (so eight "threads" via Hyper-Threading). 
    308  
    309 ===== mode=norm !NoFibRuns=30 ===== 
     310=== Dual 64-bit Xeon ===  
     311 
     312Two processors, each 2.40GHz Xeon E5620, 12MB cache, 48GB memory, 64-bit. Cf. [http://ark.intel.com/products/47925]; each processor has four cores (so eight "threads" via Hyper-Threading). 
     313 
     314==== mode=norm !NoFibRuns=30 ==== 
    310315 
    311316{{{ 
     
    385390}}} 
    386391 
    387 ===== mode=slow !NoFibRuns=30 ===== 
     392==== mode=slow !NoFibRuns=30 ==== 
    388393 
    389394{{{ 
     
    465470}}} 
    466471 
    467 === Old performance numbers === 
     472------------------------------------------ 
     473== Old performance numbers == 
    468474 
    469475NB These were from April 2013.