Changes between Version 18 and Version 19 of LateDmd


Timestamp:
Sep 5, 2013 8:04:37 AM
Author:
simonpj

  • LateDmd

    v18 v19  
    11[[PageOutline]]
    22
     3= Late Demand analysis =
     4
     5
    36Notes about running demand analysis a second time, late in the pipeline.
     7
     8== Status ==
    49
    510Commits
     
    914  * 34728de0f059d8e076981448392203f2501aa120 - I updated the documentation and a source Note for -flate-dmd-anal and -ffun-to-thunk
    1015
    11 == Commit notes ==
    12 
    13 The -flate-dmd-anal flag runs the demand analysis a second time just before !CorePrep. It's not on by default yet, but we hope -O2 will eventually imply it, perhaps even for the GHC 7.8 release.
    14 
    15 The bulk of this patch merely simplifies the treatment of wrappers in interface files.
    16 
    17 == TODO ==
     16The -flate-dmd-anal flag runs the demand analysis a second time just before !CorePrep, with a subsequent run of the Simplifier.  Cf #7782.
     17
     18It's not on by default yet, but we hope -O2 will eventually imply it, perhaps even for the GHC 7.8 release.
     19
     20The bulk of this patch merely simplifies the treatment of wrappers in interface files; see "Removing the clever .hi files scheme" below.
     21
     22=== TODO ===
    1823
    1924  * Ask the performance czars and community for help in determining if we should make -O2 imply -flate-dmd-anal.
     
    2227    * To proceed: perhaps measure mode=slow on the !MacBook Pro. Also build the libraries with ticky on the big server to search for the hypothetical library function that is slowing down typecheck.
    2328
    24 == Relation to other tickets ==
     29=== Relation to other tickets ===
    2530
    2631There are some tickets documenting runtime bugs that can be cleaned up by running the demand analyzer (followed by a simplifier run) a second time at the end of the pipeline: #4941, #5302, #6087. #6070 ? Others?
    2732
     33----------------------------------
    2834== Removing the clever .hi files scheme ==
    2935
     
    126132If demand analysis removes all the value arguments from a function f in A.hs and B.hs uses that function, compilation of B.hs will crash. The problem is that the regeneration of the body of f in B will attempt to apply f to a `realWorld#` argument, because B.hs is compiled without the -ffun-to-thunk flag. However, f no longer accepts any arguments, since A.hs was compiled with -ffun-to-thunk. Boom.
    127133
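
The mismatch described above can be modelled in plain Haskell. This is only a sketch under stated assumptions: the unit value `()` stands in for the `realWorld#` token, and the names `workerFun`, `workerThunk`, `wrapperFun` and `wrapperThunk` are hypothetical hand-written stand-ins, not the code GHC actually generates.

```haskell
module Main where

-- Source function, as it might appear in A.hs: the argument is never
-- used, so demand analysis can discard it.
f :: Int -> Int
f _ = sum [1 .. 100]

-- Model of the worker WITHOUT -ffun-to-thunk: a dummy argument is kept
-- so that the worker is still a function. `()` stands in for realWorld#.
workerFun :: () -> Int
workerFun () = sum [1 .. 100]

-- Matching wrapper: passes the dummy argument to the worker.
wrapperFun :: Int -> Int
wrapperFun _ = workerFun ()

-- Model of the worker WITH -ffun-to-thunk: every value argument is
-- removed, so the worker is a plain thunk.
workerThunk :: Int
workerThunk = sum [1 .. 100]

-- Matching wrapper: refers to the thunk without applying it.
wrapperThunk :: Int -> Int
wrapperThunk _ = workerThunk

-- The crash scenario corresponds to pairing the first wrapper shape
-- with the second worker. In this model it is a type error, so it is
-- left commented out:
-- wrapperMismatch _ = workerThunk ()

main :: IO ()
main = print (f 0, wrapperFun 0, wrapperThunk 0)
```

Both consistent pairings agree with the original function. The inconsistent one is rejected by the type checker here, whereas a wrapper body regenerated in B.hs is, per the note above, applied to an argument the worker no longer accepts.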
    128 == -flate-dmd-anal ==
    129 
    130 -flate-dmd-anal adds a second demand analysis with a subsequent invocation of the simplifier just before !CorePrep. Cf #7782
    131 
    132 === Effect on .hi file size and .a file size ===
     134----------------------------------
     135== Effect on .hi file size and .a file size ==
    133136
    134137The comparison in this section uses ef017944600cf4e153aad686a6a78bfb48dea67a as the base commit. After measuring, I rebased my patch onto 33c880b43ed72d77f6b1d95d5ccefbd376c78c78
     
    191194||237088||libHSCabal-1.17.0.a ||
    192195
    193 === New performance numbers ===
    194 
    195 These numbers in this section come from c080f727ba5f83921b842fcff71e9066adbdc250, building the libraries/nofib tests with various combinations of -fno-late-dmd-anal and -flate-dmd-anal.
     196-----------------------------------------
     197== Performance numbers ==
     198
     199The numbers in this section come from c080f727ba5f83921b842fcff71e9066adbdc250, building the libraries/nofib tests with various combinations of `-fno-late-dmd-anal` and `-flate-dmd-anal`.
    196200
    197201I use these abbreviations in the following tables:
     
    231235}}}
    232236
    233 ==== 2.7Ghz Core i7 !MacBook Pro, 16GB memory, 64-bit ====
    234 
     237=== 64-bit !MacBook Pro ===
     238
     2392.7GHz Core i7 !MacBook Pro, 16GB memory, 64-bit.
    235240One processor with two cores; each core has 25 KB L2 cache, with a (shared) 4MB L3 cache.
    236241
    237 ===== mode=norm !NoFibRuns=30 =====
     242==== mode=norm !NoFibRuns=30 ====
    238243
    239244{{{
     
    303308}}}
    304309
    305 ==== two processors, each 2.40GHz Xeon E5620, 12MB cache, 48GB memory, 64-bit ====
    306 
    307 cf [http://ark.intel.com/products/47925], both processors have four cores (so eight "threads" via Hyper-Threading).
    308 
    309 ===== mode=norm !NoFibRuns=30 =====
     310=== Dual 64-bit Xeon ===
     311
     312Two processors, each 2.40GHz Xeon E5620, 12MB cache, 48GB memory, 64-bit.  Cf. [http://ark.intel.com/products/47925]; each processor has four cores (so eight "threads" via Hyper-Threading).
     313
     314==== mode=norm !NoFibRuns=30 ====
    310315
    311316{{{
     
    385390}}}
    386391
    387 ===== mode=slow !NoFibRuns=30 =====
     392==== mode=slow !NoFibRuns=30 ====
    388393
    389394{{{
     
    465470}}}
    466471
    467 === Old performance numbers ===
     472------------------------------------------
     473== Old performance numbers ==
    468474
    469475NB These were from April 2013.