Changes between Version 3 and Version 4 of LateDmd


Timestamp:
Aug 29, 2013 9:06:23 PM (8 months ago)
Author:
nfrisby
Comment:

--

  • LateDmd

    v3 v4  
    1818 
    1919  * Ask the community for help in determining if we should make -O2 imply -flate-dmd-anal. 
     20 
     21== Relation to other tickets == 
     22 
     23There are some tickets documenting runtime bugs that can be cleaned up by running the demand analyzer (followed by a simplifier run) a second time at the end of the pipeline: #4941, #5302, #6087. Possibly #6070 as well. Others? 
    2024 
    2125== Removing the clever .hi files scheme == 
     
    140144||120936||libHSbase-4.7.0.0.a || 
    141145||237088||libHSCabal-1.17.0.a || 
     146 
     147=== Old performance numbers === 
     148 
     149NB These were from April 2013. 
     150 
     151Here are the effects on nofib. Run time didn't seem to change as drastically as allocation. The "X/Y" column headers mean "library-flags/test-flags" given to GHC when compiling the respective bit. 
     152 
     153{{{ 
     154Allocations 
     155 
     156------------------------------------------------------------------------------- 
     157        Program                O2/O2     late-dmd+O2/O2    late-dmd+O2/late-dmd+O2 
     158------------------------------------------------------------------------------- 
     159   cryptarithm2             25078168           +0.0%           +8.0% 
     160       nucleic2             98331744           +0.0%           +3.2% 
     161 
     162       cichelli             80310632           +0.0%          -22.9% 
     163          fasta            401159024           -9.1%           -9.1% 
     164         fulsom            321427240           +0.0%           -2.6% 
     165   k-nucleotide           4125102928           -0.0%           -4.8% 
     166        knights              2037984           +0.0%           -3.7% 
     167        mandel2              1041840           +0.0%          -21.4% 
     168        parstof              3103208           +0.0%           -1.4% 
     169reverse-complem            155188304          -12.8%          -12.8% 
     170         simple            226412800           -0.0%           -1.0% 
     171}}} 
     172All other changes are less than 1% in allocation. 
     173Note that late-dmd improves a couple of tests significantly (fasta, reverse-complem) just via changes in the base libraries. 
     174 
     175For cryptarithm2 (cf. remarks in #4941), the 8% allocation increase breaks down as: 
     176 * 4% is due to reboxing 
     177 * 4% is due to dead closures, because the fix in #4962 isn't working for some reason. 
     178 
     179For nucleic2, in var_most_distant_atom, a let-bound function is inlined after w/w, which makes numerous closures significantly larger. I'm not sure where to lay the blame for this. Note, however, that just making nucleic2's data types use strict !Float fields changes its allocation by -72.4%, so maybe this "bad practice" corner case is a small issue.
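
As an illustration of the strict-field remark above, here is a minimal sketch of the difference between lazy and strict Float fields. The types and functions (PtLazy, PtStrict, distSqLazy, distSqStrict) are hypothetical and are not taken from nucleic2; the sketch also assumes compilation with -O2 on a GHC where small strict fields are unpacked by default, and it makes no claim about the allocation numbers it would produce.

```haskell
module Main (main) where

-- Hypothetical stand-in for a data type with lazy fields:
-- each field is a pointer to a possibly unevaluated (boxed) Float.
data PtLazy = PtLazy Float Float Float

-- The same type with strict fields. With optimisation on, GHC can
-- typically store small strict fields unboxed in the constructor,
-- so building a point need not allocate a box or thunk per field.
data PtStrict = PtStrict !Float !Float !Float

-- Squared distance from the origin, lazy-field version.
distSqLazy :: PtLazy -> Float
distSqLazy (PtLazy x y z) = x * x + y * y + z * z

-- Squared distance from the origin, strict-field version.
distSqStrict :: PtStrict -> Float
distSqStrict (PtStrict x y z) = x * x + y * y + z * z

main :: IO ()
main = do
  let lazyPts   = [PtLazy   (fromIntegral i) 1 2 | i <- [1 .. 1000 :: Int]]
      strictPts = [PtStrict (fromIntegral i) 1 2 | i <- [1 .. 1000 :: Int]]
  -- Both versions compute the same value; only the heap
  -- representation of the points differs.
  print (maximum (map distSqLazy lazyPts))
  print (maximum (map distSqStrict strictPts))
```

Comparing the two variants with +RTS -s would be one way to see whether the field strictness matters for a given program.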