Changes between Version 3 and Version 4 of LateDmd


Timestamp:
Aug 29, 2013 9:06:23 PM (2 years ago)
Author:
nfrisby
Comment:

--

  • LateDmd

    v3 v4  
    1818
    1919  * Ask the community for help in determining if we should make -O2 imply -flate-dmd-anal.
     20
     21== Relation to other tickets ==
     22
     23Some tickets document runtime bugs that can be cleaned up by running the demand analyzer a second time at the end of the pipeline, followed by a simplifier run: #4941, #5302, #6087. Possibly #6070 as well. Others?
    2024
    2125== Removing the clever .hi files scheme ==
     
    140144||120936||libHSbase-4.7.0.0.a ||
    141145||237088||libHSCabal-1.17.0.a ||
     146
     147=== Old performance numbers ===
     148
     149NB: These numbers are from April 2013.
     150
     151Here are the effects on nofib. Run time didn't seem to change as drastically as allocation. The "X/Y" column headers give the "library-flags/test-flags" passed to GHC when compiling the libraries and the tests, respectively.
     152
     153{{{
     154Allocations
     155
     156-------------------------------------------------------------------------------
     157        Program                O2/O2     late-dmd+O2/O2    late-dmd+O2/late-dmd+O2
     158-------------------------------------------------------------------------------
     159   cryptarithm2             25078168           +0.0%           +8.0%
     160       nucleic2             98331744           +0.0%           +3.2%
     161
     162       cichelli             80310632           +0.0%          -22.9%
     163          fasta            401159024           -9.1%           -9.1%
     164         fulsom            321427240           +0.0%           -2.6%
     165   k-nucleotide           4125102928           -0.0%           -4.8%
     166        knights              2037984           +0.0%           -3.7%
     167        mandel2              1041840           +0.0%          -21.4%
     168        parstof              3103208           +0.0%           -1.4%
     169reverse-complem            155188304          -12.8%          -12.8%
     170         simple            226412800           -0.0%           -1.0%
     171}}}
     172All other changes are less than 1% in allocation.
     173Note that late demand analysis improves a couple of tests significantly (fasta, reverse-complem) purely through changes in the base libraries.
     174
     175For cryptarithm2 (cf. the remarks in #4941), the increase splits roughly in half:
     176 * 4% of the allocation increase is due to reboxing.
     177 * 4% is due to dead closures, because the fix in #4962 isn't working for some reason.
     178
     179For nucleic2, in var_most_distant_atom, a let-bound function is inlined after w/w, and hence numerous closures grow by a significant amount. I'm not sure where to lay the blame for this. Note, however, that just making nucleic2's data types use strict !Float fields changes its allocation by -72.4%, so this "bad practice" corner case may be only a small issue.
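
As a rough illustration of the strict-field change mentioned for nucleic2, here is a minimal sketch. The types `LazyPt` and `StrictPt` and the function `distSq` are hypothetical names made up for this example; they are not taken from nucleic2's source. The only point is the `!` annotation on each `Float` field, which is standard Haskell and forces the field when the constructor is applied, so the fields cannot hold unevaluated thunks.

```haskell
module Main where

-- Hypothetical lazy record: each field may hold an unevaluated thunk.
data LazyPt = LazyPt Float Float Float

-- Hypothetical strict variant: the bang on each field forces it
-- to weak head normal form when the constructor is applied.
data StrictPt = StrictPt !Float !Float !Float

-- Squared distance between two strict points.
distSq :: StrictPt -> StrictPt -> Float
distSq (StrictPt x1 y1 z1) (StrictPt x2 y2 z2) =
  let dx = x1 - x2
      dy = y1 - y2
      dz = z1 - z2
  in  dx * dx + dy * dy + dz * dz

main :: IO ()
main = print (distSq (StrictPt 0 0 0) (StrictPt 1 2 2))  -- prints 9.0
```

Whether a change like this reproduces the -72.4% figure depends on nucleic2's actual data types and on the GHC version and flags used, none of which this sketch models.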