Opened 3 years ago
Closed 3 years ago
#9564 closed bug (invalid)
Floating point subnormals overrounded on output
Reported by: | jrp | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Runtime System | Version: | 7.8.3 |
Keywords: | Cc: | simonmar | |
Operating System: | MacOS X | Architecture: | x86_64 (amd64) |
Type of failure: | Incorrect result at runtime | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
The following test suggests that GHC is overrounding Float subnormals, when printing them out
Prelude> 2**(-149) 1.401298464324817e-45 Prelude> 2**(-149) :: Float 1.0e-45 Prelude> isDenormalized it True Prelude> Prelude> 2**(-149) * 2**(100) * 2**(49) :: Float 1.0
By comparison
#include <stdio.h> #include <stdlib.h> int main (void) { float clang_dec, clang_hex, libc_dec, libc_hex; clang_dec = 1.401298464324817e-45; clang_hex = 0x0.000002P-126; printf("clang (from decimal) = %a\n",clang_dec); printf("clang (from decimal) = %g\n",clang_dec); printf("clang (from hex) = %a\n",clang_hex); printf("clang (from hex) = %g\n",clang_hex); libc_dec = strtod("1.401298464324817e-45",NULL); libc_hex = strtod("0x0.000002P-126",NULL); printf("libc (from decimal) = %a\n",libc_dec); printf("libc (from decimal) = %g\n",libc_dec); printf("libc (from hex) = %a\n",libc_hex); printf("libc (from hex) = %g\n",libc_hex); }
produces
clang (from decimal) = 0x1p-149 clang (from decimal) = 1.4013e-45 clang (from hex) = 0x1p-149 clang (from hex) = 1.4013e-45 libc (from decimal) = 0x1p-149 libc (from decimal) = 1.4013e-45 libc (from hex) = 0x1p-149 libc (from hex) = 1.4013e-45
Change History (8)
comment:1 Changed 3 years ago by
comment:2 Changed 3 years ago by
Overrounding in the sense that I would expect something like 1.4013e-45 rather than 1.0e-45.
I have not delved into the specs for Show Float / Double.
comment:3 Changed 3 years ago by
the standard defines the read/show for Float/Double as
instance Show Float where showsPrec p = showFloat instance Read Float where readsPrec p = readSigned readFloat instance Show Double where showsPrec p = showFloat instance Read Double where readsPrec p = readSigned readFloat
at the bottom of https://www.haskell.org/onlinereport/haskell2010/haskellch9.html#x16-1710009
then per https://www.haskell.org/onlinereport/haskell2010/haskellch38.html#x46-31400038
showFloat :: RealFloat a => a -> ShowS Show a signed RealFloat value to full precision using standard decimal notation for arguments whose absolute value lies between 0.1 and 9,999,999, and scientific notation otherwise.
then looking at the stuff in 4.7.0.1 base http://hackage.haskell.org/package/base-4.7.0.1/docs/src/GHC-Float.html#showFloat
-- | Show a signed 'RealFloat' value to full precision -- using standard decimal notation for arguments whose absolute value lies -- between @0.1@ and @9,999,999@, and scientific notation otherwise. showFloat :: (RealFloat a) => a -> ShowS showFloat x = showString (formatRealFloat FFGeneric Nothing x)
I'll try to dig into this a teeny bit more, but I think thank as long as
(read . show) :: Float -> Float
acts as the identity function on all floating point values when we roundtrip them. (and as long as they get correctly parsed to that same internal value when read/showed between haskell and another language)
phrased differently, if we dont have roughly that
read_hs . show_clang == read_clang . show_hs == read_clang . show_clang == read_hs . show_hs == id
then yes we have a problem, but we dont quite show that problem with theses tests as above, right? We just demonstrate that the particular choice in default representation for show differs from C, right? Its important to remember that parsing a floating point number itself will do rounding to the nearest floating point value too.
i'm trying to focus on some work work this week, but if someone could test that these roundtripining identities work out ok, that'd be awesome (I may try to do it soon myself, but who knows how long that will take to happen)
comment:4 Changed 3 years ago by
Are we not at risk of overcomplicating this?
Raskell on my iPad (hugs, I think) shows
2**(-149) 1.40129846432483e-45 2**(-149)::Float 1.401298e-45
which is more what I would expect.
I don't know whether GHC generates floating point representations natively or whether it uses a C library, but what I would want would be compatibility with many other language implementations such as C, java, PHP, Python, etc, and use the library that has a pedigree that is nearly as long as Haskell's -- David Gay's http://www.netlib.org/fp/dtoa.c for floating point conversions. As well as greater chance of interoperability, this would have the additional side-benefit of providing hex floating point literals without resorting to Template Haskell when you ned to be able to i/o exact floating point representations.
comment:5 Changed 3 years ago by
These extra digits past the decimal point suggest a degree of precision that doesn't exist in a Float: for instance 1.4013e-45 is the same Float as 1.4014e-45. In fact all the numbers 1.0e-45 through 2.0e-45 (and a bit farther in each direction) round to the same Float, so why not output the simplest one, 1.0e-45? This is the same logic by which we display the Float 1/10 as 0.1 and not, say, 0.10000000149 even though the latter is closer to the true value.
printf's %g is not a format you can round-trip through: by default it always prints 6 digits (in this case the last digit would be 0 and is dropped). So it's not a good analogue of show
. (Also, %g is printing a double, not a float, but it has the same behavior with 2^{-1074}, the smallest positive double.)
Python has similar behavior to GHC: it doesn't have single-precision floating point AFAIK, but for doubles, it displays 2^{-1074} as 5e-324, just like GHC displays 5.0e-324. Arguably we could drop the ".0" part, but I don't think that's worth changing.
comment:6 Changed 3 years ago by
Owner: | simonmar deleted |
---|
comment:7 Changed 3 years ago by
For the record, looking through the code, libraries/base/GHC/Float.lhs seems to use FloatToDigits
to generate the decimal representation. The algorithm used is "based on"
"Printing Floating-Point Numbers Quickly and Accurately"-- by R.G. Burger and R.K. Dybvig in PLDI 96. http://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf This version uses a much slower logarithm estimator. It should be improved.
In turn, that algorithm was based on "How to Print Floating-Point Numbers Accurately" by Steele and White http://kurtstephens.com/files/p372-steele.pdf but it does not distinguish between significand and insignifican trailing zeros.
The latest approach to conversion appears to be "Printing Floating-Point Numbers Quickly and Accurately with Integers" by Florian Loitsch http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf (Bryan O'Sullivan http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/ has provided a library for the Double version http://hackage.haskell.org/package/double-conversion.)
For a full set of references, including to source code, see http://www.ryanjuckett.com/programming/printing-floating-point-numbers/
I'll have a further think about this, including rwbarton's comments.
comment:8 Changed 3 years ago by
Resolution: | → invalid |
---|---|
Status: | new → closed |
Revisiting this ticket, I still think that GHC's output 1.0e-45
is correct, so closing.
Overrounding in what sense? Do we have a spec for the Show Float, Show Double instances? (Honest question.)
I thought that we tried to show the fewest number of significant figures that would round unambiguously to the true value of the floating-point number (which is
1.40129846432481707092372958328991613128026194187651577175706828388979108268586060148663818836212158203125e-45
in this case). In that case1.0e-45
would be correct since1.401...e-45
is the smallest positive Float and1.0e-45
is more than half of it. However, I don't recall where I got that impression.