Opened 5 years ago

Last modified 4 weeks ago

#4092 new feature request

Floating point manipulation : ulp and coerce IEEE-754 Double# into Word64#

Reported by: malosh Owned by: bgamari
Priority: normal Milestone: 7.12.1
Component: Compiler Version: 6.12.2
Keywords: Cc: daniel.is.fischer@…, midfield@…, dterei, iridcode@…, johan.tibell@…, acfoltzer@…, me@…, mail@…, jystic@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

There are currently two ways to compute the ulp of double numbers with GHC :

  • Calling a C function, which requires to allocate a pointer. This is way too expensive when using interval arithmetic (for instance), that compute two ULPs at each arithmetic operations.
  • Programming it by hand in haskell with GHC primitive operations, which requires using unsafeCoerce# : this does not work in GHC 6.12.2.

unsafeCoerce# should work, and there should be a primitive ulp# function in GHC, operating on Doubles at least. By the way, here is my haskell code using C for computing it :

foreign import ccall unsafe "math.h frexp" c_frexp::CDouble->(Ptr CInt)->IO ()
foreign import ccall unsafe "math.h ldexp" c_ldexp::CDouble->CInt->IO CDouble
ulp::Double->Double
ulp x=unsafePerformIO $ do
  expon<-alloca (\e->do
                    c_frexp (realToFrac x) e
                    peek e)
  (c_ldexp 0.5 $ expon-52) >>= return.realToFrac

Attachments (1)

partial-4092.patch (8.4 KB) - added by simonmar 4 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 5 years ago by simonmar

  • Milestone set to 6.14.1

What I had in mind was something like:

coerceDoubleToWord64# :: Double# -> Word64#
coerceFloatToWord32#  :: Float#  -> Word#

(note we don't have a Word32# type)

By the way, you can do this without the FFI right now, by poking the Double into memory, and then peeking it out as a Word64. Not terribly efficient, but better than using the FFI.

comment:2 Changed 5 years ago by daniel.is.fischer

  • Cc daniel.is.fischer@… added

I would like to see those coercions and their inverses, too.

comment:3 Changed 5 years ago by midfield

  • Cc midfield@… added

comment:4 Changed 5 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:5 Changed 4 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:6 Changed 4 years ago by dterei

  • Cc dterei added

comment:7 Changed 4 years ago by SimonMeier

  • Cc iridcode@… added

I would like to see those coercions and their inverses, too.

comment:8 Changed 4 years ago by tibbe

  • Cc johan.tibell@… added

Changed 4 years ago by simonmar

comment:9 Changed 4 years ago by simonmar

partial patch for this attached. I ran into difficulty because there's no way to allocate temporary memory for the conversion in the native code generator. There are various ways to hack around this (use a spare slot in StgReg, allocate a new word in the data segment) but none are very nice. We really just want to allocate a slot from the spill area, but that requires cooperation from the register allocator.

comment:10 Changed 4 years ago by acfoltzer

  • Cc acfoltzer@… added

comment:11 Changed 4 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:12 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1

comment:13 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:14 Changed 2 years ago by lelf

  • Cc me@… added
  • difficulty set to Unknown

comment:15 Changed 2 years ago by carter

Wait, unsafecoerce# doesn't work here? Oh, is this because we (in general) need to move things between the registers used for floating point vs word values

Last edited 2 years ago by carter (previous) (diff)

comment:16 Changed 15 months ago by nh2

  • Cc mail@… added

Would be very useful.

comment:17 Changed 13 months ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:18 Changed 10 months ago by jystic

  • Cc jystic@… added

comment:19 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:20 Changed 5 weeks ago by duncan

In the meantime, in the absence of primops, this is the best I can manage. I suggest we add this or similar to GHC.Float:

{-# INLINE castWord2Float #-}
castWord2Float :: Word32 -> Float
castWord2Float (W32# w#) = F# (castWord2Float# w#)

{-# NOINLINE castWord2Float# #-}
castWord2Float# :: Word# -> Float#
castWord2Float# w# =
    case newByteArray# 4# realWorld# of
      (# s', mba# #) ->
        case writeWord32Array# mba# 0# w# s' of
          s'' ->
            case readFloatArray# mba# 0# s'' of
              (# _, f# #) -> f#

{-# INLINE castWord2Double #-}
castWord2Double :: Word64 -> Double
castWord2Double (W64# w#) = D# (castWord2Double# w#)

{-# NOINLINE castWord2Double# #-}
castWord2Double# :: Word# -> Double#
castWord2Double# w# =
    case newByteArray# 8# realWorld# of
      (# s', mba# #) ->
        case writeWord64Array# mba# 0# w# s' of
          s'' ->
            case readDoubleArray# mba# 0# s'' of
              (# _, f# #) -> f#

This is similar to the "cast STUArray" method, but avoids the extra call and closure allocation due to the runSTRep. For the "cast STUArray" method, see:

http://hackage.haskell.org/package/reinterpret-cast-0.1.0/docs/src/Data-ReinterpretCast-Internal-ImplArray.html

The NOINLINE means that the use of realWorld# should be ok, despite newByteArray# 8# realWorld# being a constant.

It'll need a very similar impl for 32bit systems that need the Word64# type.

Compare the CMM of the above:

castWord2Double#_entry() //  [R2]
         { info_tbl: [(c2Qn,
                       label: castWord2Double#_info
                       rep:HeapRep static { Fun {arity: 1 fun_type: ArgSpec 4} })]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c2Qn:
           Hp = Hp + 24;
           if (Hp > HpLim) goto c2Qr; else goto c2Qq;
       c2Qr:
           HpAlloc = 24;
           R2 = R2;
           R1 = castWord2Double#_closure;
           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
       c2Qq:
           I64[Hp - 16] = stg_ARR_WORDS_info;
           I64[Hp - 8] = 8;
           _s2Q9::P64 = Hp - 16;
           I64[_s2Q9::P64 + 16] = R2;
           D1 = F64[_s2Q9::P64 + 16];
           call (P64[Sp])(D1) args: 8, res: 0, upd: 8;
     }
 }

with the version that uses runST / runSTRep

 sat_s2QX_entry() //  [R1]
         { info_tbl: [(c2Rd,
                       label: sat_s2QX_info
                       rep:HeapRep 1 nonptrs { Fun {arity: 1 fun_type: ArgSpec 3} })]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c2Rd:
           Hp = Hp + 40;
           if (Hp > HpLim) goto c2Rj; else goto c2Ri;
       c2Rj:
           HpAlloc = 40;
           R1 = R1;
           call (stg_gc_fun)(R1) args: 8, res: 0, upd: 8;
       c2Ri:
           _s2QN::I64 = I64[R1 + 7];
           I64[Hp - 32] = stg_ARR_WORDS_info;
           I64[Hp - 24] = 8;
           _s2QR::P64 = Hp - 32;
           I64[_s2QR::P64 + 16] = _s2QN::I64;
           _s2QV::F64 = F64[_s2QR::P64 + 16];
           I64[Hp - 8] = D#_con_info;
           F64[Hp] = _s2QV::F64;
           R1 = Hp - 7;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }
 },
 castWord2Double#_entry() //  [R2]
         { info_tbl: [(c2Rk,
                       label: castWord2Double#_info
                       rep:HeapRep static { Fun {arity: 1 fun_type: ArgSpec 4} })]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c2Rk:
           Hp = Hp + 16;
           if (Hp > HpLim) goto c2Ro; else goto c2Rn;
       c2Ro:
           HpAlloc = 16;
           R2 = R2;
           R1 = castWord2Double#_closure;
           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
       c2Rn:
           I64[Hp - 8] = sat_s2QX_info;
           I64[Hp] = R2;
           R2 = Hp - 7;
           call runSTRep_info(R2) args: 8, res: 0, upd: 8;
     }
 }

The runSTRep version involves allocating a sat_s2QX closure and calling runSTRep to call that closure.

comment:21 Changed 5 weeks ago by simonmar

Ooh, that's a nice use for the new inline array allocation support in the codegen.

comment:22 Changed 4 weeks ago by bgamari

  • Owner set to bgamari
Note: See TracTickets for help on using tickets.