Opened 10 years ago

Closed 7 years ago

#1434 closed task (fixed)

Missing RULEs for truncate

Reported by: ghc@… Owned by: simonmar
Priority: high Milestone: 7.0.1
Component: libraries/base Version: 6.4.1
Keywords: rules Cc: ghc-bug@…, dons@…, alexey.skladnoy@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

I found that the rounding functions from RealFrac class are considerably slower than the low level functions from GHC.Float. This is really a problem for me when doing signal processing, since for writing to a common audio file format or listening to a signal data has to be converted from Double to Int16.

$ ghci +RTS -M256m -c30 -RTS

   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.4.1, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

Loading package base-1.0 ... linking ... done.
Prelude> :set +s
Prelude> sum $ map round [0.1,0.32..100000] :: Int
1252489463
(6.50 secs, 241894764 bytes)
Prelude> sum $ map floor [0.1,0.32..100000] :: Int
1252262188
(6.07 secs, 240099200 bytes)
Prelude> sum $ map ceiling [0.1,0.32..100000] :: Int
1252716734
(6.13 secs, 243795404 bytes)
Prelude> sum $ map truncate [0.1,0.32..100000] :: Int
1252262188
(6.76 secs, 234572324 bytes)
Prelude> sum $ map GHC.Float.double2Int [0.1,0.32..100000] :: Int
1252262188
(1.38 secs, 66137016 bytes)

As far as I can judge, double2Int does the same like truncate. Instead of using the methods from RealFrac I could simply use double2Int but I consider this a work-around.

In GHC-6.6.1 these examples end with a stack overflow, but if I shorten the list, the time relations remain the same.

Attachments (1)

rules1434.dpatch (65.9 KB) - added by daniel.is.fischer 7 years ago.
rewrite rules for sized Int and Word types

Download all attachments as: .zip

Change History (25)

comment:1 Changed 10 years ago by ghc@…

In a more compact example using the GHC compiler and optimizations there are no performance differences:

{-# OPTIONS -O2 -Wall #-}
module Main where

import GHC.Float (double2Int)

main  :: IO ()
main  = print (sum (map double2Int [-0.32,-0.29..(100000::Double)]) :: Int)

main' :: IO ()
main' = print (sum (map truncate   [-0.32,-0.29..(100000::Double)]) :: Int)

However in a more complicated application I have, I generated a signal with round in 4 seconds and with a replacement based on double2Int in only one second. This may indicate an insufficient inlining depth, but increasing it didn't help. Seems like I have to extract a working example to demonstrate that.

comment:2 Changed 10 years ago by simonpj

A repeatable small-ish example would indeed make it much easier for us to find out what is going on. Thanks

Simon

comment:3 Changed 10 years ago by ghc@…

The following example is still certainly not the most compact one. The above examples let me assume that the difference between round and double2Int is eliminated by the optimizer. However, when I use Int16 instead of Int, this optimization seems to fail, again.

{-# OPTIONS -O2 #-}
{-  -package fps -package binary -}
module Main (main) where

import System.Time (getClockTime, diffClockTimes, tdSec, tdPicosec)
import System.Directory (removeFile)

import qualified Data.ByteString.Lazy as B
import qualified Data.Binary.Put as Bin

import Foreign (Int16)

import GHC.Float (double2Int)



type Sample = Int16    -- 'truncate' is slow
-- type Sample = Int     -- 'truncate' is as fast as double2Int


writeSignalMonoBinaryPut ::
   FilePath -> [Sample] -> IO ()
writeSignalMonoBinaryPut fileName =
   B.writeFile fileName .
   Bin.runPut .
   mapM_ (Bin.putWord16host . fromIntegral)



numToSample :: Double -> Sample
numToSample x =
   -- fromIntegral $ (truncate x :: Int)
   truncate x

doubleToSample :: Double -> Sample
doubleToSample x =
   fromIntegral $ double2Int x



{- * driver -}

measureTime :: String -> IO () -> IO ()
measureTime name act =
   do putStr (name++": ")
      timeA <- getClockTime
      act
      timeB <- getClockTime
      let td = diffClockTimes timeB timeA
      print (fromIntegral (tdSec td) +
             fromInteger (tdPicosec td) * 1e-12 :: Double)




exponential2 :: Double -> Double -> [Double]
exponential2 k = iterate (k*)


sawSignalDouble :: Double -> [Double]
sawSignalDouble halfLife =
   take 200000 (exponential2 halfLife 32767)

sawSignal16Trunc :: [Sample]
sawSignal16Trunc =
   map numToSample (sawSignalDouble 0.999)

sawSignal16LowLevel :: [Sample]
sawSignal16LowLevel =
   map doubleToSample (sawSignalDouble 0.999001)


tests :: [(String, FilePath, FilePath -> IO ())]
tests =
   ("saw double2Int", "saw-double2Int.sw", flip writeSignalMonoBinaryPut sawSignal16LowLevel) :
   ("saw round",      "saw-round.sw",      flip writeSignalMonoBinaryPut sawSignal16Trunc) :
   []


main :: IO ()
main =
   do mapM (\(label, fileName, action) ->
              measureTime label (action fileName))
           tests

      mapM_ (\(_,fileName,_) -> removeFile fileName)
           tests

For type Sample = Int16 I get

$ ghc-6.6.1 -package binary RoundTest.hs && a.out
saw double2Int: 8.7173e-2
saw round: 0.430843

For type Sample = Int I get

$ ghc-6.6.1 -package binary RoundTest.hs && a.out
saw double2Int: 8.8279e-2
saw round: 9.8028e-2

comment:4 Changed 10 years ago by igloo

Milestone: 6.8

comment:5 Changed 10 years ago by simonmar

I bet this is the lack of a specialised version of truncate for Int16. We have RULEs for truncate at Float->Int and Double->Int. We could presumably add RULEs for small integral types, of the form truncate = fromIntegral . double2Int (truncate is undefined outside the range of the target anyway).

comment:6 Changed 9 years ago by dons

Cc: dons@… added
Owner: set to dons

I looked into this after the realToFrac issues last week. It's a similar story -- the noop rules are currently only set up for Int conversions.

So, for example, this program:

import Data.Word
import Data.Array.Vector

main = print . sumU
             . mapU (truncate::Double->Word16)
             $ (enumFromToFracU 0.1 (100000000 :: Double))

truncate stays as a call to properFraction.

Now, we can add some rules for this stuff:

{-# RULES
"truncate/Double->Int64"
    truncate = fromIntegral . double2Int :: Double -> Int64
"truncate/Double->Int32"
    truncate = fromIntegral . double2Int :: Double -> Int32
"truncate/Double->Int16"
    truncate = fromIntegral . double2Int :: Double -> Int16
"truncate/Double->Int8"
    truncate = fromIntegral . double2Int :: Double -> Int8
"truncate/Double->Word64"
    truncate = fromIntegral . double2Int :: Double -> Word64
"truncate/Double->Word32"
    truncate = fromIntegral . double2Int :: Double -> Word32
"truncate/Double->Word16"
    truncate = fromIntegral . double2Int :: Double -> Word16
"truncate/Double->Word8"
    truncate = fromIntegral . double2Int :: Double -> Word8
  #-}

And the running time drops from:

$ time ./henning     
28800
./henning  45.84s user 0.11s system 98% cpu 46.448 total

before, to:

$ time ./henning     
28800
./henning  0.39s user 0.00s system 97% cpu 0.398 total

after.

I think this is as good a time as any to do an audit of the Int/noop conversion rules in base, looking for others that also should work for integral types. I'll take care of this.

comment:7 Changed 9 years ago by dons

Keywords: rules added
Type: bugtask

While hunting around, I noticed we have nice RULES defined for the basic math and boolean ops,

"x# ># x#"  forall x#. x# >#  x# = False
"x# >=# x#" forall x#. x# >=# x# = True
"x# ==# x#" forall x#. x# ==# x# = True
"x# /=# x#" forall x#. x# /=# x# = False
"x# <# x#"  forall x#. x# <#  x# = False
"x# <=# x#" forall x#. x# <=# x# = True

For example, but I couldn't see rules for the Word# primitives.

However, if I write test cases,

x =  case int2Word# 7# of x# -> x# `gtWord#` x#

We do get rules firing,

2 RuleFired
    1 gtWord#
    1 int2Word#
M.x = False

Where are the Word# rules defined? Are they wired in?

comment:8 Changed 9 years ago by igloo

Milestone: 6.8 branch6.10.1

comment:9 Changed 9 years ago by simonmar

Architecture: UnknownUnknown/Multiple

comment:10 Changed 9 years ago by igloo

Milestone: 6.10.16.10.2

comment:11 Changed 8 years ago by igloo

Milestone: 6.10.26.12.1

comment:12 Changed 8 years ago by Khudyakov

Cc: alexey.skladnoy@… added

comment:13 Changed 8 years ago by simonmar

Operating System: LinuxUnknown/Multiple
Type of failure: Runtime performance bug

comment:14 Changed 8 years ago by igloo

Milestone: 6.12.16.12 branch
Owner: dons deleted

comment:15 Changed 7 years ago by LouisWasserman

Is there anything going on here, except that we just need to add in all the rules for the various truncated types?

comment:16 Changed 7 years ago by simonpj

I believe that's all it is. See the comment from Don Stewart 23 months ago, in this ticket. It would be great if you could sort this out, and perhaps look for other similar cases in the base library.

There is a collection of tiresome arithmetic-related bugs waiting for a generous person to fix them: http://hackage.haskell.org/trac/ghc/wiki/Status/SLPJ-Tickets#Tiresomearithmeticthings.

comment:17 Changed 7 years ago by isaacdupree

Does this RULE work on 32-bit, where Int is not as big as Int64? (I'm dubious)

"truncate/Double->Int64"

truncate = fromIntegral . double2Int
Double -> Int64

comment:18 Changed 7 years ago by igloo

Milestone: 6.12 branch6.12.3

comment:19 Changed 7 years ago by simonmar

Summary: Slow conversion from Double to IntMissing RULEs for truncate

comment:20 Changed 7 years ago by igloo

Milestone: 6.12.36.14.1
Priority: normallow

Changed 7 years ago by daniel.is.fischer

Attachment: rules1434.dpatch added

rewrite rules for sized Int and Word types

comment:21 Changed 7 years ago by daniel.is.fischer

Status: newpatch

Patch with rewrite rules for the sized Int and Word types whose ranges are contained in Int's range. Unfortunately, we can't have a rule for Word here, that would need its own primop to improve much on the default implementation. It doesn't depend on #2271, but it will only help with truncate without it.

Please review and merge.

comment:22 Changed 7 years ago by simonmar

Owner: set to simonmar
Priority: lowhigh

comment:23 Changed 7 years ago by simonmar

Status: patchmerge

Applied, thanks!

Wed Oct 20 10:10:14 BST 2010  Daniel Fischer <daniel.is.fischer@web.de>
  * FIX #1434
  Rewrite rules for RealFrac methods with sized Int and Word targets.
  For all types whose range is contained in Int's range, there are now
  rewrite rules for properFraction, truncate, floor, ceiling and round
  from Double and Float, going through the specialised methods for Int.
  
  Unfortunately, we can't have a rewrite rule for Word.

comment:24 Changed 7 years ago by igloo

Resolution: fixed
Status: mergeclosed

Merged

Note: See TracTickets for help on using tickets.