Opened 9 years ago
Last modified 4 months ago
#1687 new bug
A faster (^)-function.
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | normal | Milestone: | ⊥ |
Component: | Prelude | Version: | 6.6.1 |
Keywords: | Cc: | mihai.maruseac@…, rwbarton@…, hvr | |
Operating System: | Linux | Architecture: | x86 |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
This function performs better for me than the (^)
-function in GHC. I seem to only be able to test it for the Integer type though and its only tested with ghc 6.6 (and ghc 6.6.1 by byorgey on #haskell).
I'm not sure if you really need this or if it is correct, but after discussion on #haskell i was asked to make a bug report so here it is! Enjoy. :)
module Pow (pow) where import Prelude hiding ((^)) pow = (^) (^) :: (Integral b, Num a) => a -> b -> a x ^ y | y < 0 = error "Negative exponent" | y == 0 = 1 | y == 1 = x | odd y = x * x^(y - 1) | otherwise = let x' = x^(y `div` 2) in x' * x'
Tests
-- TestData.hs module TestData where e = 10^8
-- mytest.hs import Pow import TestData main = print $ (2 `pow` e) `mod` 2
-- ghctest.hs import TestData main = print $ (2 ^ e) `mod` 2
Test results (performance)
$ time ./ghctest 0 real 0m11.744s user 0m11.449s sys 0m0.104s $ time ./mytest 0 real 0m6.794s user 0m6.696s sys 0m0.084s -}
QuickCheck test
-- qc.hs -- $ ./qc -- OK, passed 100 tests. import Test.QuickCheck import Pow main = quickCheck prop prop x y = y >= 0 ==> x `pow` y == x^y
Change History (18)
comment:1 Changed 9 years ago by
comment:2 Changed 9 years ago by
ghc --make -O2 is the flags i use. I tested with this:
import Prelude hiding ((^)) import TestData main = print $ (2 ^ e) `mod` 2 {-# SPECIALISE (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} (^):: (Num a, Integral b) => a -> b -> a _ ^ 0= 1 x ^ n | n > 0 = f x (n-1) x where f _ 0 y = y f a d y = g a d where g b i | even i = g (b*b) (i `quot` 2) | otherwise = f b (i-1) (b*y) _ ^ _= error "Prelude.^: negative exponent"
And got these results:
moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.713s user 0m11.473s sys 0m0.136s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m12.233s user 0m11.529s sys 0m0.136s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.585s user 0m11.453s sys 0m0.104s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.632s user 0m11.469s sys 0m0.112s
comment:4 Changed 9 years ago by
Architecture: | Unknown → x86 |
---|---|
Component: | Compiler → Prelude |
Operating System: | Unknown → Linux |
comment:6 Changed 9 years ago by
Current version which tries to eliminate some unnecessary tests (don't test if we're odd when we just did). I can potentially see a slight speedup.
module Pow ((^)) where import Prelude hiding ((^)) {-# SPECIALISE (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} (^) :: (Integral b, Num a) => a -> b -> a x ^ y | y < 0 = error "Negative exponent" | y == 0 = 1 | otherwise = x `pow` y where pow :: (Integral b, Num a) => a -> b -> a x `pow` y | y == 1 = x | odd y = x * (f x (y-1)) | otherwise = f x y where f x y = let x' = x `pow` (y `div` 2) in x' * x'
comment:7 Changed 9 years ago by
Ok, so i've managed to test it using only Ints and the operation with both ghc's and mine are so fast that my current test-method can't measure any differences so i made a new test:
main = print (manyTest `mod` 2) manyTest = sum $ map ((2::Int)^) ([1..100000000] :: [Int])
These are the results:
$ time ./mytest real 1m23.989s user 1m23.745s sys 0m0.108s $ time ./ghctest 0 real 0m36.946s user 0m36.878s sys 0m0.028s
So (like quicksilver on #haskell also thought) my function is only faster on the Integer type. Perhaps it's interesting anyways
comment:8 Changed 9 years ago by
Owner: | set to igloo |
---|
comment:9 Changed 9 years ago by
Milestone: | 6.8 branch → _|_ |
---|
I've changed the definition to:
(^) :: (Num a, Integral b) => a -> b -> a x0 ^ y0 | y0 < 0 = error "Negative exponent" | y0 == 0 = 1 | otherwise = f x0 y0 1 where -- x0 ^ y0 = (x ^ y) * z f x y z | even y = f (x * x) (y `quot` 2) z | y == 1 = x * z | otherwise = f (x * x) ((y - 1) `quot` 2) (x * z)
which is as fast as I've seen for Int
, and matches your definition's performance for Integer
when computing (2^(2^27)) `mod` 2
, but it's about half the speed of your definition when computing (2^((2^27)-1)) `mod` 2
.
I don't know why that is - I can't see a reason at the core/STG level. I'll leave the ticket open, but milestone _|_, in case anyone wants to look into it.
comment:10 Changed 9 years ago by
Owner: | igloo deleted |
---|
comment:11 Changed 5 years ago by
Cc: | mihai.maruseac@… added |
---|---|
Type of failure: | → None/Unknown |
comment:13 Changed 4 years ago by
Cc: | rwbarton@… added |
---|
moonlite's algorithm does squaring and multiplication by the small number x
, while GHC's algorithm does squaring and multiplication of those squares together in an accumulator. Squaring big integers is a constant factor faster than multiplying different numbers of similar size, so moonlite's algorithm wins on Integer when the numbers involved are large.
The downside is that it is not tail-recursive, which costs when multiplications are actually very cheap like for Int.
But anyways, how about binding to mpz_pow_ui
and using a RULE in the case of Integer base?
comment:14 Changed 4 years ago by
moonlite, rwbarton: just to say, thank you for working on this. If you need help on GHC aspects just ask, but meanwhile, I'll hope that you'll develop a patch!
Simon
comment:15 Changed 19 months ago by
Type of failure: | None/Unknown → Runtime performance bug |
---|
comment:16 follow-up: 17 Changed 19 months ago by
Cc: | hvr added |
---|
hvr, what do you think about this? Perhaps integer_gmp could wrap mpz_pow_ui
?
comment:17 Changed 19 months ago by
Replying to bgamari:
hvr, what do you think about this? Perhaps integer_gmp could wrap
mpz_pow_ui
?
You may notice that I had actually mpz_pow_ui
wrapped earlier for 0.5.1, see
http://hackage.haskell.org/package/integer-gmp-0.5.1.0/docs/GHC-Integer-GMP-Internals.html#g:3
but I had to drop it again for integer-gmp-1.0
because there's no easy way to predict how large the result will be. But I wanted to revisit this at some later point. It's possible to wrap mpz_pow_ui
but it requires to have GMP allocate the buffer, and then copy it over into the final ByteArray#
.
comment:18 Changed 4 months ago by
rwbarton explained this earlier, but it took a while to wrap my head around it.
I find this a bit clearer to understand:
{-# SPECIALISE [1] (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} {-# INLINABLE [1] (^) #-} -- See Note [Inlining (^)] (^) :: (Num a, Integral b) => a -> b -> a x0 ^ y0 | y0 < 0 = errorWithoutStackTrace "Negative exponent" | y0 == 0 = 1 | otherwise = f x0 y0 where f x y | even y = let k = f x (y `quot` 2) in k * k | y == 1 = x | otherwise = let k = f x ((y - 1) `quot` 2) in k * k * x
This has identical performance to the version in the description above.
This makes use of the fact that x^{2a} = x^{a+a} = x^{a} * x^{a} ; and x^{2a+1} = x^{a} * x^{a} * x
Thanks for the report. Can you please say what commandline you were using to compile when benchmarking, and what platform you were on?
At first glance I can't see why your implementation should be any faster than the implementation in GHC.Real, so my suspicion is different command line flags. What happens if you copy the library code into ghctest and compile that in the same way, rather than using the version from the library?
Thanks
Ian