Opened 9 years ago
Last modified 6 weeks ago
#1687 new bug
A faster (^)-function.
Reported by: | moonlite@… | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | ⊥ |
Component: | Prelude | Version: | 6.6.1 |
Keywords: | Cc: | mihai.maruseac@…, rwbarton@…, hvr | |
Operating System: | Linux | Architecture: | x86 |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | Blocking: | ||
Related Tickets: | Differential Rev(s): | ||
Wiki Page: |
Description
This function performs better for me than the (^)-function in GHC. I seem to only be able to test it for the Integer type though and its only tested with ghc 6.6 (and ghc 6.6.1 by byorgey on #haskell). I'm not sure if you really need this or if it is correct, but after discussion on #haskell i was asked to make a bug report so here it is! Enjoy. :)
module Pow (pow) where import Prelude hiding ((^)) pow = (^) (^) :: (Integral b, Num a) => a -> b -> a x ^ y | y < 0 = error "Negative exponent" | y == 0 = 1 | y == 1 = x | odd y = x * x^(y - 1) | otherwise = let x' = x^(y `div` 2) in x' * x'
Tests
-- TestData.hs module TestData where e = 10^8
-- mytest.hs import Pow import TestData main = print $ (2 `pow` e) `mod` 2
-- ghctest.hs import TestData main = print $ (2 ^ e) `mod` 2
Test results (performance)
$ time ./ghctest 0 real 0m11.744s user 0m11.449s sys 0m0.104s $ time ./mytest 0 real 0m6.794s user 0m6.696s sys 0m0.084s -}
QuickCheck test
-- qc.hs -- $ ./qc -- OK, passed 100 tests. import Test.QuickCheck import Pow main = quickCheck prop prop x y = y >= 0 ==> x `pow` y == x^y
Change History (18)
comment:1 Changed 9 years ago by igloo
comment:2 Changed 9 years ago by moonlite
ghc --make -O2 is the flags i use. I tested with this:
import Prelude hiding ((^)) import TestData main = print $ (2 ^ e) `mod` 2 {-# SPECIALISE (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} (^):: (Num a, Integral b) => a -> b -> a _ ^ 0= 1 x ^ n | n > 0 = f x (n-1) x where f _ 0 y = y f a d y = g a d where g b i | even i = g (b*b) (i `quot` 2) | otherwise = f b (i-1) (b*y) _ ^ _= error "Prelude.^: negative exponent"
And got these results:
moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.713s user 0m11.473s sys 0m0.136s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m12.233s user 0m11.529s sys 0m0.136s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.585s user 0m11.453s sys 0m0.104s moonlite@startop ~/code/haskell/pow $ time ./ghctest 0 real 0m11.632s user 0m11.469s sys 0m0.112s
comment:3 Changed 9 years ago by moonlite
The system is Ubuntu 7.04 on an x86-computer (pentium M at 2ghz).
comment:4 Changed 9 years ago by guest
- Architecture changed from Unknown to x86
- Component changed from Compiler to Prelude
- Operating System changed from Unknown to Linux
comment:6 Changed 9 years ago by moonlite
Current version which tries to eliminate some unnecessary tests (don't test if we're odd when we just did). I can potentially see a slight speedup.
module Pow ((^)) where import Prelude hiding ((^)) {-# SPECIALISE (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} (^) :: (Integral b, Num a) => a -> b -> a x ^ y | y < 0 = error "Negative exponent" | y == 0 = 1 | otherwise = x `pow` y where pow :: (Integral b, Num a) => a -> b -> a x `pow` y | y == 1 = x | odd y = x * (f x (y-1)) | otherwise = f x y where f x y = let x' = x `pow` (y `div` 2) in x' * x'
comment:7 Changed 9 years ago by moonlite
Ok, so i've managed to test it using only Ints and the operation with both ghc's and mine are so fast that my current test-method can't measure any differences so i made a new test:
main = print (manyTest `mod` 2) manyTest = sum $ map ((2::Int)^) ([1..100000000] :: [Int])
These are the results:
$ time ./mytest real 1m23.989s user 1m23.745s sys 0m0.108s $ time ./ghctest 0 real 0m36.946s user 0m36.878s sys 0m0.028s
So (like quicksilver on #haskell also thought) my function is only faster on the Integer type. Perhaps it's interesting anyways
comment:8 Changed 9 years ago by igloo
- Owner set to igloo
comment:9 Changed 9 years ago by igloo
- Milestone changed from 6.8 branch to _|_
I've changed the definition to:
(^) :: (Num a, Integral b) => a -> b -> a x0 ^ y0 | y0 < 0 = error "Negative exponent" | y0 == 0 = 1 | otherwise = f x0 y0 1 where -- x0 ^ y0 = (x ^ y) * z f x y z | even y = f (x * x) (y `quot` 2) z | y == 1 = x * z | otherwise = f (x * x) ((y - 1) `quot` 2) (x * z)
which is as fast as I've seen for Int, and matches your definition's performance for Integer when computing (2^(2^27)) `mod` 2, but it's about half the speed of your definition when computing (2^((2^27)-1)) `mod` 2.
I don't know why that is - I can't see a reason at the core/STG level. I'll leave the ticket open, but milestone _|_, in case anyone wants to look into it.
comment:10 Changed 8 years ago by igloo
- Owner igloo deleted
comment:11 Changed 4 years ago by mihai.maruseac
- Cc mihai.maruseac@… added
- Type of failure set to None/Unknown
comment:12 Changed 4 years ago by morabbin
Bump; any progress understanding this?
comment:13 Changed 3 years ago by rwbarton
- Cc rwbarton@… added
moonlite's algorithm does squaring and multiplication by the small number x, while GHC's algorithm does squaring and multiplication of those squares together in an accumulator. Squaring big integers is a constant factor faster than multiplying different numbers of similar size, so moonlite's algorithm wins on Integer when the numbers involved are large.
The downside is that it is not tail-recursive, which costs when multiplications are actually very cheap like for Int.
But anyways, how about binding to mpz_pow_ui and using a RULE in the case of Integer base?
comment:14 Changed 3 years ago by simonpj
moonlite, rwbarton: just to say, thank you for working on this. If you need help on GHC aspects just ask, but meanwhile, I'll hope that you'll develop a patch!
Simon
comment:15 Changed 16 months ago by bgamari
- Type of failure changed from None/Unknown to Runtime performance bug
comment:16 follow-up: ↓ 17 Changed 16 months ago by bgamari
- Cc hvr added
hvr, what do you think about this? Perhaps integer_gmp could wrap mpz_pow_ui?
comment:17 in reply to: ↑ 16 Changed 16 months ago by hvr
Replying to bgamari:
hvr, what do you think about this? Perhaps integer_gmp could wrap mpz_pow_ui?
You may notice that I had actually mpz_pow_ui wrapped earlier for 0.5.1, see
http://hackage.haskell.org/package/integer-gmp-0.5.1.0/docs/GHC-Integer-GMP-Internals.html#g:3
but I had to drop it again for integer-gmp-1.0 because there's no easy way to predict how large the result will be. But I wanted to revisit this at some later point. It's possible to wrap mpz_pow_ui but it requires to have GMP allocate the buffer, and then copy it over into the final ByteArray#.
comment:18 Changed 6 weeks ago by siddhanathan
rwbarton explained this earlier, but it took a while to wrap my head around it.
I find this a bit clearer to understand:
{-# SPECIALISE [1] (^) :: Integer -> Integer -> Integer, Integer -> Int -> Integer, Int -> Int -> Int #-} {-# INLINABLE [1] (^) #-} -- See Note [Inlining (^)] (^) :: (Num a, Integral b) => a -> b -> a x0 ^ y0 | y0 < 0 = errorWithoutStackTrace "Negative exponent" | y0 == 0 = 1 | otherwise = f x0 y0 where f x y | even y = let k = f x (y `quot` 2) in k * k | y == 1 = x | otherwise = let k = f x ((y - 1) `quot` 2) in k * k * x
This has identical performance to the version in the description above.
This makes use of the fact that x^{2a} = x^{a+a} = x^{a} * x^{a} ; and x^{2a+1} = x^{a} * x^{a} * x
Thanks for the report. Can you please say what commandline you were using to compile when benchmarking, and what platform you were on?
At first glance I can't see why your implementation should be any faster than the implementation in GHC.Real, so my suspicion is different command line flags. What happens if you copy the library code into ghctest and compile that in the same way, rather than using the version from the library?
Thanks
Ian