GHC: Ticket Query
http://ghc.haskell.org/trac/ghc/query?status=!closed&component=libraries%2Frandom&order=priority
The Glasgow Haskell Compileren-USGHChttp://ghc.haskell.org/trac/ghc/chrome/site/ghc_logo.png
http://ghc.haskell.org/trac/ghc/query?status=!closed&component=libraries%2Frandom&order=priority
Trac 1.0.1
http://ghc.haskell.org/trac/ghc/ticket/427
http://ghc.haskell.org/trac/ghc/ticket/427#427: Random.StdGen slownessWed, 27 Jul 2005 08:32:05 GMTremit<pre class="wiki">Two (performance) problems in one:
{-# OPTIONS -fffi #-}
module Main (main) where
import Control.Monad
import Random
foreign import ccall unsafe "random" _crandom :: IO Int
randomInt :: (Int, Int) -> IO Int
randomInt (min,max) = do
n <- _crandom
return $ min + n `rem` range
where
range = max - min + 1
main = replicateM_ (5*10^6) $ do
x <- randomRIO (0::Int,1000) :: IO Int
x `seq` return ()
return ()
First, without the "seq" at the end, hardly anything is
evaluated and we're building huge amounts of thunks.
Three ideas about this one:
- Blame the user :)
- data StdGen = StdGen !Int !Int
Use strict fields in StdGen. Doesn't actually help
(at least in this example).
- Force evaluation of the StdGen in getStdRandom.
Does help in this example, but also changes behaviour
of the library:
x <- randomRIO undefined
currently dies only when x (or the result of a later
randomRIO) is evaluated. This change causes it to die
immediately.
Second, even _with_ the "seq", replacing "randomRIO" by
"randomInt" speeds the thing up with a factor of about
30. (2 to 3.6, in a "real world" university practicum
exercise of 900 lines of code)
Even given the fact that they're not really doing the
same thing, this seems rather much :(
</pre>Resultshttp://ghc.haskell.org/trac/ghc/ticket/427#changelog
http://ghc.haskell.org/trac/ghc/ticket/2280
http://ghc.haskell.org/trac/ghc/ticket/2280#2280: randomR too slowTue, 13 May 2008 13:00:42 GMTguest<p>
randomR is considerably slower than implementing it manually. Maybe I have not re-implemented it precisely, maybe it is just not inlined.
</p>
<pre class="wiki">module Main (main) where
import System.Random (RandomGen(..), randomR, )
import qualified Data.ByteString as B
newtype KnuthRandomGen = KnuthRandomGen Int
{-# INLINE knuthFactor #-}
knuthFactor :: Int
knuthFactor = 40692
{-# INLINE knuthModulus #-}
knuthModulus :: Int
knuthModulus = 2^(31::Int)-249
-- we have to split the 32 bit integer in order to avoid overflow on multiplication
knuthSplit :: Int
knuthSplit = succ $ div knuthModulus knuthFactor
knuthSplitRem :: Int
knuthSplitRem = knuthSplit * knuthFactor - knuthModulus
instance RandomGen KnuthRandomGen where
{-# INLINE next #-}
next (KnuthRandomGen s) =
-- efficient computation of @mod (s*knuthFactor) knuthModulus@ without Integer
let (sHigh, sLow) = divMod s knuthSplit
in (s, KnuthRandomGen $ flip mod knuthModulus $
knuthSplitRem*sHigh + knuthFactor*sLow)
{-# INLINE split #-}
split (KnuthRandomGen s) =
(KnuthRandomGen (s*s), KnuthRandomGen (13+s))
{-# INLINE genRange #-}
genRange _ = (1, pred knuthModulus)
main :: IO ()
main =
do
-- for comparison: that's very fast
putStrLn "constant"
B.writeFile "random.out" $ fst $
B.unfoldrN 10000000
(\g0@(KnuthRandomGen s) -> Just (fromIntegral s, g0))
(KnuthRandomGen 1)
-- 3 seconds on my machine
putStrLn "immediate"
B.writeFile "random.out" $ fst $
B.unfoldrN 10000000
(\g0 -> let (w,g1) = next g0
in Just (fromIntegral (mod w 256), g1))
(KnuthRandomGen 1)
-- 10 seconds on my machine
putStrLn "randomR"
B.writeFile "random.out" $ fst $
B.unfoldrN 10000000
(\g0 -> Just (let (w,g1) = randomR (0,255) g0
in (fromIntegral (w::Int), g1)))
(KnuthRandomGen 1)
{-
ghc -o dist/build/randomtest -O -Wall -package bytestring-0.9.0.5 -ddump-simpl-iterations speedtest/RandomTest.hs
-}
{-
bytestring-0.9.0.1 as shipped with GHC-6.8.2 does not inline Data.ByteString.unfoldrN
-}
</pre><p>
Is this related to Ticket 427?
</p>
Resultshttp://ghc.haskell.org/trac/ghc/ticket/2280#changelog
http://ghc.haskell.org/trac/ghc/ticket/3575
http://ghc.haskell.org/trac/ghc/ticket/3575#3575: mkStdGen and split conspire to make some programs predictableMon, 12 Oct 2009 04:45:04 GMTrwbarton<p>
The following program <tt>random.hs</tt> is intended to produce a line containing 30 random 0s or 1s. It is not an example of the best way to use System.Random, but it looks innocuous enough.
</p>
<pre class="wiki">import Control.Monad
import System.Random
print0or1 = do
g <- newStdGen
putStr . show . fst $ randomR (0, 1 :: Int) g
main = replicateM_ 30 print0or1 >> putStrLn ""
</pre><p>
Let's try running it a thousand times:
</p>
<pre class="wiki">rwbarton@functor:/tmp$ ghc-6.10.1 -O2 --make random
[1 of 1] Compiling Main ( random.hs, random.o )
Linking random ...
rwbarton@functor:/tmp$ for i in `seq 1 1000` ; do ./random >> random.out ; done
rwbarton@functor:/tmp$ sort random.out | uniq | wc -l
60
</pre><p>
That's odd... there are 2<sup>30</sup> possible output lines, but when I tried to generate 1000 random ones, I only got 60 distinct outputs. Why did that happen?
</p>
<p>
One might think this is due to poor initial seeding of the random number generator (due to the time not changing very much during the test), but this is not the case. Attached is a fancier version of the program which reads an initial seed from <tt>/dev/urandom</tt>; it exhibits the same behavior.
</p>
<p>
This phenomenon is not too hard to explain. It is ultimately due to a poor interaction between <tt>mkStdGen</tt> and <tt>split</tt>. First, we need to know a bit about the design of System.Random (some statements simplified slightly for this discussion).
</p>
<ul><li>The state of the RNG consists of two <tt>Int32</tt>s, <tt>s1</tt> and <tt>s2</tt>.
</li></ul><ul><li>The initial state produced by mkStdGen almost always has <tt>s2</tt> equal to 1. (Extremely rarely, it might have <tt>s2</tt> equal to 2. We'll ignore this as it doesn't affect the argument.)
</li></ul><ul><li>To generate a random 0 or 1, we first generate a new state using some simple functions <tt>s1' = next1(s1)</tt>, <tt>s2' = next2(s2)</tt>. (Note that <tt>s1</tt> and <tt>s2</tt> "evolve" independently.) The random value returned is the lowest bit of <tt>s1'</tt> minus <tt>s2'</tt>.
</li></ul><ul><li>Splitting the generator <tt>(s1, s2)</tt> yields the two generators <tt>(s1+1, next2(s2))</tt> and <tt>(next1(s1), s2-1)</tt>.
</li></ul><p>
Our program functions as follows.
</p>
<ul><li>Initialize the generator stored in <tt>theStdGen</tt> (<tt>s1</tt> is some varying value <tt>a</tt>, <tt>s2</tt> is 1).
</li></ul><ul><li>Repeatedly split the generator, replacing it with the first output, and use the second output to generate a 0 or 1.
</li></ul><p>
If we watch <tt>theStdGen</tt> while our program runs, we will see that <tt>s1</tt> is incremented by 1 at each step, while <tt>s2</tt> follows the fixed sequence <tt>1</tt>, <tt>next2(1)</tt>, <tt>next2(next2(1))</tt>, etc. The 0 or 1 we output at the <tt>k</tt>th step is thus the lowest bit of <tt>next1(next1(a+k-1))</tt> minus <tt>b</tt><sub>k</sub>, where <tt>b</tt><sub>k</sub> is some fixed sequence. And as <tt>k</tt> varies, <tt>next1(next1(a+k-1))</tt> turns out to be just an arithmetic sequence with fixed difference modulo a fixed prime so its lowest bits are extremely predictable even without knowing <tt>a</tt>.
</p>
<p>
This issue can be fixed to some extent, without breaking backwards compatibility, by adding another method (besides <tt>mkStdGen</tt>) to create a generator, which does not have predictable <tt>s2</tt>, and using it to initialize the system RNG.
</p>
Resultshttp://ghc.haskell.org/trac/ghc/ticket/3575#changelog
http://ghc.haskell.org/trac/ghc/ticket/3620
http://ghc.haskell.org/trac/ghc/ticket/3620#3620: The seeds generated by split are not independentWed, 28 Oct 2009 15:07:41 GMTNickSmallbone<p>
Suppose we split a seed into two like this:
</p>
<pre class="wiki">split' :: StdGen -> (StdGen, StdGen)
split' g = (g12, g21)
where (_, g12) = split g1
(g21, _) = split g2
(g1, g2) = split g
</pre><p>
Since g1 and g2 are independent, g12 and g21 ought to be too. But they're not! If we use both of g12 and g21 to generate a random number in the range [0..10], then the two numbers ought to be equal 1/11 of the time. In fact, they're never equal.
Here is a test program that ought to return True 1/11 of the time but
actually always returns False:
</p>
<pre class="wiki">sample :: StdGen -> (Int, Int)
sample g = (fst (randomR (0, 10) g1),
fst (randomR (0, 10) g2))
where (g1, g2) = split' g
test :: StdGen -> Bool
test g = fst (sample g) == snd (sample g)
</pre><p>
I attached a program that prints the distribution of values from
<tt>test</tt> for other ranges than [0..10]. The distribution is
always quite bad no matter what the range is.
</p>
<p>
The upshot of all this is that the following <a class="missing wiki">QuickCheck?</a> (version 2)
property always passes, even though it's false:
</p>
<pre class="wiki">import Test.QuickCheck
import Text.Show.Functions
newtype Eleven = Eleven Int deriving Eq
instance Arbitrary Eleven where
arbitrary = fmap Eleven (choose (0, 10))
prop :: (Int -> Eleven) -> Bool
prop f = f 5 /= f 6
</pre>Resultshttp://ghc.haskell.org/trac/ghc/ticket/3620#changelog
http://ghc.haskell.org/trac/ghc/ticket/5278
http://ghc.haskell.org/trac/ghc/ticket/5278#5278: System.Random.randomIvalInteger makes invalid assumptions about RandomGenMon, 27 Jun 2011 01:46:31 GMTrrnewton<p>
The existing API for <tt>System.Random.RandomGen</tt> allows a random number generator (RNG) to produce Ints within an arbitrary range specified by <tt>genRange</tt>.
</p>
<p>
For example, the following <tt>RandomGen</tt> produces only zeros and ones, but should be legitimate:
</p>
<pre class="wiki">import System.Random
data BinRNG = BinRNG StdGen
instance RandomGen BinRNG where
next (BinRNG g) = (x `mod` 2, BinRNG g')
where (x,g') = next g
split (BinRNG g) = (BinRNG g1, BinRNG g2)
where (g1,g2) = split g
genRange _ = (0,1)
ls :: [Int]
ls = randoms (BinRNG$ mkStdGen 38388)
main = print $ take 20 ls
</pre><p>
But <tt>System.Random.randomIvalInteger</tt> makes invalid assumptions about the amount of randomness produced by <tt>next</tt>. (Specifically, assuming that it creates the same amount as <tt>StdGen</tt>.) Thus, the above program will create an output with only a couple of unique ints (rather than 2<sup>64).
</sup></p>
<p>
For example:
</p>
<pre class="wiki">[4611734781337924537,4611734781337924537,-9223323645458902796,
-9223323645458902797,4611734783485408099,4611734783485408098,
-9223323645458902796,-9223323647606386357,4611734781337924538,
-9223323645458902796,-9223323645458902797,
-9223323647606386357,4611734783485408098,4611734783485408098,
-9223323647606386357,4611734781337924538,4611734781337924537,
-9223323645458902796,4611734783485408099,4611734781337924538]
</pre>Resultshttp://ghc.haskell.org/trac/ghc/ticket/5278#changelog
http://ghc.haskell.org/trac/ghc/ticket/5280
http://ghc.haskell.org/trac/ghc/ticket/5280#5280: System.Random commits (rand `mod` base) error.Mon, 27 Jun 2011 02:41:22 GMTrrnewton<p>
You have probably at some point come across the C code "<tt>rand() % base</tt>"'. It is very intuitive, but unfortunately creates non-uniform random numbers, which is easy to see if you imagine <tt>rand()</tt> producing numbers in say <tt>[0,15)</tt> and base being <tt>10</tt>.
</p>
<p>
In the function <tt>System.Random.randomIvalInteger</tt> you can see the same thing happening.
</p>
<p>
The only way I know how to deal with it and generate uniform integers within a range is to artificially restrict the range of the source of randomness to be a multiple of the desired base. It can be done simply by throwing out some random results.
</p>
<p>
This strategy appears to be used by GMP's mpz_urandomm function:
</p>
<blockquote>
<p>
<a class="ext-link" href="http://gmplib.org/manual/Integer-Random-Numbers.html#Integer-Random-Numbers"><span class="icon"></span>http://gmplib.org/manual/Integer-Random-Numbers.html#Integer-Random-Numbers</a>
</p>
</blockquote>
<p>
The file <tt>urandomm.c</tt> has the code.
</p>
Resultshttp://ghc.haskell.org/trac/ghc/ticket/5280#changelog
http://ghc.haskell.org/trac/ghc/ticket/7379
http://ghc.haskell.org/trac/ghc/ticket/7379#7379: rangeTest test fails on WindowsWed, 31 Oct 2012 16:02:41 GMTigloo<p>
The <tt>CWchar</tt> type on Win32 is unsigned:
</p>
<pre class="wiki">Prelude> minBound :: Foreign.C.Types.CWchar
0
Prelude> (-100) :: Foreign.C.Types.CWchar
65436
</pre><p>
This causes the <tt>rangeTest</tt> test to fail:
</p>
<pre class="wiki">CWchar R:
Stderr:
rangeTest.exe: broke lower bound: 100
</pre><p>
Ryan, could you take a look please?
</p>
Resultshttp://ghc.haskell.org/trac/ghc/ticket/7379#changelog
http://ghc.haskell.org/trac/ghc/ticket/7936
http://ghc.haskell.org/trac/ghc/ticket/7936#7936: newStdGen leaks memory when result is not usedSun, 26 May 2013 20:38:19 GMTryantrinkle<p>
When newStdGen is invoked repeatedly without making any use of the resulting <a class="missing wiki">StdGen?</a> values, it leaks memory, as in the following:
</p>
<pre class="wiki">-- Leaks memory
main = forever newStdGen
</pre><p>
When a random number produced by the result is evaluated, no leak occurs:
</p>
<pre class="wiki">-- Does not leak memory
main = forever $ do
r <- newStdGen
evaluate $ fst $ next r
</pre><p>
However, evaluating the result itself is not sufficient to eliminate the leak:
</p>
<pre class="wiki">-- Leaks memory
main = forever $ do
r <- newStdGen
evaluate r
</pre><p>
After sufficient iterations of newStdGen, subsequent use will cause a stack overflow:
</p>
<pre class="wiki">-- Causes stack overflow
main = do
replicateM 1000000 newStdGen
r <- newStdGen
evaluate $ fst $ next r
</pre>Resultshttp://ghc.haskell.org/trac/ghc/ticket/7936#changelog
http://ghc.haskell.org/trac/ghc/ticket/8704
http://ghc.haskell.org/trac/ghc/ticket/8704#8704: Use GHC.Exts.build in randoms, randomRs to achieve fusionMon, 27 Jan 2014 08:12:55 GMTion1<p>
randoms, randomRs could take advantage of list fusion.
</p>
<p>
A commit is attached for consideration.
</p>
Resultshttp://ghc.haskell.org/trac/ghc/ticket/8704#changelog
http://ghc.haskell.org/trac/ghc/ticket/8899
http://ghc.haskell.org/trac/ghc/ticket/8899#8899: StdGen does not generate 0Sat, 15 Mar 2014 15:57:00 GMTnovadenizen<p>
<tt>genRange</tt> for <tt>StdGen</tt> returns <tt>(0,2147483562)</tt>. However, as far as I can tell, <tt>StdGen</tt> doesn't generate 0.
</p>
<p>
This code performs 200 billion iterations of <tt>next</tt> on a <tt>StdGen</tt>. I ran it and it output <tt>Nothing</tt>. The probability that no 0 was generated by chance is approximately <em>e<sup>-200/2.147</sup></em> =~ <em>10<sup>-40</sup></em>.
</p>
<div class="code"><pre><span class="kr">import</span> <span class="nn">System.Random</span>
<span class="kr">import</span> <span class="nn">Data.Int</span>
<span class="nf">find0</span> <span class="ow">::</span> <span class="kt">StdGen</span> <span class="ow">-></span> <span class="kt">Int64</span> <span class="ow">-></span> <span class="kt">Maybe</span> <span class="kt">Int64</span>
<span class="nf">find0</span> g0 n0 <span class="ow">=</span> aux g0 n0 <span class="kr">where</span>
aux <span class="kr">_</span> <span class="mi">0</span> <span class="ow">=</span> <span class="kt">Nothing</span>
aux g r <span class="ow">=</span> <span class="kr">let</span> <span class="p">(</span>v<span class="p">,</span>g'<span class="p">)</span> <span class="ow">=</span> next g <span class="kr">in</span>
<span class="kr">if</span> v <span class="o">==</span> <span class="mi">0</span> <span class="kr">then</span> <span class="kt">Just</span> <span class="p">(</span>n0 <span class="o">-</span> r <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
<span class="kr">else</span> aux g' <span class="p">(</span>r<span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="nf">main</span> <span class="ow">::</span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">main</span> <span class="ow">=</span> print <span class="o">$</span> find0 <span class="p">(</span>mkStdGen <span class="mi">1234</span><span class="p">)</span> <span class="mi">200000000000</span>
</pre></div>Resultshttp://ghc.haskell.org/trac/ghc/ticket/8899#changelog