Opened 3 years ago

Closed 22 months ago

#9058 closed bug (fixed)

System.IO.openTempFile does not scale

Reported by: slyfox
Owned by:
Priority: normal
Milestone: 8.0.1
Component: Compiler
Version: 7.8.2
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #10940
Differential Rev(s):
Wiki Page:


While hunting a bug in darcs I noticed a very bad property of openTempFile: its naming pattern is very predictable, and its cost grows as O(n^2) in the number of already-created temp files.

Predictability lets very fun bugs survive in buggy programs, like:

    (fn, fh) <- openTempFile "." "hello"
    renameFile fn "something"
    -- some time later
    when (some_rare_buggy_condition) $
        -- oops, reused temp name, but too late, other thread killed it
        writeFile fn stale_contents -- stale_contents: placeholder payload
    (fn, fh) <- openTempFile "." "hello"
    workWithFn fn -- nobody should touch it, right?

It's _very_ hard to debug data corruption when all temp files are named "foo${pid}" and sometimes "foo${pid+1}".

And a more serious bug: the more threads try to create similarly named temp files, the more performance drops.

The attached program shows the following numbers:

$ time ./bench-temps same 2000

real    0m2.795s
user    0m1.516s
sys     0m1.190s

$ time ./bench-temps diff 2000

real    0m0.161s
user    0m0.043s
sys     0m0.115s

It's an open() storm that grows as O(N^2).

    FileExists -> findTempName (x + 1)

This is the source of the problem. I'd suggest always using a random name instead. For portability, I suggest at least mixing in an insecure random value from the C library's rand().

That way we will almost always succeed in opening the temp file on the first attempt.
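The quadratic cost of the sequential probe can be illustrated with a small pure simulation. This is only a sketch: `findFree` and `totalAttempts` are hypothetical names, a set of integers stands in for the directory contents, and every caller is assumed to start probing from the same number (0 here, the pid in the real code).

```haskell
module Main (main) where

import Data.List (foldl')
import qualified Data.Set as Set

-- | Find the first free number at or after `start`, returning it together
-- with the number of attempts made. Mirrors `findTempName (x + 1)`.
findFree :: Set.Set Int -> Int -> (Int, Int)
findFree taken start = go start 1
  where
    go x attempts
      | x `Set.member` taken = go (x + 1) (attempts + 1)
      | otherwise            = (x, attempts)

-- | Total attempts needed to create n files that all start probing at 0.
totalAttempts :: Int -> Int
totalAttempts n = snd (foldl' step (Set.empty, 0) [1 .. n])
  where
    step (taken, total) _ =
      let (x, attempts) = findFree taken 0
      in (Set.insert x taken, total + attempts)

main :: IO ()
main = mapM_ report [10, 100, 2000]
  where
    report n =
      putStrLn (show n ++ " files: " ++ show (totalAttempts n) ++ " attempts")
```

The i-th file needs i attempts, so n files need n(n+1)/2 in total: 55 for 10 files, 5050 for 100, and 2001000 for 2000.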

Attachments (3)

bench-temps.hs (985 bytes) - added by slyfox 3 years ago.
base-openTempFile-random-name-untested.patch (2.2 KB) - added by slyfox 3 years ago.
T9058-do-openTempFile-be-more-random.patch (2.7 KB) - added by slyfox 3 years ago.
T9058-do-openTempFile-be-more-random.patch - this one is tested, survives bootstrap


Change History (10)

Changed 3 years ago by slyfox

Attachment: bench-temps.hs added


Changed 3 years ago by slyfox


comment:1 Changed 3 years ago by slyfox

Here is one simple solution. It basically changes

prefix ++ getpid() ++ seq_no ++ suffix

to

prefix ++ rand() ++ rand() ++ suffix

Technically rand() might be thread-unsafe (but not on glibc, where random_r() is used silently), but I don't think it's a problem at all.

Otherwise we could use a timestamp, but picking a time source precise enough to be used more than 1000 times a second might be a problem.
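A minimal sketch of the rand()-based naming scheme, assuming the C library's rand() is called through the FFI. `randomTempName` and `c_rand` are hypothetical names used for illustration, not the identifiers from the attached patch, and the sketch only builds the name without opening the file.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CInt (..))

-- C library rand(): insecure, and never seeded in this sketch.
foreign import ccall unsafe "stdlib.h rand" c_rand :: IO CInt

-- | Hypothetical helper: build prefix ++ rand() ++ rand() ++ suffix.
randomTempName :: String -> String -> IO FilePath
randomTempName prefix suffix = do
  r1 <- c_rand
  r2 <- c_rand
  return (prefix ++ show r1 ++ show r2 ++ suffix)

main :: IO ()
main = do
  a <- randomTempName "hello" ".tmp"
  b <- randomTempName "hello" ".tmp"
  putStrLn a
  putStrLn b
```

A real implementation would still have to create the file exclusively and retry with a fresh name on a collision; the random name only makes such collisions rare.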

Last edited 3 years ago by slyfox

Changed 3 years ago by slyfox

T9058-do-openTempFile-be-more-random.patch - this one is tested, survives bootstrap

comment:2 Changed 3 years ago by slyfox

Status: new → patch

comment:3 Changed 3 years ago by thoughtpolice

Milestone: 7.10.1
Resolution: fixed
Status: patch → closed

Merged, thanks!

comment:4 Changed 3 years ago by hvr

For the record, here's the commit that failed to use the proper ticket-reference syntax: f510c7cac5b2e9afe0ebde2766a671c59137f3cc

comment:5 Changed 2 years ago by rwbarton

Resolution: fixed
Status: closed → new

This does now mean that the sequence of filenames is completely fixed for a given program (since we don't seed the random number generator) which has some obvious downsides.

It would be nice to just use mkstemps, but maybe that is not portable enough? I wonder what other languages' standard libraries do here?
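For reference, on POSIX systems the mkstemps route could look roughly like the sketch below. It assumes the `unix` package's `System.Posix.Temp.mkstemps`, which is not available on Windows, so it does not answer the portability question.

```haskell
module Main (main) where

import System.IO (hClose, hPutStrLn)
import System.Posix.Files (removeLink)
import System.Posix.Temp (mkstemps)

main :: IO ()
main = do
  -- mkstemps puts random characters between the prefix and the suffix
  -- and returns the created path together with an open handle.
  (path, h) <- mkstemps "hello" ".tmp"
  hPutStrLn h "scratch data"
  hClose h
  putStrLn ("created " ++ path)
  -- Clean up so the sketch leaves nothing behind.
  removeLink path
```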

comment:6 Changed 23 months ago by thoughtpolice


Moving to 8.0.1 (since the semantics aren't going to change in a minor point release).

comment:7 Changed 22 months ago by thomie

Resolution: fixed
Status: new → closed

Closing this in favor of #10940, which has an example showing the problem mentioned in comment:5.
