Opened 5 years ago

Last modified 19 months ago

#3231 new bug

Permission denied error with runProcess/openFile

Reported by: NeilMitchell Owned by: simonmar
Priority: normal Milestone: 7.6.2
Component: libraries/base Version: 6.10.4
Keywords: Cc: ndmitchell@…, lennart@…
Operating System: Windows Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Given this program:

module Main() where

import Control.Concurrent
import System.IO
import System.Process

main = do
    hSetBuffering stdout NoBuffering
    forkIO $ f "foo1.txt"
    forkIO $ f "foo2.txt"
    threadDelay $ 100*1000000
    putStrLn "Finished successfully"

f file = do
    h <- openFile file AppendMode
    hPutStrLn h "fakdjsklj"
    putChar '.'
    pid <- runProcess "sh" ["-c","sleep 0.1s"] Nothing Nothing Nothing (Just h) (Just h)
    waitForProcess pid
    f file

Running under Cygwin, in GHC 6.10.2, I get:

$ runhaskell Test.hs
..Test.hs: foo1.txt: openFile: permission denied (Permission denied)

It shouldn't - the openFile calls should always succeed. This bug is a reduced test case from a real system, which I papered over with:

retryIO :: IO a -> IO a
retryIO act = catchIO act $ \x -> do
    threadDelay $ 1 * 1000000 -- 1 second
    performGC
    act

Now calling retryIO $ openFile ... works reliably. These problems are occurring sufficiently often that retryIO is about to go in to our standard library :-)

This may be related to #2924, but has the advantage of replicating easily.

Change History (47)

comment:1 Changed 5 years ago by NeilMitchell

  • Cc ndmitchell@… added

comment:2 Changed 5 years ago by NeilMitchell

I have just tested, and this bug does not occur under Linux.

comment:3 follow-up: Changed 5 years ago by NeilMitchell

Another version, this time not tying the process to a handle:

module Main() where

import Control.Concurrent
import System.IO
import System.Cmd
import System.Directory

main = do
    hSetBuffering stdout NoBuffering
    forkIO $ f "foo1.txt"
    forkIO $ f "foo2.txt"
    threadDelay $ 100*1000000
    putStrLn "Finished successfully"

f file = do
    h <- openFile file WriteMode
    hPutStrLn h "fjkladsf"
    system "sleep 1s"
    putChar '.'
    hClose h
    removeFile file
    f file

This version fails under both Cygwin and from the Windows command prompt with:

$ runhaskell Test.hs
.Test.hs: DeleteFile: permission denied (The process cannot access the file beca
use it is being used by another process.)

This version works fine if the system call is removed. That call shouldn't have any difference to the usage of any handles. It seems that system calls mess with handles and resources in painful ways. This is now almost certainly the same underlying cause as the problems I reported in bug #2924.

comment:4 Changed 5 years ago by NeilMitchell

The removeFile call is actually redundant. Without it I get:

$ runhaskell Test.hs
.Test.hs: foo1.txt: openFile: permission denied (Permission denied)

comment:5 in reply to: ↑ 3 Changed 5 years ago by duncan

Replying to NeilMitchell:

This version works fine if the system call is removed. That call shouldn't have any difference to the usage of any handles. It seems that system calls mess with handles and resources in painful ways. This is now almost certainly the same underlying cause as the problems I reported in bug #2924.

Presumably it is the system call that is crucial. Delaying by one second in Haskell code presumably works ok?

If so, my guess is that it's related to the open file handles being inherited by the child process and then the use of the handle in the child conflicts with attempts to remove or re-open the same file in the parent.

Of course we wait for the child process to terminate so in principle this should not be a problem, the handles used in the child process should now be closed. However I have a strong suspicion that Windows is using delayed deallocation/unlocking of handles when a process terminates.

I've seen behaviour in cabal-install where we wait for a program to terminate, which had an open file in the directory we're about to delete, and when we try to remove the file we often get a permission error.

comment:6 follow-up: Changed 5 years ago by NeilMitchell

Yes, the system call is crucial - a similar threadDelay has no problems. I only used sleep because it's a simple process that was available, I have no reason to think it's the delay causing a problem.

In the sleep example, the child process isn't actually using the open file handle - it was in the original example but that turned out to be irrelevant.

BTW, this error is pretty severe for certain users - threads+windows+files+system is a combination that crops up everywhere when using Haskell as a powerful scripting language.

comment:7 in reply to: ↑ 6 Changed 5 years ago by duncan

Replying to NeilMitchell:

In the sleep example, the child process isn't actually using the open file handle - it was in the original example but that turned out to be irrelevant.

It doesn't matter if it uses it or not. If the handle is set to be inheritable then all child processes will get it and hold it open.

So there are apparently two problems:

  • waiting for a process to terminate does not appear to be enough to ensure that handles it had open are now closed.
  • openFile appears to create handles that are inheritable. The default should almost certainly be non-inheritable (we have similar issues on unix but they're less severe because unix lacks strong file locking).

I suggest that both of these need confirming or refuting.

comment:8 Changed 5 years ago by claus

Thanks for bringing this from rumour to concrete code. Sounded like a good example to try out ProcessExplorer? on!-) I tried the following variation, to give me some control and time to see the open handles:

module Main() where

import Control.Concurrent
import System.IO
import System.Directory
import System.Process
import Control.Exception(bracket)

my_system str = do
  (_,_,_,p) <- createProcess c
  waitForProcess p
  where c = CreateProcess { cmdspec = ShellCommand str,
                            cwd = Nothing,
                            env = Nothing,
                            std_in = Inherit,
                            std_out = Inherit,
                            std_err = Inherit,
                            close_fds = True }

main = do
   v <- newEmptyMVar
   hSetBuffering stdout NoBuffering
   forkIO $ f v "foo1.txt"
   forkIO $ f v "foo2.txt"
   threadDelay $ 100*1000000
   putStrLn "Finished successfully"

f v file = do 
   bracket (openFile file WriteMode)
           (hClose)
           (\h->do hPutStrLn h file
                   hPutStr stderr (">"++file++"< ")
                   my_system "sleep 5s"
           )
   -- takeMVar v
   hPutStr stderr ("<"++file++"> ")
   my_system "sleep 5s"
   removeFile file
   f v file

It seems that no matter what I set close_fds to (default False?), both GHC and one of its two children sometimes hang on to both files, while the other child hangs on to foo1 only?? Also, there is occasionally a new pair of children, before the old pair is gone (this tends to preceed the access error). Are these just ProcessExplorer? sampling artifacts, am I misreading the data, or is there something else going on (ghc 6.11.20090320)?

Btw, after Duncan's remark, I looked up DeleteFile and found these two -seemingly contradictory- remarks:

The DeleteFile? function fails if an application attempts to delete a file that is open for normal I/O or as a memory-mapped file.

The DeleteFile? function marks a file for deletion on close. Therefore, the file deletion does not occur until the last handle to the file is closed. Subsequent calls to CreateFile? to open the file fail with ERROR_ACCESS_DENIED.

Could anyone please explain what that second remark means, given the first? What is the non-normal I/O it seems to apply to?

comment:9 Changed 5 years ago by NeilMitchell

Claus: Remove the removeFile call, it's not needed, and the error still occurs without it. I thought that DeleteFile could delete a file that was being executed, but not one that is just normally open - that may be where the confusion in MSDN lies. Either way, by removing the removeFile that point becomes less important.

comment:10 Changed 5 years ago by NeilMitchell

FWIW, I just had a failure occur where retryIO (a one second wait and GC) wasn't sufficient to solve the problem.

comment:11 follow-up: Changed 5 years ago by Deewiant

This appears to be a duplicate of #2650? I have some partial fixes for the issue there.

comment:12 in reply to: ↑ 11 ; follow-up: Changed 5 years ago by simonmar

  • Difficulty set to Unknown

Replying to Deewiant:

This appears to be a duplicate of #2650? I have some partial fixes for the issue there.

So the fix to add _O_NOINHERIT when opening files could be applied, but I'm not sure whether it will have any undesirable consequences. Anyone?

comment:13 Changed 5 years ago by augustss

  • Cc lennart@… added
  • Component changed from Compiler to Runtime System
  • severity changed from normal to major

Since this seems to be similar to #2650 I've bumped it to major severity.
There are also patches in #2650, so it would be nice to see those applied.
(It's already been moved forward several times.)

comment:14 in reply to: ↑ 12 ; follow-up: Changed 5 years ago by claus

This appears to be a duplicate of #2650? I have some partial fixes for the issue there.

So the fix to add _O_NOINHERIT when opening files could be applied, but I'm not sure whether it will have any undesirable consequences. Anyone?

If there is no way to implement the close_fds parameter to createProcess properly on Windows, that should at least be mentioned in the documentation! Since the parameter is there, I'd prefer to have it implemented, rather than ignored, though.

Anyway, that means we can simplify the example, right?

module Main() where

import Control.Concurrent
import System.IO
import System.Process

my_system str = do
  (_,_,_,p) <- createProcess c
  waitForProcess p
  where c = CreateProcess { cmdspec = ShellCommand str,
                            cwd = Nothing,
                            env = Nothing,
                            std_in = Inherit,
                            std_out = Inherit,
                            std_err = Inherit,
                            close_fds = True } -- close_fds is ignored on windows!-(

main = do
  va <- newEmptyMVar
  vb <- newEmptyMVar
  forkIO $ p va vb "foo1.txt"
  takeMVar va
  forkIO $ my_system "sleep 60s" >> return () 
  putMVar vb ()
  putStrLn "Finished"

p va vb file = do
  h <- openFile file WriteMode
  putMVar va ()
  takeMVar vb
  hClose h
  h <- openFile file WriteMode
  putStrLn "Success?"
  hClose h

Seems to fail quite reliably here, while it works when the system call is commented out.

comment:15 Changed 5 years ago by duncan

I originally thought I saw two problems in Neil's example, but looking at it again I see that he's doing the same thing in two threads separately, so each child process is holding the file from both threads open. At some point as the two threads get out of sync then one will have it open while the parent tries to open it again.

So perhaps there's nothing happening here more than #2650. Which is good because we pretty much understand that one.

comment:16 in reply to: ↑ 14 Changed 5 years ago by duncan

Replying to claus:

If there is no way to implement the close_fds parameter to createProcess properly on Windows, that should at least be mentioned in the documentation! Since the parameter is there, I'd prefer to have it implemented, rather than ignored, though.

There is a way on XP at least for the case where you're not setting stdin/out. On Vista it looks like it can be made to work in all cases. See #2650.

comment:17 follow-up: Changed 5 years ago by simonmar

It's pretty clear we need to get this resolved. Here's my proposed plan, please tell me if any of this will have untoward consequences:

  • we use O_NOHINHERIT when opening files with System.IO.openFile.
  • when close_fds is True, and we are not redirecting any standard HANDLEs, we set bInheritHandles to FALSE when calling CreateProcess
  • we document the fact that close_fds only works on Windows when no Handles are being redirected.
  • we audit the libraries for other places that might need to create non-inheritable HANDLEs.

We can aim to get these fixes into 6.10.4, but since 6.10.4 is supposed to be the last stable release on the 6.10 branch (like 6.10.3 was :-) we'll need a lot of testing.

comment:18 Changed 5 years ago by simonmar

  • Milestone set to 6.10.4

comment:19 in reply to: ↑ 17 Changed 5 years ago by duncan

Replying to simonmar:

It's pretty clear we need to get this resolved. Here's my proposed plan, please tell me if any of this will have untoward consequences:

All looks fine to me.

  • when close_fds is True, and we are not redirecting any standard HANDLEs, we set bInheritHandles to FALSE when calling CreateProcess

We should also keep a ticket open on this one to fix it fully in Vista using new APIs. In Vista we can pass CreateProcess a list of handles that will be inherited, ie the stdin/out pipe handles.

comment:20 follow-up: Changed 5 years ago by NeilMitchell

So, to clarify:

  • What is the effect of close_fds on Windows?
  • If I redirect stdout and stdin to handles, will this bug still occur?

comment:21 in reply to: ↑ 20 ; follow-up: Changed 5 years ago by simonmar

Replying to NeilMitchell:

So, to clarify:

  • What is the effect of close_fds on Windows?

If you use close_fds and do not redirect stdin, stdout or stderr, then the child will not inherit any HANDLEs from the child. If you do redirect one or more standard handles, then HANDLEs created other than by System.IO and System.Process functions might be inherited, depending on how they were created.

  • If I redirect stdout and stdin to handles, will this bug still occur?

Not with respect to HANDLEs created by System.IO and System.Process, but it might occur if you manage to create HANDLEs using some other library. In that case you'll have to talk to the maintainer of said library and ask them to create their HANDLEs non-inheritable, or otherwise use Vista and wait until we implement the Vista-specific support for close_fds.

Ok, everybody clear now? :)

comment:22 in reply to: ↑ 21 Changed 5 years ago by duncan

Replying to simonmar:

Replying to NeilMitchell:

So, to clarify:

  • What is the effect of close_fds on Windows?

If you use close_fds and do not redirect stdin, stdout or stderr, then the child will not inherit any HANDLEs from the child. If you do redirect one or more standard handles, then HANDLEs created other than by System.IO and System.Process functions might be inherited, depending on how they were created.

This is the future intended behaviour. The current behaviour of course is that all file handles are opened in the inheritable state and all inheritable handles are always inherited by child processes, irrespective of the value of close_fds.

  • If I redirect stdout and stdin to handles, will this bug still occur?

Not with respect to HANDLEs created by System.IO and System.Process, but it might occur if you manage to create HANDLEs using some other library. In that case you'll have to talk to the maintainer of said library and ask them to create their HANDLEs non-inheritable, or otherwise use Vista and wait until we implement the Vista-specific support for close_fds.

It will also be the case that the pipes created to talk to one child process may be inherited by another child process started at a similar time. This is a problem if you are spawning processes in different security contexts. Sithin the same security context it should be benign because pipes have no equivalent to exclusive file locking which is what causes the permission denied errors.

comment:23 Changed 5 years ago by simonmar

  • Owner set to igloo
  • Type changed from bug to merge

Done. In base:

Wed May 20 06:09:26 PDT 2009  Simon Marlow <marlowsd@gmail.com>
  * add _O_NOINHERIT when opening files on Windows (see #2650)

In process (this also fixes the documentation for close_fds):

Wed May 20 07:07:26 PDT 2009  Simon Marlow <marlowsd@gmail.com>
  * partially implement close_fds on Windows (#3231)

comment:24 Changed 5 years ago by igloo

  • Resolution set to fixed
  • Status changed from new to closed

Both merged.

comment:25 Changed 5 years ago by yugr

  • Resolution fixed deleted
  • severity changed from major to normal
  • Status changed from closed to reopened
  • Type changed from merge to bug
  • Version changed from 6.10.2 to 6.10.4

I think the problem remains. This

module Main where

import System
import System.IO
import System.Process
import System.Directory

import Control.Monad

import Data.List

tempFile = "mytempfile.txt"

run :: FilePath? -> IO String
run exe = do

h <- openFile tempFile WriteMode?

pid <- runProcess exe [] Nothing Nothing Nothing (Just h) (Just h)
exitCode <- waitForProcess pid

hClose h

if exitCode /= ExitSuccess?

then error $ exe ++ " failed"
else do

readFile tempFile

main = replicateM_ 10 (run "dir")

gives me

* Exception: mytempfile.txt: openFile: permission denied (Permission denied)

comment:26 Changed 5 years ago by simonmar

  • Milestone changed from 6.10.4 to 6.12.1
  • Owner changed from igloo to simonmar
  • Status changed from reopened to new

We need to look at this before 6.12.1

comment:27 Changed 5 years ago by simonmar

  • Resolution set to invalid
  • Status changed from new to closed

Repeating the program, to get the markup right:

module Main where
import System
import System.IO
import System.Process
import System.Directory
import Control.Monad
import Data.List
import Control.Concurrent

tempFile = "mytempfile.txt"

run :: FilePath -> IO String
run exe = do
    h <- openFile tempFile WriteMode
    pid <- runProcess exe [] Nothing Nothing Nothing (Just h) Nothing
    exitCode <- waitForProcess pid
    hClose h

    if exitCode /= ExitSuccess
        then error $ exe ++ " failed"
        else readFile tempFile

main = replicateM_ 10 $ run "ls"

The program does indeed fail with openFile: permission denied, but it's not a problem with runProcess. In the else branch of the if you have a readFile which opens a lazy stream to read tempfile, and then the next iteration attempts to open the file for writing. It's illegal to have the same file open for both reading and writing, hence the error.

Unfortunately neither myself nor Duncan Coutts spotted the problem immediately, because we were looking in the wrong place. readFile considered harmful, IMO!

comment:28 follow-up: Changed 5 years ago by yugr

Fine but can I fix my code? I tried to use readFile $! tempFile (strict application) but it did not work... I think the task I want to solve here is rather general and needs solution.

comment:29 Changed 5 years ago by NeilMitchell

There are many solutions - a really simple one is to use http://hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data-ByteString.html#v%3AreadFile

That will read the file strictly, so you won't have a problem.

comment:3 in reply to: ↑ 28 ; follow-up: Changed 5 years ago by duncan

Replying to yugr:

Fine but can I fix my code? I tried to use readFile $! tempFile (strict application) but it did not work...

The point is readFile opens the file but does not read the content until you consume the String. The solution is either to 1) consume all the data, 2) use System.IO.withFile, or as Neil suggests, 3) to read the whole file into memory up front.

comment:31 Changed 5 years ago by yugr

Hi all,

Sorry to bother you once again but I think we may still have problem with #3231 (http://hackage.haskell.org/trac/ghc/ticket/3231).

Before I go into details have a look at the code below - is it free from problem with lazy IO which you mentioned above?

I am still running into errors with a small multi-process program:
bug.exe: mytempfile.txt: openFile: permission denied (Permission denied)
or if I remove if-then-else
bug.exe: DeleteFile?: permission denied ...

module Main where

import System
import System.IO
import System.Process (runProcess, waitForProcess)
import System.Directory (removeFile)

import Control.Monad (replicateM_)
import Control.Parallel (pseq)

import qualified Data.ByteString as B

run :: FilePath -> IO ()
run exe = do
  let tempFile = "mytempfile.txt"

  h <- openFile tempFile WriteMode

  exitCode <- waitForProcess =<< runProcess exe [] Nothing Nothing Nothing (Just h) (Just h)

  hClose h >> (if exitCode /= ExitSuccess then return () else B.readFile tempFile >>= B.putStr) >> removeFile tempFile

main = replicateM_ 100 (putStrLn "Next:" >> run "a.exe")

comment:32 Changed 5 years ago by simonmar

I don't see any problems with the code, although I don't know what the "a.exe" program is supposed to be. I substitued "ls" and it seemed to work for me with 6.12.

comment:33 Changed 5 years ago by yugr

Thanks, Simon.

Now goes the tricky part. Below you will find the text of a.exe. It is a simple WinAPI program which starts a few processes, sleeps a bit, terminates them and exits. With this program I receive the errors which I mentioned above:

bug.exe: mytempfile.txt: openFile: permission denied (Permission denied)

or if I remove if-then-else

bug.exe: DeleteFile

Here is my idea of what is happening. As you may know TerminateProcess does not necesseraly terminate the process immediatelly (or even terminate it at all). This is especially common if process holds some system resource (pipe handle in this case). So when main process exits some children may still be alive and keep handles for stdout and stderr open for writing (for a few milliseconds). This causes errors when I try to read file in Haskell (via B.readFile) or even if I try to remove it (via removeFile).

Now the question - should this problem (error when trying to read stdout/stderr of multiprocess program) be considered a feature or a bug? If it is a feature - can I handle it from inside Haskell (e.g. imagine that I do not have access to source of a.exe)?

Here is the program text. Error occurs both when compiled with Cygwin's gcc (just gcc tmp.c) and with Visual Studio's cl. If someone needs a Makefile or M$VS project - let me know and I will send you one.

#include <windows.h>

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

#define PIPE_NAME "\\\\.\\pipe\\MySmallPipe"
#define PIPE_BUF_SIZE 10
#define TIMEOUT 1000

#define NCLIENTS 30

int main(int argc, char *argv[]) {
	if( 1 == argc ) {
		//Caller

		struct {
			HANDLE hProcess;
			HANDLE hPipe;
			OVERLAPPED ov;
		} data[NCLIENTS];

		//Start clients
		int i;
		for(i = 0; i < NCLIENTS; ++i) {
			fprintf(stderr, "Starting %d\n", i);

			data[i].hPipe = CreateNamedPipe(
				PIPE_NAME,
				PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED,
				PIPE_TYPE_BYTE | PIPE_READMODE_BYTE | PIPE_WAIT,
				100,
				PIPE_BUF_SIZE,
				PIPE_BUF_SIZE,
				TIMEOUT,
				0
			);
			assert(INVALID_HANDLE_VALUE != data[i].hPipe);

			STARTUPINFO si;
			memset(&si, 0, sizeof (STARTUPINFO));
			si.cb = sizeof(STARTUPINFO);    

			PROCESS_INFORMATION pi;

			BOOL res = CreateProcess(0, strdup("a.exe 1"), 0, 0, TRUE, 0, 0, 0, &si, &pi);
			assert(res);

			data[i].hProcess = pi.hProcess;

			ConnectNamedPipe(data[i].hPipe, &data[i].ov);
		} //i

		fprintf(stderr, "Delay\n");
		Sleep(TIMEOUT);

		//Kill clients
		for(i = 0; i < NCLIENTS; ++i) {
			fprintf(stderr, "Terminating %d\n", i);
			BOOL res = TerminateProcess(data[i].hProcess, 0);

			DisconnectNamedPipe(data[i].hPipe);
			CloseHandle(data[i].hPipe);
		} //i
	} else {
		//Callee

		fprintf(stderr, "Started\n");

		BOOL res = WaitNamedPipe(PIPE_NAME, TIMEOUT);
		assert(res);

		HANDLE hPipe = CreateFile(
			PIPE_NAME,
			GENERIC_READ | GENERIC_WRITE,
			0,
			0,
			OPEN_EXISTING,
			0,
			0
		);

		//Sleep forever
		char buf[PIPE_BUF_SIZE];
		DWORD n;
		res = ReadFile(
			hPipe,
			buf,
			PIPE_BUF_SIZE,
			&n,
			0
		);
	}

	return 0;
}

Best regards,
Yuri

comment:34 Changed 5 years ago by yugr

  • Resolution invalid deleted
  • Status changed from closed to reopened

Report as bug until proven otherwise.

comment:35 Changed 4 years ago by igloo

  • Milestone changed from 6.12.1 to 6.12 branch
  • Type of failure set to None/Unknown

comment:36 Changed 4 years ago by igloo

  • Milestone changed from 6.12 branch to 6.12.3

comment:37 Changed 4 years ago by igloo

  • Milestone changed from 6.12.3 to 6.14.1
  • Priority changed from normal to low

comment:38 Changed 3 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:39 Changed 3 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:40 Changed 3 years ago by claudio

The basic problem is, that you have to deal with fatal and non-fatal errors.

The code in Win32/File.hsc regards non-fatal errors as fatal and the user doesn't have any chance to check if the error was indeed fatal because there's no way of getting the error code.

See http://support.microsoft.com/kb/316609

Here's a patch which fixes this problem for removeFile:

 -- File operations
 ----------------------------------------------------------------

+foreign import stdcall unsafe "windows.h Sleep"
+  sleep_priv :: DWORD -> IO ()
+
 deleteFile :: String -> IO ()
 deleteFile name =
   withTString name $ \ c_name ->
-  failIfFalse_ (unwords ["DeleteFile",show name]) $
-    c_DeleteFile c_name
+    let
+      retries = 20
+      doDelete = c_DeleteFile c_name
+      deleteFailed = errorWin (unwords ["DeleteFile",show name])
+      when p s = if p then s else return ()
+      unless p s = when (not p) s
+      retryOrFail :: Int -> IO Bool -> IO ()
+      retryOrFail times action = do
+        ret <- action
+        unless ret (do
+                       when (times <= 0) deleteFailed
+                       err_code <- getLastError
+                       when (err_code /= (# const ERROR_SHARING_VIOLATION )) deleteFailed
+                       sleep_priv 100 >> retryOrFail (times - 1) action)
+    in
+      retryOrFail retries doDelete
 foreign import stdcall unsafe "windows.h DeleteFileW"
   c_DeleteFile :: LPCTSTR -> IO Bool

This made gitit and darcs work under Windows for me.

Just my 2 cents.

comment:41 Changed 3 years ago by simonmar

Thanks, I'll look into this. Incidentally, if applications really do have to manually sleep and retry a few times after getting ERROR_SHARING_VIOLATION, that is a completely broken design. The call should block, possibly with a timeout.

comment:42 Changed 3 years ago by simonmar

Just pushed to Win32:

commit 085b11285b6adbc6484d9c21f5e0b8105556869c
Author: Simon Marlow <marlowsd@gmail.com>
Date:   Tue Aug 16 13:53:39 2011 +0100

    If a file operation fails with ERROR_SHARING_VIOLATION, wait and retry
    a few times as per recommendations in
    
    http://support.microsoft.com/kb/316609
    
    thanks to claudio on #3231 for the pointer and an initial patch, which
    I've refactored and extended to cover more operations.

It validates, but I'd like to test it a bit more before closing the ticket.

comment:43 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1
  • Priority changed from low to high

Bumping priority/milestone so we don't forget to test + close the ticket.

comment:44 Changed 2 years ago by simonmar

  • Component changed from Runtime System to libraries/base
  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from high to normal

Current status of this ticket

  • We have implemented automatic delay+retry for many of the filesystem operations in the win32 package (085b11285b6adbc6484d9c21f5e0b8105556869c), and this behaviour is inherited by System.Directory. This means that accidental sharing violations for things like removeFile do not happen any more.
  • However, we have not implemented delay+retry for System.IO.openFile. It is not clear how to do that easily, because openFile calls into msvcrt rather than Win32 directly. Ideally we should rewrite the IO library to call Win32 rather than msvcrt or mingw APIs.

I'll reproduce the test case from above. First, compile this C file with gcc tmp.c:

#include <windows.h>

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

#define PIPE_NAME "\\\\.\\pipe\\MySmallPipe"
#define PIPE_BUF_SIZE 10
#define TIMEOUT 1000

#define NCLIENTS 30

int main(int argc, char *argv[]) {
	if( 1 == argc ) {
		//Caller

		struct {
			HANDLE hProcess;
			HANDLE hPipe;
			OVERLAPPED ov;
		} data[NCLIENTS];

		//Start clients
		int i;
		for(i = 0; i < NCLIENTS; ++i) {
			fprintf(stderr, "Starting %d\n", i);

			data[i].hPipe = CreateNamedPipe(
				PIPE_NAME,
				PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED,
				PIPE_TYPE_BYTE | PIPE_READMODE_BYTE | PIPE_WAIT,
				100,
				PIPE_BUF_SIZE,
				PIPE_BUF_SIZE,
				TIMEOUT,
				0
			);
			assert(INVALID_HANDLE_VALUE != data[i].hPipe);

			STARTUPINFO si;
			memset(&si, 0, sizeof (STARTUPINFO));
			si.cb = sizeof(STARTUPINFO);    

			PROCESS_INFORMATION pi;

			BOOL res = CreateProcess(0, strdup("a.exe 1"), 0, 0, TRUE, 0, 0, 0, &si, &pi);
			assert(res);

			data[i].hProcess = pi.hProcess;

			ConnectNamedPipe(data[i].hPipe, &data[i].ov);
		} //i

		fprintf(stderr, "Delay\n");
		Sleep(TIMEOUT);

		//Kill clients
		for(i = 0; i < NCLIENTS; ++i) {
			fprintf(stderr, "Terminating %d\n", i);
			BOOL res = TerminateProcess(data[i].hProcess, 0);

			DisconnectNamedPipe(data[i].hPipe);
			CloseHandle(data[i].hPipe);
		} //i
	} else {
		//Callee

		fprintf(stderr, "Started\n");

		BOOL res = WaitNamedPipe(PIPE_NAME, TIMEOUT);
		assert(res);

		HANDLE hPipe = CreateFile(
			PIPE_NAME,
			GENERIC_READ | GENERIC_WRITE,
			0,
			0,
			OPEN_EXISTING,
			0,
			0
		);

		//Sleep forever
		char buf[PIPE_BUF_SIZE];
		DWORD n;
		res = ReadFile(
			hPipe,
			buf,
			PIPE_BUF_SIZE,
			&n,
			0
		);
	}

	return 0;
}

Now, compile this Haskell file with ghc bug.hs:

module Main where

-- import System
import System.IO
import System.Process (runProcess, waitForProcess)
import System.Directory (removeFile)

import Control.Monad (replicateM_)

import qualified Data.ByteString as B

run :: FilePath -> IO ()
run exe = do
  let tempFile = "mytempfile.txt"

  h <- openFile tempFile WriteMode

  exitCode <- waitForProcess =<< runProcess exe [] Nothing Nothing Nothing (Just h) (Just h)

  hClose h >> {- (if exitCode /= ExitSuccess then return () else B.readFile tempFile >>= B.putStr) >> -} removeFile tempFile

main = replicateM_ 100 (putStrLn "Next:" >> run "a.exe")

And run with ./bug.exe.

  • As is, the program will succeed with GHC 7.4.1 (but not earlier), because we fixed removeFile.
  • If the (if exitCode ...) is uncommented, then the program will fail, because we have not fixed openFile yet.

comment:45 follow-up: Changed 2 years ago by duncan

Possibly related here is: http://blogs.msdn.com/b/oldnewthing/archive/2012/03/29/10288686.aspx

It says that TerminateProcess is async and you have to wait for the process to really terminate before file locks are released. Of course here we think we are waiting for processes to terminate but perhaps we should double check how we're doing that.

comment:46 in reply to: ↑ 45 Changed 2 years ago by duncan

Replying to duncan:

Of course here we think we are waiting for processes to terminate but perhaps we should double check how we're doing that.

Hmm, seems we do call WaitForSingleObject on the process handle.

comment:47 Changed 19 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2
Note: See TracTickets for help on using tickets.