Opened 8 years ago

Last modified 2 years ago

#4144 new bug

Exception: ToDo: hGetBuf - when using custom handle infrastructure

Reported by: AntoineLatter Owned by:
Priority: high Milestone:
Component: Core Libraries Version: 7.6.1
Keywords: Cc: AntoineLatter, joeyadams3.14159@…, core-libraries-committee@…, ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by bgamari)

When trying to use the custom handle infrastructure, hGetContents fails like so:

*** Exception: ToDo: hGetBuf

This exception occurs twice in GHC.IO.Handle.Text

The handle implementation I'm using is attached.

It would be neat if I could pass along some witness that my device implements RawDevice, then we could just run the same code that we use for file-descriptors. But I'd be happy enough with a general solution, as I just plan to use this for testing.

Attachments (1)

ByteStringHandle.hs (3.4 KB) - added by AntoineLatter 8 years ago.
ByteString handle

Download all attachments as: .zip

Change History (25)

Changed 8 years ago by AntoineLatter

Attachment: ByteStringHandle.hs added

ByteString handle

comment:1 Changed 8 years ago by AntoineLatter

Code to demonstrate crash, given the above module and -XOverloadedStrings:

bsHandle "test" "<fake file>" >>= Data.ByteString.Char8.hGetContents

comment:2 Changed 8 years ago by AntoineLatter

Cc: AntoineLatter added

comment:3 Changed 8 years ago by simonmar

Resolution: fixed
Status: newclosed

This should fix it:

Thu Jun 24 14:04:25 BST 2010  Simon Marlow <marlowsd@gmail.com>
  * make the hGetBuf/hPutBuf family work with non-FD Handles (#4144)

if it doesn't find an FD in the Handle, it uses the normal buffered operations. Perhaps we could improve matters by allowing Handles to optionally provide RawIO, but that's something for another day.

Prelude ByteStringHandle> bsHandle "test" "<fake file>" >>= Data.ByteString.Char8.hGetContents
Loading package bytestring-0.9.1.7 ... linking ... done.
"test"

comment:4 Changed 6 years ago by joeyadams

Architecture: x86_64 (amd64)Unknown/Multiple
Cc: joeyadams3.14159@… added
Component: Compilerlibraries/base
Operating System: MacOS XUnknown/Multiple
Resolution: fixed
Status: closednew
Version: 6.12.37.6.1

The fix for this is incomplete. I implemented a custom handle to make network I/O interruptible on Windows, but Holger Reinhardt revealed a case where `Todo: hPutBuf` comes up.

It looks like bufWrite (in GHC.IO.Handle.Text), when told to write a chunk that is larger than the buffer size, does a RawIO writeChunk, which only works for FD. hGetBufSome has the same problem.

Another problem is that the RawIO methods are sometimes called with a size argument of zero. This can lead to failed system calls and unexpected side effects unless the implementation filters out zero-length calls. Also, if RawIO.read buf 0 returns 0, the callee might treat it as EOF.

comment:5 Changed 6 years ago by joeyadams

Owner: set to joeyadams

I plan to fix this by adding the following methods to BufferedIO:

  readBuffered :: dev -> Buffer Word8 -> Ptr Word8 -> Int -> IO (Int, Buffer Word8)

  readBuffered0 :: dev -> Buffer Word8 -> Ptr Word8 -> Int -> IO (Maybe Int, Buffer Word8)

  writeBuffered :: dev
                -> Bool             -- ^ If 'True', flush after writing.
                -> Buffer Word8
                -> Ptr Word8
                -> Int
                -> IO (Buffer Word8)

  writeBuffered0 :: dev
                 -> Bool -> Buffer Word8 -> Ptr Word8 -> Int
                 -> IO (Int, Buffer Word8)

The default implementations will use the fill and flush methods of BufferedIO, but an instance may provide more efficient substitutes. RawIO-based methods will be provided, which will go directly to the device when possible and efficient.

This will move a lot of the complexity of the hGetBuf, hPutBuf, etc. functions out of GHC.IO.Handle.Text and into GHC.IO.BufferedIO (in the form of default methods and RawIO drop-ins).

I will be changing the behavior of hGetBufSome slightly: if the buffer is not empty, it will fetch its contents and return immediately, rather than trying to squeeze more bytes out of the device with non-blocking fills and reads. See #5843

comment:6 Changed 6 years ago by joeyadams

Here are my changes, if anyone wants to review them:

https://github.com/joeyadams/packages-base/tree/custom-device-fix

sh validate passes on Ubuntu 10.04 64-bit (except for some unrelated tests which fail because my installed LLVM is too old). I'm still working on a test case.

comment:7 Changed 6 years ago by simonpj

difficulty: Unknown
Owner: changed from joeyadams to simonmar

Simon M, you look like the most plausible reviewer, since you did the earlier patch.

Simon

comment:8 Changed 6 years ago by simonpj

Milestone: 7.8.1
Priority: normalhigh
Status: newpatch

comment:9 Changed 6 years ago by simonmar

In principle this is a good change, however I don't agree with the changes in semantics of hGetBufNonBlocking and hGetBufSome. As it stands right now, hGetBuf, hGetBufNonBlocking and hGetBufSome will all read the same amount of data, if the data is available without blocking. But in your version (unless I've misread it), hGetBufNonBlocking will read at most a buffer of data, which seems to me to be an abstraction violation - the buffer size is supposed to be invisible to callers of the hGetBuf family.

Why make this change? It should be possible to loop and read more data in the same way as hGetBuf.

The only other minor quibble I have with this patch is that the documentation for readBuffered and friends could be improved - it isn't clear what the purpose of the buffer argument is (without reading the code).

comment:10 Changed 6 years ago by joeyadams

Thanks for the review.

As it stands right now, hGetBuf, hGetBufNonBlocking and hGetBufSome will all read the same amount of data, if the data is available without blocking.

I don't think this is the case anymore; see #5843 and commit 370fc0b.

The issue is that some devices don't support non-blocking I/O (e.g. file handles on Windows). Here's the problematic sequence:

  • Call hWaitForInput. It returns True when the buffer is not empty.
  • Call hGetBufSome, which we expect will not block. It first reads bytes from the buffer, then tries to do a "non-blocking" read. Since the non-blocking IO methods block on Windows, hGetBufSome blocks, even after it has read some data.

So here are some of our options:

  • Change semantics of hGetBufSome so that it will only block if there is no input. However, there is no guarantee that hGetBufSome will return all immediately available bytes.
  • On Windows, simulate nonblocking and IODevice.ready by continuously issuing reads in a forked thread to a buffer, and having I/O functions read from that buffer.
  • Deprecate hGetBufNonBlocking, hPutBufNonBlocking, hWaitForInput, and the non-blocking RawIO and BufferedIO methods. Come up with an alternative design that does not require the system to support non-blocking operation.

comment:11 Changed 6 years ago by simonmar

Ok, I see that hGetBufSome already has the behaviour that I claimed is erroneous. That's sad.

I'd really like to declare this to be a bug in the Windows implementation of Handles, but perhaps that's impractical. I think it might be possible to implement non-blocking I/O on Windows, but you have to do it differently for every type of Handle (console, socket, com port etc.). We do have a working implementation of hWaitForInput (see libraries/base/cbits/inputReady.c) but it's horrible.

Maybe deprecating the non-blocking APIs is the right way, since we are providing the ability to do asynchronous I/O using threads, and that's much nicer. But before we do that, we should look at how people are using these APIs. The only user of hGetBufSome that I know of, lazy bytestring, works just fine with the "read a random amount of data" semantics.

comment:12 Changed 6 years ago by joeyadams

For Windows, couldn't we implement non-blocking I/O correctly by first calling the inputReady.c function? That is, change readRawBufferPtrNoBlock and writeRawBufferPtrNoBlock to first call fdReady to make sure the operation won't block before proceeding.

comment:13 Changed 6 years ago by simonmar

You can't do an atomic non-blocking read, e.g. if another thread happened to read from the console and grabbed the character after you had called fdReady but before calling read, then the read would block. It is unlikely to happen, because someone would have to use raw System.Win32 operations to get separate access to the console or whatever, but it's possible.

comment:14 Changed 6 years ago by simonmar

see also #806

comment:15 Changed 5 years ago by igloo

So what's the proposal here exactly? Is it to deprecate these?:

System.IO.hGetBufSome
System.IO.hGetBufNonBlocking
System.IO.hPutBufNonBlocking
Data.ByteString.hGetNonBlocking
Data.ByteString.hPutNonBlocking

There are a couple of uses of hGetNonBlocking in haskeline.

comment:16 Changed 5 years ago by joeyadams

Replying to igloo:

So what's the proposal here exactly? Is it to deprecate these?:

System.IO.hGetBufSome
System.IO.hGetBufNonBlocking
System.IO.hPutBufNonBlocking
Data.ByteString.hGetNonBlocking
Data.ByteString.hPutNonBlocking

There are a couple of uses of hGetNonBlocking in haskeline.

No, keep hGetBufSome. It's useful for reading lines of text from a socket, without waiting for some arbitrary limit to be reached, and without reading a byte at a time.

haskeline only uses System.IO.hWaitForInput and Data.ByteString.hGetNonBlocking for compatibility with GHC < 7.4.1, which does not have working encoding support (see #5436). Thanks for pointing these out.

comment:17 Changed 5 years ago by thoughtpolice

Owner: simonmar deleted
Status: patchnew

comment:18 Changed 5 years ago by thoughtpolice

Milestone: 7.8.17.10.1

comment:19 Changed 4 years ago by thoughtpolice

Component: libraries/baseCore Libraries
Owner: set to ekmett

Moving over to new owning component 'Core Libraries'.

comment:20 Changed 4 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1

comment:21 Changed 4 years ago by ekmett

Cc: core-libraries-committee@… added
Owner: ekmett deleted

comment:22 Changed 3 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:23 Changed 3 years ago by bgamari

Cc: ekmett added
Description: modified (diff)
Milestone: 8.0.18.2.1

This won't be happening for 8.0.

comment:24 Changed 2 years ago by bgamari

Milestone: 8.2.1

De-milestoning since the solution isn't clear and no one has stepped up to handle this.

If you are affected by this issue please do pick this up!

Note: See TracTickets for help on using tickets.