Opened 2 years ago

Closed 2 years ago

#7522 closed bug (fixed)

segfault when ignoring invalid byte sequence when decoding UTF8//IGNORE

Reported by: ganesh
Owned by: batterseapower
Priority: high
Milestone: 7.6.2
Component: Compiler
Version: 7.6.1
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Revisions:

Description

The code below segfaults on a variety of platforms and GHC versions - I've tried 7.4.1 and 7.6.1, on Windows and Linux.

It seems to be related to (a) the specific choice of UTF8 (it doesn't happen with UTF16, UTF32, etc.) and (b) having the invalid byte sequence at the end of the input being decoded.

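import qualified Data.ByteString as B
import System.IO

-- Scratch file that receives the invalid input.
tempFile :: FilePath
tempFile = "temp"

main :: IO ()
main = do
   utf8Ignore <- mkTextEncoding "UTF8//IGNORE"
   -- 0x80 is a lone continuation byte, so the file is invalid UTF-8
   -- and the invalid sequence sits at the very end of the input.
   B.writeFile tempFile (B.pack [128])
   h <- openFile tempFile ReadMode
   hSetEncoding h utf8Ignore
   -- hGetContents is lazy: the bytes are decoded when putStrLn forces the string.
   hGetContents h >>= putStrLn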
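
The two observations above, (a) and (b), could be checked side by side with a variant of the reproduction that takes the encoding name and the input bytes as parameters. This is only a sketch: `decodeVia` and `scratch` are hypothetical names introduced here, not part of the report, and the byte sequences are illustrative choices. No expected output is given, since behaviour depends on the GHC version in use.

```haskell
import qualified Data.ByteString as B
import Control.Exception (evaluate)
import Data.Word (Word8)
import System.IO

-- Hypothetical helper (not from the report): write the bytes to a scratch
-- file, then read them back through a Handle using the named encoding.
decodeVia :: String -> [Word8] -> IO String
decodeVia encName bytes = do
    enc <- mkTextEncoding encName
    B.writeFile scratch (B.pack bytes)
    h <- openFile scratch ReadMode
    hSetEncoding h enc
    s <- hGetContents h
    _ <- evaluate (length s)   -- force the whole decode before closing
    hClose h
    return s
  where
    scratch = "temp"           -- assumed scratch path, as in the report

main :: IO ()
main = do
    -- Unbuffered output, so earlier results are visible even if a later case crashes.
    hSetBuffering stdout NoBuffering
    -- (a) Same trailing invalid byte, but a different encoding.
    decodeVia "UTF16//IGNORE" [128] >>= print
    -- (b) Invalid byte followed by valid input, so it is not at the end.
    decodeVia "UTF8//IGNORE" [128, 97] >>= print
    -- The reported case, run last: a trailing invalid byte under UTF8//IGNORE.
    decodeVia "UTF8//IGNORE" [128] >>= print
```

Running the reported case last means a crash on an affected GHC should not hide the results of the two comparison cases.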

Change History (2)

comment:1 Changed 2 years ago by simonmar

  • difficulty set to Unknown
  • Milestone set to 7.6.2
  • Owner set to batterseapower
  • Priority changed from normal to high

I suggest Max might be the best person to look at this, since he implemented the support for //IGNORE.

comment:2 Changed 2 years ago by batterseapower

  • Resolution set to fixed
  • Status changed from new to closed

Fixed and tested in 1ac38ef6e9decc3f4763848f3d43c0cc68d1d390 of the base library.
