Opened 3 years ago

Closed 2 years ago

Last modified 2 years ago

#10298 closed bug (duplicate)

Infinite loop when shared libraries are unavailable

Reported by: snoyberg Owned by: simonmar
Priority: normal Milestone: 7.10.2
Component: Runtime System Version: 7.10.1
Keywords: Cc: simonmar, trommler
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case:
Blocked By: Blocking:
Related Tickets: #7695 Differential Rev(s): Phab:D898
Wiki Page:

Description (last modified by thoughtpolice)

Originally discussed at: https://groups.google.com/d/msg/haskell-cafe/5ZTv5mCG_HI/hBJ-VkdpxdoJ. Note that this was originally discussed as a static linking and Docker issue, but in fact affects dynamically linked executables without any containerization.

Other examples of the same bug: #7695, #8977, #8928

I've put together the following script that reproduces my problem:

cat > hello.hs <<EOF
main :: IO ()
main = putStrLn "Hello World"
EOF
ghc hello.hs

rm -rf tmp
mkdir tmp

cp hello tmp

mkdir -p tmp/usr/lib/x86_64-linux-gnu
cp /usr/lib/x86_64-linux-gnu/libgmp.so.10 tmp/usr/lib/x86_64-linux-gnu

mkdir -p tmp/lib/x86_64-linux-gnu
cp \
    /lib/x86_64-linux-gnu/libm.so.6 \
    /lib/x86_64-linux-gnu/librt.so.1 \
    /lib/x86_64-linux-gnu/libdl.so.2 \
    /lib/x86_64-linux-gnu/libc.so.6 \
    /lib/x86_64-linux-gnu/libpthread.so.0 \
    tmp/lib/x86_64-linux-gnu

mkdir -p tmp/lib64
cp /lib64/ld-linux-x86-64.so.2 tmp/lib64

#mkdir -p tmp/usr/lib/x86_64-linux-gnu/gconv/
#cp \
#    /usr/lib/x86_64-linux-gnu/gconv/UTF-32.so \
#    /usr/lib/x86_64-linux-gnu/gconv/gconv-modules \
#    tmp/usr/lib/x86_64-linux-gnu/gconv

sudo chroot tmp /hello

If I uncomment the block that copies the gconv files, the program runs as expected. However, without those files copied, the program burns CPU and consumes memory until killed by the OS. I ran strace on a similar executable, and got the results at:

https://gist.github.com/snoyberg/095efb17e36acc1d6360

Note that this problem also occurs with statically linked executables when some of the other dynamically linked libraries are not available in the chroot environment.

Expected behavior: ideal would be not to require the gconv files and other shared libraries be present, especially when statically linked. Barring that, it would be much better if the RTS could produce a meaningful error message about the missing file. Note that strace does demonstrate that a open system call is failing, e.g.:

open("/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = -1 ENOENT (No such file or directory)

Reproduced on GHC 7.8.4 and 7.10.1, on Ubuntu 14.04 64-bit.

Change History (13)

comment:1 Changed 3 years ago by rwbarton

I was able to reproduce the loop, but rather than copying /usr/lib/x86_64-linux-gnu/gconv/ into the chroot, I tried copying in /usr/lib/locale/ instead, and that was also sufficient to let the program run normally. It's to be expected that a program built with GHC needs those locale files, since String IO is locale-aware. Of course an infinite loop is not so easy to debug, and it'd be nice to have an error message.

In fact the error status from iconv_open is being correctly checked, and converted to an exception, which is then caught and displayed by the default exception handler. The trouble is that the display of exceptions is also locale-aware...

Curiously even an empty main = return () triggers this behavior with 7.8.4, but it runs successfully on 7.10.1. I couldn't figure out why, perhaps some change in the IO manager?

I don't have any good ideas about how to improve this situation. Maybe try to set up the locale for IO at some point, catch the exception if it fails and barf() rather than using regular IO to display the exception. But when exactly?

comment:2 Changed 3 years ago by rwbarton

Actually there are more hidden traps. The IOError that is raised from the failure of iconv_open is produced by errnoToIOError, which is defined as

errnoToIOError loc errno maybeHdl maybeName = unsafePerformIO $ do
    str <- strerror errno >>= peekCString
    return (IOError maybeHdl errType loc str (Just errno') maybeName)
    where -- ...

This means that matching on the IOError constructor will recursively raise another exception, since peekCString uses the user's locale.

In GHC.TopHandler, real_handler matches on the IOError constructor to decide how to exit. So a possible fix is

diff --git a/libraries/base/GHC/TopHandler.hs b/libraries/base/GHC/TopHandler.hs
index d7c0038..5d4094a 100644
--- a/libraries/base/GHC/TopHandler.hs
+++ b/libraries/base/GHC/TopHandler.hs
@@ -157,14 +157,25 @@ real_handler exit se = do
            Just (ExitFailure n) -> exit n
 
            -- EPIPE errors received for stdout are ignored (#2699)
-           _ -> case fromException se of
+           _ -> catch (case fromException se of
                 Just IOError{ ioe_type = ResourceVanished,
                               ioe_errno = Just ioe,
                               ioe_handle = Just hdl }
                    | Errno ioe == ePIPE, hdl == stdout -> exit 0
                 _ -> do reportError se
                         exit 1
-
+                ) (disasterHandler exit)
+
+-- don't use errorBelch() directly, because we cannot call varargs functions
+-- using the FFI.
+foreign import ccall unsafe "HsBase.h errorBelch2"
+   errorBelch :: CString -> CString -> IO ()
+
+disasterHandler :: (Int -> IO a) -> IOError -> IO a
+disasterHandler exit _ =
+  withCAString "%s" $ \fmt ->
+  withCAString "encountered an exception while trying to report an exception" $ \msg ->
+  errorBelch fmt msg >> exit 1
 
 -- try to flush stdout/stderr, but don't worry if we fail
 -- (these handles might have errors, and we don't want to go into

though it feels a bit artificial.

comment:3 Changed 3 years ago by trommler

Cc: trommler added

comment:4 Changed 3 years ago by rwbarton

I suppose another approach would be to store the result of strerror in a new field the IOError as a ByteString, and display it using non-locale-aware IO (currently we do a round-trip through String, which should be the identity). Then put the field with the String version behind its own unsafePerformIO.

comment:5 Changed 3 years ago by simonpj

Description: modified (diff)

comment:6 Changed 3 years ago by thoughtpolice

Description: modified (diff)

After discussion with Simon & Simon, the patch above looks OK to basically work around #7695 and this ticket sensibly - Reid, would you like to put a patch on Phabricator perhaps, just so we can be sure nothing gets broken by this?

comment:7 Changed 3 years ago by thoughtpolice

Description: modified (diff)

After discussion with Simon & Simon, the patch above looks OK to basically work around #7695 and this ticket sensibly - Reid, would you like to put a patch on Phabricator perhaps, just so we can be sure nothing gets broken by this?

comment:8 Changed 3 years ago by thoughtpolice

Oh, also, we should definitely have a Note explaining exactly what's going on here, just in case we change it later.

comment:9 Changed 3 years ago by hsyl20

I have an additional fix to propose for this bug: handle Unicode to ASCII conversions in GHC without Iconv:

diff --git a/libraries/base/GHC/IO/Encoding.hs b/libraries/base/GHC/IO/Encoding.hs
index 31683b4..c67f317 100644
--- a/libraries/base/GHC/IO/Encoding.hs
+++ b/libraries/base/GHC/IO/Encoding.hs
@@ -243,6 +243,7 @@ mkTextEncoding' cfm enc = case [toUpper c | c <- enc, c /= '-'] of
     "UTF32"   -> return $ UTF32.mkUTF32 cfm
     "UTF32LE" -> return $ UTF32.mkUTF32le cfm
     "UTF32BE" -> return $ UTF32.mkUTF32be cfm
+    "ANSI_X3.41968" -> return char8 -- match "ANSI_X3.4-1968" (ASCII)
 #if defined(mingw32_HOST_OS)
     'C':'P':n | [(cp,"")] <- reads n -> return $ CodePage.mkCodePageEncoding cfm cp
     _ -> unknownEncodingErr (enc ++ codingFailureModeSuffix cfm)

It seems that static binaries fall back to ASCII even if the current locale is UTF-8. ASCII is identified with the string "ANSI_X3.4-1968" on my system (Linux 4.0, glibc 2.21). Maybe we should match other possible aliases?

I tested this patch with a single static binary in a initramfs and it works fine now. It should fix #7695 too (single static binary in a chrooted environment).

comment:10 Changed 3 years ago by thoughtpolice

Differential Rev(s): Phab:D898
Milestone: 7.10.2
Status: newpatch

comment:11 Changed 2 years ago by Austin Seipp <austin@…>

In e28462de700240288519a016d0fe44d4360d9ffd/ghc:

base: fix #10298 & #7695

Summary:
This applies a patch from Reid Barton and Sylvain Henry, which fix a
disasterous infinite loop when iconv fails to load locale files, as
specified in #10298.

The fix is a bit of a hack but should be fine - for the actual reasoning
behind it, see `Note [Disaster and iconv]` for more info.

In addition to this fix, we also patch up the IO Encoding utilities to
recognize several variations of the 'ASCII' encoding (including its
aliases) directly so that GHC can do conversions without iconv. This
allows a static binary to sit in an initramfs.

Authored-by: Reid Barton <rwbarton@gmail.com>
Authored-by: Sylvain Henry <hsyl20@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>

Test Plan: Eyeballed it.

Reviewers: rwbarton, hvr

Subscribers: bgamari, thomie

Differential Revision: https://phabricator.haskell.org/D898

GHC Trac Issues: #10298, #7695

comment:12 Changed 2 years ago by thoughtpolice

Resolution: duplicate
Status: patchclosed

Fixed; I'm closing this as a duplicate of #7695, which I'll move to merge for 7.10.2.

I couldn't unfortunately think of a way to introduce a reliable test here, without chroot/sudo, which is unsuitable for the testsuite. But I can confirm it fixes the above program - in fact, it runs successfully now, thanks to the extra patch from Sylvain, and in the case iconv errors, an exception should be thrown properly.

comment:13 Changed 2 years ago by Ben Gamari <ben@…>

In d69dfba4e27c4ec33459906fd87c9a56a371f510/ghc:

Fix self-contained handling of ASCII encoding

D898 was primarily intended to fix hangs in the event that iconv was
unavailable (namely #10298 and #7695). In addition to this fix, it also
introduced self-contained handling of ANSI terminals to allow compiled
executables to run in minimal environments lacking iconv.

However, the behavior that the patch introduced is highly suspicious.
Specifically, it gives the user a UTF-8 encoding even if they requested
ASCII.

This has the potential to break quite a lot of code. At very least it
breaks GHC's Unicode terminal detection logic, which attempts to catch
an invalid character when encoding a pair of smart-quotes. Of course,
this exception will never be thrown if a UTF-8 encoder is used.

Here we use the `char8` encoding to handle requests for ASCII encodings
in the event that we find iconv to be non-functional.

Fixes #10623.

Test Plan: Validate with T8959a

Reviewers: rwbarton, hvr, austin, hsyl20

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1059

GHC Trac Issues: #10623
Note: See TracTickets for help on using tickets.