Opened 2 years ago

Closed 2 months ago

Last modified 7 days ago

#7695 closed bug (fixed)

Hang when locale-archive and gconv-modules are not there

Reported by: hpd
Owned by:
Priority: highest
Milestone: 7.10.2
Component: None
Version: 7.8.1
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets: #8977, #10298
Differential Revisions: Phab:D898

Description

Running a (statically) compiled program in an environment where /usr/lib/locale or /usr/lib/gconv (or ../lib64/..) do not exist causes it to hog the CPU and rapidly allocate memory. This happens, for example, in an empty chroot.

Steps to reproduce:

echo "main = return ()" > test.hs
ghc -static -optl-static -optl-pthread -o test test.hs
chroot `pwd` /test

Strace output:

$ /strace -e trace=file -e signal= /test
execve("/test", ["/test"], [/* 17 vars */]) = 0
open("/usr/lib64/locale/locale-archive", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de_DE.UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de_DE.utf8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de_DE/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de.UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de.utf8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/locale/de/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/gconv/gconv-modules", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

Strace output (with LC_ALL=C):

$ LC_ALL=C strace -e trace=file -e signal= /test
execve("/test", ["/test"], [/* 18 vars */]) = 0
open("/usr/lib64/gconv/gconv-modules.cache", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib64/gconv/gconv-modules", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

In both cases, there are no further system calls (apart from timer ticks and mmap).

I was also able to reproduce this behavior without a chroot, by manually replacing the paths shown in strace's output with nonexistent ones inside the binary, using a hex editor. I haven't tried dynamically linked binaries yet.

One problem may be that iconv_open keeps returning -1 when it doesn't find the files listed above.
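To check that suspicion outside of GHC, a small C probe can call iconv_open directly and report the failure. This is only a sketch: probe_iconv is a made-up helper name, and the charset names passed to it are up to the caller.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* probe_iconv is a hypothetical helper, not part of any library.
   Returns 1 if a converter from `from` to `to` can be opened, 0 otherwise. */
int probe_iconv(const char *to, const char *from) {
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1) {
        /* iconv_open signals failure with (iconv_t)-1 and sets errno,
           e.g. EINVAL when the requested conversion is not available. */
        fprintf(stderr, "iconv_open(\"%s\", \"%s\") failed: %s\n",
                to, from, strerror(errno));
        return 0;
    }
    iconv_close(cd);
    return 1;
}
```

Built statically and run inside the empty chroot, a call such as probe_iconv("UTF-8", "ISO-8859-1") would show whether the conversion modules can be loaded there.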

Change History (24)

comment:1 Changed 2 years ago by igloo

  • difficulty set to Unknown

It looks like you're right; iconv_open returns -1, so we throw an exception, but printing the exception means we try opening iconv again, and loop forever.
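The cycle described here can be modelled in a few lines of C. This is a toy model, not GHC's code: open_encoder stands in for an iconv_open that always fails, MAX_ATTEMPTS is an arbitrary cap so the demo terminates, and the have_ascii_fallback flag is just one assumed way of breaking the cycle.

```c
#define MAX_ATTEMPTS 1000  /* arbitrary cap; the real program has none */

/* Stand-in for an iconv_open that always fails, as in an empty chroot. */
static int open_encoder(void) { return -1; }

/* Returns the attempt on which the error message could finally be printed,
   or -1 if the cap was reached (the model's equivalent of looping forever). */
int try_report(int have_ascii_fallback) {
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        if (open_encoder() == 0)
            return attempt;          /* encoder available, message printed */
        /* Opening failed, so an error is raised. Printing that error
           needs an encoder as well. */
        if (have_ascii_fallback)
            return attempt;          /* print via a built-in encoder and stop */
        /* No fallback: printing the error retries open_encoder(). */
    }
    return -1;
}
```

In this model try_report(0) never manages to print the error, while try_report(1) gets out on the first attempt.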

comment:2 Changed 2 years ago by igloo

  • Milestone set to 7.8.1

comment:3 Changed 16 months ago by trommler

  • Milestone changed from 7.8.1 to 7.8.2
  • Operating System changed from Linux to Unknown/Multiple
  • Version changed from 7.6.2 to 7.8.1

This was also reported on Solaris. See #8977.

comment:4 Changed 16 months ago by thoughtpolice

  • Milestone changed from 7.8.2 to 7.8.3

comment:5 Changed 16 months ago by trommler

  • difficulty changed from Unknown to Moderate (less than a day)
  • Owner set to trommler
  • Type of failure changed from None/Unknown to Runtime crash

OK, I'll take a stab at that one.

comment:6 Changed 14 months ago by thoughtpolice

  • Milestone changed from 7.8.3 to 7.8.4

Moving to 7.8.4.

comment:7 Changed 10 months ago by thoughtpolice

  • Milestone changed from 7.8.4 to 7.10.1

Moving (in bulk) to 7.10.1

comment:8 follow-up: Changed 8 months ago by thomie

This bug seems to be fixed in HEAD (ghc-7.9.20141121), although I can not find the commit that might have fixed it.

We should probably add a regression test before closing.

comment:9 in reply to: ↑ 8 ; follow-up: Changed 8 months ago by trommler

  • difficulty changed from Moderate (less than a day) to Unknown
  • Owner trommler deleted

Replying to thomie:

This bug seems to be fixed in HEAD (ghc-7.9.20141121), although I can not find the commit that might have fixed it.

No, the bug is still present in HEAD @96d29b5. I just verified on openSUSE 13.2 i586.

Unfortunately, I don't have time to work on a fix right now.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 8 months ago by thomie

Replying to trommler:

No, the bug is still present in HEAD @96d29b5. I just verified on openSUSE 13.2 i586.

Unfortunately, I don't have time to work on a fix right now.

Strange. When I run the commands from the description with ghc-7.8.3, CPU usage indeed goes to 100% and the process never finishes. When I do the same with HEAD, everything works fine. I'm on Ubuntu 14.04 x86_64.

trommler: do you maybe have a better test for this bug?

comment:11 in reply to: ↑ 10 ; follow-up: Changed 8 months ago by trommler

Replying to thomie:

Strange. When I run the commands from the description with ghc-7.8.3, CPU usage indeed goes to 100% and the process never finishes. When I do the same with HEAD, everything works fine. I'm on Ubuntu 14.04 x86_64.

Strange. Perhaps I will try again with a freshly checked out tree. I am never sure if the submodules worked out right. But blame that on my lack of git skills.

trommler: do you maybe have a better test for this bug?

I use ghc --version.

comment:12 in reply to: ↑ 11 Changed 8 months ago by trommler

Replying to trommler:

Replying to thomie:

Strange. When I run the commands from the description with ghc-7.8.3, CPU usage indeed goes to 100% and the process never finishes. When I do the same with HEAD, everything works fine. I'm on Ubuntu 14.04 x86_64.

Strange. Perhaps I will try again with a freshly checked out tree. I am never sure if the submodules worked out right. But blame that on my lack of git skills.

On a fresh clone of HEAD at 41c3545 I still see the issue on openSUSE 13.2 i586.

comment:13 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:14 Changed 3 months ago by simonmar

  • Milestone changed from 7.12.1 to 7.10.2
  • Priority changed from normal to highest

Let's get this one fixed; Michael Snoyman recently ran into an issue that sounds like the same thing when working with Docker.

comment:15 Changed 3 months ago by rwbarton

Oh, do you mean #10298? Indeed it looks the same.

comment:16 Changed 3 months ago by simonpj

See other dups (beyond #10298): #8977, #8928

comment:17 Changed 3 months ago by thoughtpolice

comment:18 Changed 2 months ago by hsyl20

Fixed with my patch for #10298 (see my comment there).

Can you check that you obtain the same result on your system with this code:

$> cat test.c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int main() {
   setlocale(LC_CTYPE, "");
   printf("LC_CTYPE: %s\n", nl_langinfo(CODESET));
   return 0;
}

$> gcc -static -L/path/to/static/glibc test.c -o test
$> sudo chroot `pwd` /test
LC_CTYPE: ANSI_X3.4-1968

I don't know if the default LC_CTYPE is the same everywhere...

(You should get the same result if you have compiled the glibc with --disable-shared when you call ./test directly without the chroot)

comment:19 Changed 2 months ago by thoughtpolice

  • Differential Revisions set to Phab:D898
  • Status changed from new to patch

comment:20 Changed 2 months ago by Austin Seipp <austin@…>

In e28462de700240288519a016d0fe44d4360d9ffd/ghc:

base: fix #10298 & #7695

Summary:
This applies a patch from Reid Barton and Sylvain Henry, which fixes a
disastrous infinite loop when iconv fails to load locale files, as
specified in #10298.

The fix is a bit of a hack but should be fine - for the actual reasoning
behind it, see `Note [Disaster and iconv]`.

In addition to this fix, we also patch up the IO Encoding utilities to
recognize several variations of the 'ASCII' encoding (including its
aliases) directly so that GHC can do conversions without iconv. This
allows a static binary to sit in an initramfs.

Authored-by: Reid Barton
Authored-by: Sylvain Henry
Signed-off-by: Austin Seipp

Test Plan: Eyeballed it.

Reviewers: rwbarton, hvr

Subscribers: bgamari, thomie

Differential Revision: https://phabricator.haskell.org/D898

GHC Trac Issues: #10298, #7695
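
The alias recognition mentioned in the commit message above can be pictured as a simple name lookup that runs before iconv is consulted. The C sketch below is an illustration only: is_ascii_alias is a made-up name, and the list holds the IANA-registered aliases of US-ASCII plus plain "ASCII", which is an assumption about a reasonable set rather than the exact list GHC uses.

```c
#include <stddef.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Assumed alias set: IANA names for US-ASCII plus plain "ASCII". */
static const char *const ascii_aliases[] = {
    "ASCII", "US-ASCII", "ANSI_X3.4-1968", "ANSI_X3.4-1986",
    "ISO_646.irv:1991", "ISO646-US", "iso-ir-6", "us",
    "IBM367", "cp367", "csASCII",
};

/* Returns 1 if `name` is one of the aliases above (case-insensitive). */
int is_ascii_alias(const char *name) {
    size_t n = sizeof ascii_aliases / sizeof ascii_aliases[0];
    for (size_t i = 0; i < n; i++) {
        if (strcasecmp(name, ascii_aliases[i]) == 0)
            return 1;
    }
    return 0;
}
```

With a check like this, the codeset name ANSI_X3.4-1968 reported by nl_langinfo in comment:18 would be handled without loading any gconv module.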

comment:21 Changed 2 months ago by thoughtpolice

  • Status changed from patch to merge

comment:22 Changed 2 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from merge to closed

Merged to ghc-7.10.

comment:23 Changed 3 weeks ago by Ben Gamari <ben@…>

In d69dfba4e27c4ec33459906fd87c9a56a371f510/ghc:

Fix self-contained handling of ASCII encoding

D898 was primarily intended to fix hangs in the event that iconv was
unavailable (namely #10298 and #7695). In addition to this fix, it also
introduced self-contained handling of ANSI terminals to allow compiled
executables to run in minimal environments lacking iconv.

However, the behavior that the patch introduced is highly suspicious.
Specifically, it gives the user a UTF-8 encoding even if they requested
ASCII.

This has the potential to break quite a lot of code. At the very least it
breaks GHC's Unicode terminal detection logic, which attempts to catch
an invalid character when encoding a pair of smart-quotes. Of course,
this exception will never be thrown if a UTF-8 encoder is used.

Here we use the `char8` encoding to handle requests for ASCII encodings
in the event that we find iconv to be non-functional.

Fixes #10623.

Test Plan: Validate with T8959a

Reviewers: rwbarton, hvr, austin, hsyl20

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1059

GHC Trac Issues: #10623
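
The detection problem described in the commit message above can be shown with two tiny encoders. This C sketch is not GHC's detection logic; encoder_fn, encode_ascii, encode_utf8 and supports_smart_quotes are illustrative names. It only demonstrates why a probe that relies on the encoder rejecting U+2018 and U+2019 can never fail when a UTF-8 encoder is handed out in place of ASCII.

```c
/* An encoder writes the bytes for one code point and returns the byte
   count, or -1 if the code point cannot be represented. */
typedef int (*encoder_fn)(unsigned long cp, unsigned char *out);

/* Strict ASCII: rejects anything above U+007F. */
int encode_ascii(unsigned long cp, unsigned char *out) {
    if (cp > 0x7F) return -1;
    out[0] = (unsigned char)cp;
    return 1;
}

/* UTF-8: accepts every scalar value up to U+10FFFF. */
int encode_utf8(unsigned long cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) return -1;  /* surrogates */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return -1;
}

/* Probe: can this encoder represent a pair of smart quotes? */
int supports_smart_quotes(encoder_fn enc) {
    unsigned char buf[4];
    return enc(0x2018, buf) > 0 && enc(0x2019, buf) > 0;
}
```

If a request for ASCII is silently answered with encode_utf8, the probe always succeeds, so the caller wrongly concludes that the terminal can display Unicode.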

comment:24 Changed 7 days ago by Ben Gamari <ben@…>

In dbe6dac/ghc:

When iconv is unavailable, use an ASCII encoding to encode ASCII

D898 and D1059 implemented a fallback behavior to handle the case
that the end user's iconv installation is broken (typically due to
running inside a chroot in which the necessary locale files and/or
gconv modules have not been installed). In this case, if the
program requests an ASCII locale, GHC's char8 encoding is used
rather than the program failing.

However, silently mangling data like char8 does when the programmer
did not ask for it is poor behavior, for reasons described in D1059.

This commit implements an ASCII encoding and uses it in the fallback
case when iconv is unavailable and the user has requested ASCII.

Test Plan:
Added tests for the encodings defined in Latin1.
Also, manually ran a statically-linked executable of that test
in a chroot and the tests passed (up to the ones that call
mkTextEncoding "LATIN1", since there is no fallback from iconv
for that case yet).

Reviewers: austin, hvr, hsyl20, bgamari

Reviewed By: hsyl20, bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1085

GHC Trac Issues: #7695, #10623
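
The difference between the two fallbacks discussed in the commit message above can be sketched in C. This assumes that char8 maps each code point to a byte by taking its value modulo 256, which is how GHC's documentation describes it; encode_char8 and encode_ascii_strict are illustrative names, not GHC functions.

```c
#include <stddef.h>

/* char8-style encoding: keep only the low 8 bits of each code point.
   Never fails, but silently mangles anything above U+00FF. */
void encode_char8(const unsigned long *cps, size_t n, unsigned char *out) {
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned char)(cps[i] & 0xFF);
}

/* Strict ASCII encoding: stops at the first code point above U+007F.
   Returns the number of code points encoded; a result smaller than n
   means the input contained something ASCII cannot represent. */
size_t encode_ascii_strict(const unsigned long *cps, size_t n,
                           unsigned char *out) {
    for (size_t i = 0; i < n; i++) {
        if (cps[i] > 0x7F)
            return i;
        out[i] = (unsigned char)cps[i];
    }
    return n;
}
```

For an input starting with U+2018, encode_char8 quietly emits the control byte 0x18, whereas encode_ascii_strict returns 0 and lets the caller report the problem.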