Opened 2 months ago

Closed 8 weeks ago

#15892 closed bug (fixed)

Segmentation fault with ByteString

Reported by: akio
Owned by:
Priority: highest
Milestone: 8.6.3
Component: Compiler
Version: 8.6.2
Keywords:
Cc: osa1, simonmar, maoe, fuuzetsu, fumieval
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure: Runtime crash
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D5334
Wiki Page:

Description

The attached program consistently segfaults (within a few seconds) when compiled with ghc-8.6.1 or ghc-8.6.2. It runs forever (as expected) when compiled with ghc-8.4.

To reproduce:

ghc segfault.hs

then,

./segfault >/dev/null

Attachments (1)

segfault.hs (3.1 KB) - added by akio 2 months ago.

Change History (16)

Changed 2 months ago by akio

Attachment: segfault.hs added

comment:1 Changed 2 months ago by osa1

Confirmed on GHC HEAD.

comment:2 Changed 2 months ago by osa1

Cc: osa1 simonmar added

This seems to be related to the GC. I noticed two things:

  • If I run with debug runtime I get this error:
segfault: internal error: Evaluated a CAF (0x851768) that was GC'd!
    (GHC version 8.7.20181113 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
zsh: abort (core dumped)  ./segfault +RTS > /dev/null
  • If I play around with the GC parameters, the program can sometimes run for much longer. E.g. on my system +RTS -A1G or +RTS -G5 seems to make the problem disappear.

Could this be related to the recent SRT work? CCing simonmar.

comment:3 Changed 2 months ago by fumieval

Here's equivalent code with the bytestring operations unrolled: https://github.com/tsurucapital/segfault/blob/fumieval/app/Main.hs

comment:4 Changed 2 months ago by bgamari

Priority: normal → highest

Yikes, this looks awful.

comment:5 Changed 2 months ago by maoe

Cc: maoe added

comment:6 Changed 2 months ago by osa1

Summary: Segmentation fault with ByteString and -O → Segmentation fault with ByteString

(You don't need -O for this, updating summary)

comment:7 Changed 2 months ago by Fuuzetsu

Cc: fuuzetsu added

comment:8 Changed 2 months ago by fumieval

Cc: fumieval added

comment:9 Changed 2 months ago by simonmar

I'm looking into this.

comment:10 Changed 2 months ago by hsyl20

Could it be another manifestation of #14375? Maybe try to NOINLINE withForeignPtr.

Edit: Never mind. I've tested it and it still fails.

Last edited 2 months ago by hsyl20

comment:11 Changed 2 months ago by simonmar

Differential Rev(s): Phab:D5334
Status: new → patch

Thanks for the bug report @akio and for the very nice small test case @fumieval. I managed to further reduce the code and remove most of the Iteratee stuff, which was making the bug really hard to narrow down. I believe I've now found the problem; the fix was to add a pair of parentheses(!), see Phab:D5334.

comment:12 Changed 2 months ago by fumieval

Nice, that's amazingly fast! I confirmed that D5334 fixes the crash.

comment:13 Changed 2 months ago by Ömer Sinan Ağacan <omeragacan@…>

In eb46345d/ghc:

Fix a bug in SRT generation (#15892)

Summary:
The logic in `Note [recursive SRTs]` was correct. However, my
implementation of it wasn't: I got the associativity of
`Set.difference` wrong, which led to an extremely subtle and difficult
to find bug.
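
As a minimal illustration of why the grouping of `Set.difference` matters, here is a sketch using small hypothetical sets. The names `a`, `b`, `c`, `leftNested` and `rightNested` are made up for this example and are not taken from the GHC source or from D5334; the sketch only shows that the two ways of parenthesising a chained difference can give different results.

```haskell
import qualified Data.Set as Set

-- Hypothetical sets, chosen only for illustration.
a, b, c :: Set.Set Int
a = Set.fromList [1, 2, 3]
b = Set.fromList [2, 3]
c = Set.fromList [3]

-- (a \ b) \ c: remove everything in b, then everything in c.
leftNested :: Set.Set Int
leftNested = (a `Set.difference` b) `Set.difference` c

-- a \ (b \ c): first shrink b by c, then remove what is left from a.
rightNested :: Set.Set Int
rightNested = a `Set.difference` (b `Set.difference` c)

main :: IO ()
main = do
  print (Set.toList leftNested)   -- prints [1]
  print (Set.toList rightNested)  -- prints [1,3]
```

Here element 3 survives in the right-nested form because it is removed from `b` before `b` is subtracted from `a`. A set computed with the wrong grouping can therefore be missing entries or contain extra ones.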

Fortunately now we have a test case. I was able to cut down the code
to something manageable, and I've added it to the test suite.

Test Plan:
Before (using my stage 1 compiler without the fix):

```
====> T15892(normal) 1 of 1 [0, 0, 0]
cd "T15892.run" &&  "/home/smarlow/ghc/inplace/bin/ghc-stage1" -o T15892
T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
-fno-warn-missed-specialisations -fshow-warning-groups
-fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
-dno-debug-output  -O
cd "T15892.run" && ./T15892  +RTS -G1 -A32k -RTS
Wrong exit code for T15892(normal)(expected 0 , actual 134 )
Stderr ( T15892 ):
T15892: internal error: evacuate: strange closure type 0
    (GHC version 8.7.20181113 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)
*** unexpected failure for T15892(normal)
=====> T15892(g1) 1 of 1 [0, 1, 0]
cd "T15892.run" &&  "/home/smarlow/ghc/inplace/bin/ghc-stage1" -o T15892
T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
-fno-warn-missed-specialisations -fshow-warning-groups
-fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
-dno-debug-output  -O
cd "T15892.run" && ./T15892 +RTS -G1 -RTS +RTS -G1 -A32k -RTS
Wrong exit code for T15892(g1)(expected 0 , actual 134 )
Stderr ( T15892 ):
T15892: internal error: evacuate: strange closure type 0
    (GHC version 8.7.20181113 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)
```

After (using my stage 2 compiler with the fix):

```
=====> T15892(normal) 1 of 1 [0, 0, 0]
cd "T15892.run" &&  "/home/smarlow/ghc/inplace/test   spaces/ghc-stage2"
-o T15892 T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
-fno-warn-missed-specialisations -fshow-warning-groups
-fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
-dno-debug-output
cd "T15892.run" && ./T15892  +RTS -G1 -A32k -RTS
=====> T15892(g1) 1 of 1 [0, 0, 0]
cd "T15892.run" &&  "/home/smarlow/ghc/inplace/test   spaces/ghc-stage2"
-o T15892 T15892.hs -dcore-lint -dcmm-lint -no-user-package-db -rtsopts
-fno-warn-missed-specialisations -fshow-warning-groups
-fdiagnostics-color=never -fno-diagnostics-show-caret -Werror=compat
-dno-debug-output
cd "T15892.run" && ./T15892 +RTS -G1 -RTS +RTS -G1 -A32k -RTS
```

Reviewers: bgamari, osa1, erikd

Reviewed By: osa1

Subscribers: rwbarton, carter

GHC Trac Issues: #15892

Differential Revision: https://phabricator.haskell.org/D5334

comment:14 Changed 2 months ago by osa1

Status: patch → merge

comment:15 Changed 8 weeks ago by bgamari

Resolution: fixed
Status: merge → closed