Opened 5 years ago

Last modified 6 months ago

#4505 new bug

Segmentation fault on long input (list of pairs)

Reported by: cathper Owned by: simonmar
Priority: high Milestone: 7.12.1
Component: Compiler Version: 7.0.1
Keywords: Segmentation fault, segfault, long input Cc: cathper@…, snoyberg, merijn
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case:
Blocked By: #4258 Blocking:
Related Tickets: Differential Revisions:

Description

When compiling LongList.hs with

ghc --make LongList.hs

then running the output gives a segfault:

$ ./LongList
8595
Segmentation fault

It also segfaults with ghc 6.12.1 (without printing 8595) and with ghc 6.12.3.

On ghc 6.12.3 running on Mac OS 10.5, no segfault is seen. (I haven't tried much longer input, since it already eats up approx. 1.2 GB of RAM.)

Attachments (1)

LongList.hs (100.7 KB) - added by cathper 5 years ago.
A file with a very long line.

Download all attachments as: .zip

Change History (34)

Changed 5 years ago by cathper

A file with a very long line.

comment:1 Changed 5 years ago by cathper

  • Cc cathper@… added

comment:2 Changed 5 years ago by simonmar

  • Owner set to simonmar

comment:3 Changed 5 years ago by simonmar

  • Milestone set to 7.0.2
  • Priority changed from normal to highest

comment:4 Changed 5 years ago by simonmar

Did you manage to compile this file? I left it compiling overnight last night, and it still hasn't finished.

comment:5 Changed 5 years ago by cathper

Yes. The compilation time varies quite much, though.

> ghc --version; wget -q http://hackage.haskell.org/trac/ghc/raw-attachment/ticket/4505/LongList.hs && time nice ghc --make LongList.hs
The Glorious Glasgow Haskell Compilation System, version 6.12.1
[1 of 1] Compiling Main             ( LongList.hs, LongList.o )
Linking LongList ...
nice ghc --make LongList.hs  63,91s user 1,11s system 99% cpu 1:05,28 total
> rm LongList*
> /tmp/ghc7/bin/ghc --version; wget -q http://hackage.haskell.org/trac/ghc/raw-attachment/ticket/4505/LongList.hs && time nice /tmp/ghc7/bin/ghc --make LongList.hs
The Glorious Glasgow Haskell Compilation System, version 7.0.1
[1 of 1] Compiling Main             ( LongList.hs, LongList.o )
nice /tmp/ghc7/bin/ghc --make LongList.hs  515,28s user 2,06s system 99% cpu 8:40,01 total
> uname -a
Linux freja 2.6.32-25-server #44-Ubuntu SMP Fri Sep 17 21:13:39 UTC 2010 x86_64 GNU/Linux
> cat /proc/cpuinfo|grep "model name"|head -n 1
model name      : Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz

It uses > 1 GB of RAM.

comment:6 Changed 5 years ago by simonmar

I understand what the problem is here. The program compiles to code that allocates >1MB in one go, and the runtime doesn't currently support that (we support allocating >1MB only for a single object, such as an array, not for multiple objects). This is a zero-day bug: it's been around for ever, but nobody has run into it until now.

Worst case we can make the RTS emit an error message rather than just segfaulting, but I hope we'll be able to find a way to fix it properly. I'll look into it.

Meanwhile, it seems the HEAD can't compile the test program at all, unless a type signature is added, and 7.0.1 is significantly slower than 6.12.3 to compile it. I've created #4801 for the compiler performance regression.

comment:7 Changed 5 years ago by simonmar

  • Priority changed from highest to high

Not a blocker, because it's not a regression.

comment:8 follow-up: Changed 5 years ago by simonmar

  • Priority changed from high to highest
  • Status changed from new to merge

I improved the error message. Please merge these for 7.0.2, and then re-milestone to 7.2.1 for a proper fix.

Wed Dec  8 08:32:12 PST 2010  Simon Marlow <[email protected]>
  * Tweak the "sorry" message a bit
  
  -		"sorry! (this is work in progress)\n"
  +		"sorry! (unimplemented feature or known bug)\n"

Thu Dec  9 03:40:05 PST 2010  Simon Marlow <[email protected]>
  * Catch too-large allocations and emit an error message (#4505)
  
  This is a temporary measure until we fix the bug properly (which is
  somewhat tricky, and we think might be easier in the new code
  generator).
  
  For now we get:
  
  ghc-stage2: sorry! (unimplemented feature or known bug)
    (GHC version 7.1 for i386-unknown-linux):
          Trying to allocate more than 1040384 bytes.
  
  See: http://hackage.haskell.org/trac/ghc/ticket/4550
  Suggestion: read data from a file instead of having large static data
  structures in the code.

Thu Dec  9 12:04:04 GMT 2010  Simon Marlow <[email protected]>
  * fix ticket number (#4505)

comment:9 Changed 5 years ago by simonmar

(the priority bump was an accident, in case you were wondering)

comment:10 in reply to: ↑ 8 ; follow-up: Changed 5 years ago by cathper

Replying to simonmar:

I improved the error message. Please merge these for 7.0.2, and then re-milestone to 7.2.1 for a proper fix.

I'm a Trac novice. You mean, merge it as fixed (so the ticket is closed), and open a new ticket that has 7.2.1 as milestone?

See: http://hackage.haskell.org/trac/ghc/ticket/4550

s/4550/4505/ I guess.

comment:11 in reply to: ↑ 10 Changed 5 years ago by simonmar

Replying to cathper:

I'm a Trac novice. You mean, merge it as fixed (so the ticket is closed), and open a new ticket that has 7.2.1 as milestone?

Sorry for the confusion, that message was intended for Ian, who does all the merging.

See: http://hackage.haskell.org/trac/ghc/ticket/4550

s/4550/4505/ I guess.

Yes, I fixed that in a subsequent patch.

comment:12 Changed 5 years ago by cathper

Just perfect :-)

comment:13 Changed 5 years ago by simonmar

One further patch to merge:

Fri Dec 10 09:40:02 GMT 2010  Simon Marlow <[email protected]>
  * warning fix: don't redefine BLOCKS_PER_MBLOCK

comment:14 Changed 5 years ago by igloo

First 3 merged.

comment:15 Changed 5 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1
  • Owner simonmar deleted
  • Status changed from merge to new

Last one also merged.

comment:16 Changed 5 years ago by igloo

  • Owner set to simonmar

comment:17 Changed 5 years ago by simonmar

  • Priority changed from highest to normal

comment:18 Changed 4 years ago by simonmar

  • Owner simonmar deleted

no proper fix in sight until we have the new code generator.

comment:19 Changed 4 years ago by igloo

  • Blocked By 4258 added

comment:20 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from normal to low

comment:21 Changed 3 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:22 Changed 13 months ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:23 Changed 9 months ago by Yuras

  • difficulty set to Unknown
  • Priority changed from low to normal

The compile time check (introduced in a278f3f02d09bc32b0a75d4a04d710090cde250f) was removed when replacing the old code generator. So now it crashes at runtime again. It was report a number of times, see #9647 and #8568.

comment:24 Changed 9 months ago by thomie

Nice find Yuras. Should be easy to put that check back in then.

comment:25 Changed 8 months ago by simonpj

  • Owner set to simonmar
  • Priority changed from normal to high

Simon M: might you re-instate the previous fix, helpfully identified by Yuras, pending a more glorious resolution. (With a pointer from the fix back to this ticket.)

I'll make the priority high, because it seems easy and much better than a runtime crash. Simon

comment:26 Changed 8 months ago by snoyberg

  • Cc snoyberg added

I just ran into this on our code base. Is there any chance of getting a fix for this into the 7.8 series?

comment:27 Changed 8 months ago by simonpj

  • Priority changed from high to highest

I wish this had come up 3 weeks ago when we were triaging tickets for 7.8. The release candidate for 7.8.4 is out, so it's tough to get anything else in. Unless you say "this makes 7.8 unusable for my purposes". (This is the case for 7.8.3, which is why we went ahead with 7.8.4.)

Putting it in would mean a delay of a couple of weeks while (a) Simon re-instates the previous fix, (b) Austin makes a new release candidate (c) everyone tests it.

Of course, it's better to delay 7.8.4 than to have to put out 7.8.5. But if we keep doing that, we never make a release.

My instinct is "too late". But is there strong user sentiment about this one?

Simon

comment:28 Changed 8 months ago by simonmar

The code that implements the runtime check is still in place, it's only the compile-time check that we lost. So you don't get a segfault, but you do get a runtime failure:

            barf("allocation of %ld bytes too large (GHC should have complained at compile-time)", (long)cap->r.rHpAlloc);

@snoyberg: do you see this?

Given this, I don't think it's worth holding up 7.8.4. We wouldn't be fixing the bug, only moving the failure to compile-time.

comment:29 Changed 8 months ago by snoyberg

@simonmar: Yes, that's the error message we're seeing at runtime.

Moving this bug from runtime to compile time would be a *huge* advantage to me. But I agree that it's not worth holding up the release. To clarify: I would still be very happy to see a fix for this on the 7.8 branch, as we could build our own GHC binary with the patch.

comment:30 Changed 8 months ago by rafael-felix

Would it help to create a "should fail" test case inside the array directory?

comment:31 Changed 7 months ago by simonmar

  • Priority changed from highest to high

Not release-blocking

comment:32 Changed 7 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1

comment:33 Changed 6 months ago by merijn

  • Cc merijn added
Note: See TracTickets for help on using tickets.