Opened 6 years ago

Last modified 3 years ago

#3215 new feature request

Valgrind support

Reported by: cmcq Owned by:
Priority: normal Milestone:
Component: Runtime System Version: 6.10.3
Keywords: valgrind Cc: rrnewton@…
Operating System: Linux Architecture: x86
Type of failure: None/Unknown Test Case: yes
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description (last modified by igloo)

This is based on code in gtk2hs.

$ valgrind -q ./finalizer_queue

finalizer_queue: internal error: stg_ap_v_ret
    (GHC version 6.10.3 for i386_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Killed

Unfortunately this test doesn't crash without Valgrind.

My guess is that this bit is the problem:

    finalizer <- fixIO $ \dPtr -> mkThunk $ do
        freeHaskellFunPtr callback
        freeHaskellFunPtr dPtr

Perhaps the documentation should say not to do this?

Attachments (1)

FinalizerQueue.tar (10.0 KB) - added by cmcq 6 years ago.
test case

Download all attachments as: .zip

Change History (10)

Changed 6 years ago by cmcq

test case

comment:1 Changed 6 years ago by cmcq

  • Resolution set to invalid
  • Status changed from new to closed

This was just a Valgrind problem with self-modifying code. "--smc-check=all" fixes it. I think that due to the LIFO free list used by rts/Stable.c, the functions are reversed the second time round. Because of Valgrind's caching, the finalizer was being called first.

I noticed this paragraph in rts/Adjustor.c, so I assume that calling freeHaskellFunPtr on the current function IS allowed.

When generating an adjustor thunk that uses the C calling
convention, we have to make sure that the thunk kicks off
the process of jumping into Haskell with a tail jump. Why?
Because as a result of jumping in into Haskell we may end
up freeing the very adjustor thunk we came from using
freeHaskellFunctionPtr(). Hence, we better not return to
the adjustor code on our way out, since it could by then
point to junk.

comment:2 Changed 6 years ago by cmcq

  • Keywords valgrind added
  • Resolution invalid deleted
  • severity changed from normal to trivial
  • Status changed from closed to reopened
  • Summary changed from Calling freeHaskellFunPtr on the current function to Valgrind support
  • Test Case set to yes
  • Type changed from bug to feature request

This is now a feature request.

Valgrind is a useful tool when working with the FFI. However, it needs
a bit of help to work efficiently. Trac 3215 has a test case where
Valgrind's instruction cache causes a RTS crash. One solution is for
users to pass --smc-check=all but this is slow.

The RTS could use the "Client Request mechanism" to tell Valgrind to
update its instruction cache. These normally have no effect.
http://valgrind.org/docs/manual/manual-core-adv.html

Note "You are encouraged to copy the valgrind/*.h headers into your
project's include directory, so your program doesn't have a
compile-time dependency on Valgrind being installed. The Valgrind
headers, unlike most of the rest of the code, are under a BSD-style
license so you may include them without worrying about license
incompatibility."

Should these headers just go in the "includes" directory?

diff -rN -u old-ghc/rts/Adjustor.c new-ghc/rts/Adjustor.c
--- old-ghc/rts/Adjustor.c	2009-05-11 11:30:09.000000000 +0100
+++ new-ghc/rts/Adjustor.c	2009-05-11 11:30:09.000000000 +0100
@@ -143,6 +143,8 @@
 #define UNDERSCORE ""
 #endif
 #if defined(i386_HOST_ARCH) && !defined(darwin_HOST_OS)
+#include <valgrind/valgrind.h>
+
 /*
   Now here's something obscure for you:

@@ -411,6 +413,8 @@

 	adj_code[0x0f] = (unsigned char)0xff; /* jmp *%eax */
 	adj_code[0x10] = (unsigned char)0xe0;
+
+	VALGRIND_DISCARD_TRANSLATIONS(code,0x11);
     }
 #elif defined(i386_HOST_ARCH) && defined(darwin_HOST_OS)
     {

comment:3 Changed 6 years ago by duncan

Is rts/Adjustor.c still used? I thought it was all libffi now. Does libffi have valgrind support?

comment:4 Changed 6 years ago by cmcq

On x86 Linux, libffi is only used for allocating code. From "darcs changes rts/Adjustor.c":

Tue Apr  8 19:34:34 BST 2008  Simon Marlow <[email protected]>
  * Import libffi-3.0.4, and use it to provide FFI support in GHCi
  ...
  There is also code in the RTS to use libffi in place of
  rts/Adjustor.c, but it is currently not enabled if we already have
  support in Adjustor.c for the current platform.  We need to assess the
  performance impact before using libffi here too (in GHCi we don't care
  too much about performance).

The libffi source doesn't mention Valgrind so there's no explicit support. I couldn't get a test that fails. From a quick look at its source, a libffi trampoline's code is based on its address. Hence the only way Valgrind's code cache could become incoherent is if ffi_closure_alloc returns addresses such that a trampoline overlaps previous trampolines.

comment:5 Changed 6 years ago by cmcq

Also see #793 regarding use of libffi.

I get this unrelated problem using UseLibFFIForAdjustors=YES on GHC from darcs:

$ gdb -q ./finalizer_queue 
(gdb) break MallocFailHook 
Breakpoint 1 at 0x8194826: file rts/hooks/MallocFail.c, line 14.
(gdb) r
Starting program: /home/colin/FinalizerQueue/finalizer_queue 
[Thread debugging using libthread_db enabled]
[New Thread 0xb7ce86d0 (LWP 13368)]
[New Thread 0xb7affb90 (LWP 13371)]
[New Thread 0xb72feb90 (LWP 13372)]
[Switching to Thread 0xb72feb90 (LWP 13372)]

Breakpoint 1, MallocFailHook (request_size=4294967292, msg=0x81d3870 "createAdjustor")
    at rts/hooks/MallocFail.c:14
14	    fprintf(stderr, "malloc: failed on request for %lu bytes; message: %s\n", request_size, msg);
(gdb) where
#0  MallocFailHook (request_size=4294967292, msg=0x81d3870 "createAdjustor")
    at rts/hooks/MallocFail.c:14
#1  0x08187b87 in stgMallocBytes (n=-4, msg=0x81d3870 "createAdjustor")
    at rts/RtsUtils.c:201
#2  0x08182800 in createAdjustor (cconv=1, hptr=0xc, wptr=0x804b9c0 <Main_dDP>, 
    typeString=0x81c0008 "") at rts/Adjustor.c:97
#3  0x0804ac7f in sG9_info ()
#4  0x00000001 in ?? ()
#5  0x080750bc in base_GHCziIOBase_a23_info ()
#6  0xb7b8004a in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

comment:6 Changed 6 years ago by igloo

  • Description modified (diff)
  • difficulty set to Unknown

comment:7 Changed 6 years ago by igloo

  • Milestone set to 6.12.1

comment:8 Changed 5 years ago by igloo

  • Milestone changed from 6.12.1 to _|_
  • Type of failure set to None/Unknown

comment:9 Changed 3 years ago by rrnewton

  • Cc rrnewton@… added
Note: See TracTickets for help on using tickets.