Opened 2 years ago

Closed 18 months ago

#5991 closed bug (fixed)

regression: huge number of wakeups in xmonad

Reported by: dmwit Owned by:
Priority: high Milestone: 7.6.2
Component: Runtime System Version: 7.6.1
Keywords: Cc: gammelholte, dennis@…
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

People using GHC 7.4 have been complaining that xmonad causes a huge number of wakeups (about 100 wakeups/s, according to one complaint, and 30% of all wakeups on the entire machine according to another). A workaround of making xmonad compile with -rtsopts and adding +RTS -V0 seems to hide this problem, but this is very inconvenient because it means the user can't use xmonad's own recompilation and restart mechanisms.

Change History (19)

comment:1 Changed 2 years ago by simonmar

  • Difficulty set to Unknown
  • Milestone set to 7.6.1
  • Status changed from new to infoneeded

Is xmonad normally compiled with -threaded? Does it make a difference?

comment:2 Changed 2 years ago by dmwit

xmonad is usually not compiled with -threaded; compiling with -threaded does seem to be another valid workaround.

16:42 < dylan> dmwit: -threaded also fixes the wakeups problem
16:42 < dmwit> dylan: Awesome, I'll say so on the bug.
16:42 < dmwit> Perhaps we should just be passing -threaded when we build in xmonad.
16:43 < dylan> ultimately the RTS should not be doing what it is doing though
16:43 < dmwit> Yes, and also -threaded is a bit scary with X code.
16:43 < dylan> I'm going back to -V0 for the time being
16:44 < dylan> the amusing bit
16:44 < dylan> either either solution
16:45 < dylan> there are still several hundred SIGVTALRM/rt_sigreturn pairs before it settles down into a poll()

comment:3 Changed 2 years ago by simonmar

  • Component changed from Compiler to Runtime System
  • Milestone changed from 7.6.1 to 7.4.2
  • Priority changed from normal to high
  • Status changed from infoneeded to new

fix coming...

comment:4 Changed 2 years ago by simonmar

  • Owner set to simonmar

comment:5 Changed 2 years ago by marlowsd@…

commit dd24d6bc37879c6b32a3d5ac4ee765e59e13501c

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Wed Apr 11 10:12:14 2012 +0100

    Disable the timer signal while blocked in select() (#5991)
    
    The threaded RTS had a fix for this a long time ago (#1623) but this
    patch applies a similar fix to the non-threaded RTS.

 rts/posix/Select.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

comment:6 Changed 2 years ago by simonmar

  • Status changed from new to merge

Please also merge 9d26519c20194abc3832578a55fbe31327519eeb, as it fixes a regression in threadDelay in the non-threaded RTS that I noticed while investigating this bug.

comment:7 Changed 2 years ago by pcapriotti

  • Resolution set to fixed
  • Status changed from merge to closed

Merged as 27c2f292d23983fac0e1f333a43424264c2e290b and 6aee474574e0aff10602a6fd17d37826b04a996d.

The first commit (threadDelay resolution change) broke threadDelay on 7.4 because of a broken unit conversion macro, which was fixed in the HEAD, so I merged that as well (a73d0a21c875ff539c2240e431eaf87173a92bd3).

comment:8 Changed 23 months ago by nomeata

I have applied all three patches to the GHC-7.4.1 package in Debian, but from my testing it seems that the bug is really fixed, I still see much of this when stracing xmonad:

--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
poll([{fd=3, events=POLLIN}], 1, -1)    = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
poll([{fd=3, events=POLLIN}], 1, -1)    = ? ERESTART_RESTARTBLOCK (To be restarted)

Did I miss something? Did someone confirm with GHC-7.4.2 that this bug is really fixed?

comment:9 Changed 22 months ago by simonmar

I just tried with 7.4.2 and a simple threadDelay example and the bug looks fixed to me.

Is your xmonad compiled with -threaded or not? If it is, you can expect to see the SIGVTALRM signals for 0.3 seconds until the idle GC occurs, then they should stop.

comment:10 Changed 20 months ago by gammelholte

  • Architecture changed from Unknown/Multiple to x86_64 (amd64)
  • Cc gammelholte added
  • Owner simonmar deleted
  • Resolution fixed deleted
  • Status changed from closed to new
  • Version changed from 7.4.1 to 7.4.2

Hello,

the bug is still there in 7.4.2---as noted by several users at http://www.reddit.com/r/xmonad/comments/yp86g/help_with_xmonad_wakeups_bug/

That is, running vanilla xmonad on 7.4.2 leads to ~100 wakeups/s. Recompiling manually with -threaded fixes the problem, though.

comment:11 Changed 20 months ago by simonmar

Does xmonad have a background thread doing a threadDelay? I think that might prevent the RTS from turning off its timer signal. If that's the case I know how to fix it.

comment:12 Changed 20 months ago by gammelholte

How do I check if there's a background thread doing a threadDelay?

Thanks

comment:13 Changed 20 months ago by dmwit

The xmonad code base itself does not call threadDelay. However, the xmonad-contrib packages do include three calls in various places:

  • The XMonad.Prompt hierarchy use threadDelay after a prompt autocompletion to wait for windows to finish drawing themselves. It is not called in a background loop as far as I can tell; I expect it to be called about once per keypress.
  • XMonad.Util.Timer provides a timer. Possibly some configs use this module and create a timer which renews itself every time it fires; however, I think this is pretty uncommon. Many other modules import this one, and may have loops, I didn't check.
  • XMonad.Util.Run provides a function to spawn a process, wait a bit, and then terminate it. I very much don't expect this to be called in a background loop.

So, perhaps the next question is: does this problem occur with a stock xmonad config (i.e. nothing more than "import XMonad; main = defaultConfig")? If so, then we can confidently say there are no background loops calling threadDelay.

comment:14 Changed 20 months ago by def

  • Cc dennis@… added

This problem also occurs with a stock xmonad config. I tracked it down even further and have a minimal example now:

import Graphics.X11.Xlib

main = do
  dpy <- openDisplay ""
  allocaXEvent $ \xevent ->
    nextEvent dpy xevent

100 wakeups per second using GHC 7.4.2. The same program in C does not have this problem (compile with gcc -lX11 -o min-x11 min-x11.c):

#include <X11/Xlib.h>

int main(int argc, char **argv) {
  Display* dpy = XOpenDisplay(NULL);
  XEvent xevent;
  XNextEvent(dpy, &xevent);
}

The problem is related to blocking in a foreign call (in this case waiting on the socket /tmp/.X11-unix/X0). When the nextEvent' from http://www.haskell.org/haskellwiki/X_window_programming_in_Haskell is used instead of nextEvent the problem disappears. It waits using Haskell's threadWaitRead instead of doing it inside the foreign call.

I have also found another instance of this problem when using GTK (it's also /tmp/.X11-unix/X0):

import Graphics.UI.Gtk

main = do
  initGUI
  mainGUI

Again no problems with the C version (compile with gcc $(pkg-config --cflags --libs gtk+-2.0) -o min-gtk min-gtk.c):

#include <gtk/gtk.h>

int main(int argc, char **argv) {
  gtk_init(&argc, &argv);
  gtk_main();
}

I could upload the strace outputs of the C and Haskell versions which show what they do differently. Tell me if there's anything I can do.

comment:15 Changed 20 months ago by def

  • Version changed from 7.4.2 to 7.6.1

I'm still seeing the same behaviour in GHC 7.6.1 as tested with the min-x11 examples from above.

comment:16 Changed 19 months ago by marlowsd@…

commit 0b79d5cd4687dacf7efd430df7fba9d9a5a5ce32

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Fri Sep 21 15:49:22 2012 +0100

    Another overhaul of the recent_activity / idle GC handling (#5991)
    
    Improvements:
    
     - we now turn off the timer signal in the non-threaded RTS after
       idleGCDelay.  This should make the xmonad users on #5991 happy.
    
     - we now turn off the timer signal after idleGCDelay even if the
       idle GC is disabled with +RTS -I0.
    
     - we now do *not* turn off the timer when profiling.
    
     - more comments to explain the meaning of the various ACTIVITY_*
       values

 includes/rts/Flags.h |    1 +
 rts/RtsFlags.c       |   14 ++++++++++++--
 rts/Schedule.c       |   16 ++++++++++++----
 rts/Schedule.h       |   28 +++++++++++++++++++++++-----
 rts/Timer.c          |   29 +++++++++++++++--------------
 5 files changed, 63 insertions(+), 25 deletions(-)

comment:17 Changed 19 months ago by marlowsd@…

commit 673b6f50eca6f53cfb13b00e587c403c716baba1

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Fri Sep 21 15:26:46 2012 +0100

    Revert "Disable the timer signal while blocked in select() (#5991)"
    
    This reverts commit dd24d6bc37879c6b32a3d5ac4ee765e59e13501c.
    
    This attempt to fix the problem was misguided: the program might be
    stuck in a foreign call rather than awaitEvent(), and then the timer
    signal will never get disabled.  The only way to turn off the timer
    signal in this case is in the timer interrupt handler itself.

 rts/posix/Select.c |   17 ++++-------------
 1 files changed, 4 insertions(+), 13 deletions(-)

comment:18 Changed 19 months ago by simonmar

  • Milestone changed from 7.4.2 to 7.6.2
  • Status changed from new to merge

comment:19 Changed 18 months ago by pcapriotti

  • Resolution set to fixed
  • Status changed from merge to closed
Note: See TracTickets for help on using tickets.