Opened 14 months ago

Closed 14 months ago

Last modified 14 months ago

#7674 closed task (fixed)

Separate StablePtr table from StableName table.

Reported by: EyalLotem Owned by:
Priority: normal Milestone: 7.8.1
Component: Runtime System Version: 7.6.2
Keywords: rts stable_ptr stable_name performance Cc: eyal.lotem@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Currently, there is one table for these two difference concepts which makes each one of them less efficient.

The patches that address this improve some benchmarks' performance by a factor of ~3.

Change History (20)

comment:1 Changed 14 months ago by EyalLotem

  • Status changed from new to patch

The benchmark is the small_hash_hs FFI binding to the small_hash library. They're at:

https://github.com/Peaker/small_hash
https://github.com/Peaker/small_hash_hs

(To build the latter, you need to clone the former directly inside it).

If you run small_hash_hs, it's runtime will be dominated by GC's, because of #7670. (~8 sec on my machine).

However, if you set the min heap size to 200M (via "+RTS -H200M"), the run time becomes dominated by the stable ptr hash table/etc. To run the benchmark on ordinary GHC, you need to use small_hash_hs's "unpatched_ghc" branch.

Patches at: https://github.com/Peaker/ghc/commits/separate_stable_tables

On my hardware:

Before patch After patch

Hash insertion time 2.4 0.8

Total time 3.8 1.2

Hash insertion time 2.9 1.2
(-threaded)

Total time 4.8 1.6
(-threaded)

comment:2 Changed 14 months ago by EyalLotem

Before patch After patch
Hash insertion time 2.4 0.8
Total time 3.8 1.2
Hash insertion time (-threaded) 2.9 1.2
Total time (-threaded) 4.8 1.6

comment:3 Changed 14 months ago by EyalLotem

Also ran the benchmark without the specified heap size:

unpatched patched
-threaded, no heap size 24.5s 10.1
no heap size 23.1s 8.3s

comment:4 Changed 14 months ago by EyalLotem

As an interesting reference point, this benchmark is doing 10 million insertions of unboxed ints to boxed ints and then destruction of the hash table.

The small_hash benchmark, which inserts 10 million unboxed ints to unboxed ints takes 0.39sec.

It does bulk allocations of the hash nodes, which is hard to do in the FFI binding because the allocation happens in general-purpose code, so it should support releasing back memory.

The 1.2 sec is about a factor of 3 of overhead which may be reasonable given the extra indirection due to stable ptrs, the int boxing and the more fine-grained allocation.

Changed 14 months ago by EyalLotem

Changed 14 months ago by EyalLotem

comment:5 Changed 14 months ago by EyalLotem

  • Owner EyalLotem deleted

comment:6 Changed 14 months ago by simonmar

  • Difficulty set to Unknown

I'm afraid something is broken with StableNames in your patch: test memo001 in libraries/base fails for me (this is a stress test for StableNames, it isn't run by default in validate, but it should be).

comment:7 Changed 14 months ago by simonmar

  • Status changed from patch to new

comment:8 Changed 14 months ago by EyalLotem

  • Cc eyal.lotem@… added

Changed 14 months ago by EyalLotem

The memo001 test seems to work now

comment:9 Changed 14 months ago by EyalLotem

  • Status changed from new to patch

I replaced all 7 patches with "StableTables?.patch", it is more manageable.

comment:10 Changed 14 months ago by marlowsd@…

commit 7e7a4e4d7e9e84b2c57d3d55e372e738b5f8dbf5

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Thu Feb 14 08:46:55 2013 +0000

    Separate StablePtr and StableName tables (#7674)
    
    To improve performance of StablePtr.

 includes/HsFFI.h                         |    4 +
 includes/rts/Stable.h                    |   17 +-
 includes/stg/MiscClosures.h              |    2 +-
 rts/Hash.c                               |    5 +
 rts/Hash.h                               |    5 +-
 rts/HsFFI.c                              |   18 ++
 rts/Linker.c                             |    8 +-
 rts/PrimOps.cmm                          |   22 +-
 rts/RetainerProfile.c                    |    2 +-
 rts/RtsStartup.c                         |    4 +-
 rts/Stable.c                             |  460 +++++++++++++++++-------------
 rts/Stable.h                             |   29 ++-
 rts/sm/Compact.c                         |    2 +-
 rts/sm/GC.c                              |   12 +-
 utils/deriveConstants/DeriveConstants.hs |    3 +
 15 files changed, 355 insertions(+), 238 deletions(-)

comment:11 Changed 14 months ago by simonmar

  • Resolution set to fixed
  • Status changed from patch to closed

comment:12 Changed 14 months ago by simonmar

  • Milestone set to 7.8.1
Note: See TracTickets for help on using tickets.