Opened 3 years ago

Last modified 5 months ago

#1608 new proposed-project

Implement 2-3 concurrent data structures from the literature

Reported by: rrnewton Owned by: rrnewton
Priority: good Keywords: concurrency, data structures
Cc: acfoltzer Difficulty: 1 person Summer
Mentor: not-accepted Topic: misc

Description (last modified by rrnewton)

The GHC Haskell compiler recently gained the capability to generate atomic compare-and-swap (CAS) assembly instructions. This opens up a new world of lock-free data-structure implementation possibilities.

Furthermore, it's an important time for concurrent data structures. Not only is the need great, but the design of concurrent data structures has been a very active area in recent years, as summarized well by Nir Shavit in this article:

http://cacm.acm.org/magazines/2011/3/105308-data-structures-in-the-multicore-age/

Because Haskell objects containing pointers can't efficiently be stored outside the Haskell heap, it is necessary to reimplement these data structures for Haskell, rather than use the FFI to access external implementations. There are already a couple of data structures implemented in the following library (queues and deques) :

https://github.com/rrnewton/haskell-lockfree-queue

But, this leaves many others, such as:

  • Concurrent Bags
  • Concurrent Priority Queues

A good point of reference would be the libcds collection of concurrent data structures for C++ (or those that come with Java or .NET):

http://libcds.sourceforge.net/

One of the things that makes implementing these data structures fun is that they have very short algorithmic descriptions but a high density of thought-provoking complexity.

A good GSoC project would be to implement 2-3 data structures from the literature, and benchmark them against libcds implementations.

Interested Mentors

Ryan Newton

Interested Students (Include enough identifying info to find/reach you!)

Sergiu Ivanov

kodoque

Mathias Bartl

Aleksandar Kodzhabashev

UPDATE

This ticket has been REFACTORED. It is now specifically focused on deques, bags, or priority queues. For anyone interested in concurrent hash-tables / hashmaps take a look at ticket #1617.

Recommended Papers with State-of-the-art Algorithms

Change History (26)

comment:1 Changed 3 years ago by rrnewton

  • Description modified (diff)
  • Owner set to rrnewton

comment:2 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:3 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:4 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:5 Changed 3 years ago by acfoltzer

  • Cc acfoltzer added

comment:6 Changed 3 years ago by acfoltzer

A good candidate for implementation is this concurrent bag (Sundell, Gidenstam, Papatriantafilou, Philippas 2011): http://www.cse.chalmers.se/~tsigas/papers/Lock%20Free%20Bag%20SPAA11.pdf

In order to implement this in the manner described in the paper, we will need to extend GHC's support for compare-and-swap to operate on pointer arrays (i.e., MutableArray#), not just IO/STRefs. I've looked into what this would require, and would be happy to assist.

comment:7 follow-up: Changed 3 years ago by kodoque

I would really like to be mentored on this project. I have some experience with FP language, mainly Ocaml and a bit of Haskell.

comment:8 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:9 Changed 3 years ago by MathiasBartl

  • Description modified (diff)

comment:10 Changed 3 years ago by MathiasBartl

Hi, there are currently 3 interested students, from what I assume there is enough work for everyone to go around. Would it be possible to chat on IRC, possibly on Mondays (I am on Middle European Time )

comment:11 follow-up: Changed 3 years ago by MathiasBartl

Ok, I can start with the concurrent bag. How about this paper: http://tx.technion.ac.il/~dima39/publications/SALSA.pdf it also uses CAS and apparently is better.

comment:12 Changed 3 years ago by rrnewton

  • Description modified (diff)
  • Difficulty changed from unknown to 1 person Summer

comment:13 in reply to: ↑ 7 Changed 3 years ago by rrnewton

Replying to kodoque:

I would really like to be mentored on this project. I have some experience with FP language, mainly Ocaml and a bit of Haskell.

Hi Kodoque -- I'm mailing everyone interested in this ticket, but I can't find your name/email listed elsewhere. Could you contact me?

comment:15 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:16 in reply to: ↑ 11 Changed 3 years ago by rrnewton

Replying to MathiasBartl:

Ok, I can start with the concurrent bag. How about this paper: http://tx.technion.ac.il/~dima39/publications/SALSA.pdf it also uses CAS and apparently is better.

Very interesting! This is definitely worth looking into, but I would mention a couple caveats.

  • For a general purpose bag implementation we would care about situations where one node is both a producer and consumer.
  • I'm slightly concerned about this paper. I can't find a date/publication venue. The Sundell paper was published in a top conference (SPAA 2011).

comment:17 Changed 3 years ago by kodz

  • Description modified (diff)

comment:18 Changed 3 years ago by rrnewton

  • Description modified (diff)

comment:19 Changed 3 years ago by MathiasBartl

I would like to see wheter it would be possible to use Formal Verification for this Implementations, in my opinion concurrent data structures would be well suited for Formal Verification, I will talk to someone at my University about this.

comment:20 Changed 3 years ago by MathiasBartl

According to: http://www.zurich.ibm.com/pdf/sys/adv_messaging/shortConcurrentPrioQueue.pdf witch seems to be the most recent research paper. (I only tests fro 8 Cores thought) candidates for a concurrent priority queu would be : Quantizing Queue or Lock Free Skip List I would implement the Quantizing Queue,because it is given as having the best Performance in the Tests,and it uses MichaelScott? Queues, witch I believe Ryan has already implemented. The paper that is given as Source: http://www.zurich.ibm.com/pdf/sys/adv_messaging/Tempo_ACNM07.pdf does not seem so good a place to start with for starting to implement, it does not have f.E. include any pseudocode, IIt would be helpfull to have a diffenret source. The Lock Free List as described does not support multiple entries with the same priority, unless I find a paper with a version that does remiefy this, I would exclude it.

comment:21 Changed 3 years ago by MathiasBartl

I am not 100% sure about the last 2 papers.

comment:22 Changed 17 months ago by rrnewton

Small update: I am available again to mentor in 2014.

In the intervening time a lot has been happening on this front:

  • GHC 7.8 internalizes the primops from atomic-primops.

Moving forward, Carter Schonwald is going to try to provide better support for those primops from the LLVM backend (e.g. make them inline primops that don't need to perform C calls and are thus cheaper).

We've been implementing some lockfree data-structures in the context of a project called "LVish".

https://github.com/iu-parfunc/lvars/tree/da38a4383e0ca8ad8628b9a9f412249cbcf86849/haskell/lvish/Data/Concurrent

This includes a "Scalable non-zero indicator" (SNZI) -- a counter that can't tell you exact counts, only whether the count is zero. It also includes a concurrent skip-list based implementation of finite maps, but they are write-only with no removal supported currently.

There's plenty left to do. We really need a good concurrent bag. And it would be good to know if our current finite map is actually any more efficient for the omission of the element-removal logic. (The LVish project provides only monotonically-expanding data structures (for deterministic parallel programming), but in the general case it would be good to have full concurrent data structures supporting all operations.)

comment:23 Changed 8 months ago by vbsdeg

Is this ticket still relevant? As I understand, a concurrent bag still needs to be implemented, doesn't it? Would there be anyone willing to mentor for something like this in the summer of 2015?

comment:24 Changed 7 months ago by rrnewton

Yes, it is certainly relevant! There are plenty of data structures left out there, including, as you say, Sundell et al's concurrent bag.

comment:25 Changed 7 months ago by rrnewton

Update: myself and Peter Fogg (the current Ph.D. student in charge of LVish development) are jointly willing to mentor this project in Summer 2015 if there is a well-qualified applicant. A record of substantial Haskell hacking is the primary criteria (e.g. on Github and Hackage). Feel free to contact us with inquiries.

comment:26 Changed 5 months ago by ekmett

  • Priority changed from not yet rated to good

comment:27 Changed 5 months ago by alllex

Hello,

I would like to choose this project for GSOC 2015. Would you please give me contact details so we can discuss specifics of the ticket. Email me if you'd like: alllex (dot) semin (at) gmail (dot) com

Regards, Alex Semin

Note: See TracTickets for help on using tickets.