NewsLab
Apr 29 13:12 UTC

Cactus, a work-stealing parallel recursion runtime for C (github.com)

14 points | by enduku | 2 comments

Comments (2)

  1. Neywiny
    If other threads can pull from the deque, is there any advantage over just having one common deque for all threads?
  2. enduku
    Yes: contention and locality.

    In Cactus the fast path is local. A worker pushes its own continuation onto its own deque, runs the child, and later tries to reclaim that continuation locally. Other workers only touch that deque when they become idle and steal.

    With one global deque, every fork, pop, and steal hits the same shared structure, so the cache lines holding its state bounce between cores and it becomes a cache-coherency hotspot.

    Per-worker deques make the common case mostly uncontended; stealing is only the load-balancing fallback.

    So a global deque is simpler, but it scales worse.
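
    To make the local fast path concrete, here is a minimal sketch of a per-worker deque. It is not taken from Cactus; it assumes a fixed-capacity, Chase-Lev-style layout with sequentially consistent C11 atomics, and the names wdeque, wdeque_push, wdeque_pop, wdeque_steal and DEQUE_CAP are hypothetical. The owner pushes and pops at the bottom; an idle worker steals from the top.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEQUE_CAP 256 /* hypothetical fixed capacity; must be a power of two */
static_assert((DEQUE_CAP & (DEQUE_CAP - 1)) == 0, "DEQUE_CAP must be a power of two");

typedef struct {
    atomic_long top;                  /* thieves remove from this end */
    atomic_long bottom;               /* the owner pushes and pops at this end */
    _Atomic(void *) slots[DEQUE_CAP]; /* circular buffer of task pointers */
} wdeque;

void wdeque_init(wdeque *d) {
    atomic_init(&d->top, 0);
    atomic_init(&d->bottom, 0);
    for (size_t i = 0; i < DEQUE_CAP; i++)
        atomic_init(&d->slots[i], NULL);
}

/* Owner only. Touches just this worker's deque; no lock, no CAS. */
bool wdeque_push(wdeque *d, void *task) {
    long b = atomic_load(&d->bottom);
    long t = atomic_load(&d->top);
    if (b - t >= DEQUE_CAP)
        return false; /* full (a stale top only makes this check conservative) */
    atomic_store(&d->slots[b & (DEQUE_CAP - 1)], task);
    atomic_store(&d->bottom, b + 1); /* publish the new task */
    return true;
}

/* Owner only. LIFO pop; a CAS is needed only when one task is left. */
void *wdeque_pop(wdeque *d) {
    long b = atomic_load(&d->bottom) - 1;
    atomic_store(&d->bottom, b); /* reserve the bottom slot */
    long t = atomic_load(&d->top);
    if (t > b) { /* deque was empty: undo the reservation */
        atomic_store(&d->bottom, b + 1);
        return NULL;
    }
    void *task = atomic_load(&d->slots[b & (DEQUE_CAP - 1)]);
    if (t == b) { /* last task: race against thieves for it */
        if (!atomic_compare_exchange_strong(&d->top, &t, t + 1))
            task = NULL; /* a thief took it first */
        atomic_store(&d->bottom, b + 1);
    }
    return task;
}

/* Any other worker. Takes the oldest task; NULL means empty or lost a race. */
void *wdeque_steal(wdeque *d) {
    long t = atomic_load(&d->top);
    long b = atomic_load(&d->bottom);
    if (t >= b)
        return NULL; /* nothing to steal */
    void *task = atomic_load(&d->slots[t & (DEQUE_CAP - 1)]);
    if (!atomic_compare_exchange_strong(&d->top, &t, t + 1))
        return NULL; /* another thief or the owner got there first */
    return task;
}
```

    In this sketch the owner's push never does a compare-and-swap, and its pop does one only when racing for the last task, while every steal pays for a CAS on top. A single global deque would put every worker on that contended path for every operation.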