Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks

Posted on February 5, 2016

For my CS854 class I have to read a trio of research papers each week, and post summaries, which are then updated after the class that a student gives a presentation on it. Don't rely on my summaries for anything, but it might interest some of you, so I'm posting it here.

Just read the paper.

Scalable Read-mostly Synchronization Using Passive Reader-Writer Locks


This is a new kind of Reader-Writer locks.

Some background. rwlocks are designed to work when you have many readers, who are allowed to read at the same time, or one writer.

These show up everywhere. Even in Rust's type system, where you are allowed one &mut borrow, or many immutable borrows. However, in this case we want a runtime lock, not compile time.

Also, the traditional implementation of rwlocks used in the Linux kernel have some problems, like as increasing latency for writers, or cause readers to contend, or not cope with a thread sleeping or being pre-empted in a critical section.

This paper introduces prwlock, passive reader-writer locks. This relies on the machine architecture having Total Store Ordering.


Compared to competing locks, prwlock has fast and low latency on the reader path, and bounded latency for the write path. It also attempts to be easier to use that RCU.

Design of prwlock.

We have a 64-bit version variable ver. Each writer increases the version and waits until all readers see the change.

This would work alone, but has a chance of starvation for writers, and any crashed reader (or even migrating between cores) could cause a deadlock.

We can use Inter Process Interrupts (IPIs) to request straggling readers to immediately report their status. Since IPSs are fairly cheap, this works well.

But it's possible that the reader has gone to sleep, and thus would miss the IPI. To avoid that problem, before any reader goes to sleep, it gets converted to an "Active Reader", and the lock maintains a count of how many "active readers" there are. Since sleeping isn't that common, the shared counter in the lock will not be a bottleneck.


Performance

Look at all the graphs! It's faster, good.

Strangely, it's slow to acquire a writer lock when there are no readers, but prwlock is faster than their competitors when there are readers.