A couple of years ago I experimented with upgradeable read-write locks to replace the current spinlock that protects all the data in the global runqueue in BFS. Here is the original description: upgradeable-rwlocks-and-bfs
One of the main concerns when using a single runqueue which BFS does, as opposed to multiple runqueues, as the mainline linux scheduler does, is lock contention, where multiple CPUs can be waiting on getting access to read and/or modify data protected by the lock protecting all the data on the single runqueue. This data is currently protected by a spinlock, but there is clear demarcation in areas of the scheduler where we're only interested in reading data and others where we have to write data, which is why I developed the URW locks in the first place.
A brief reminder of why I have developed upgradeable read write locks instead of using regular read-write locks is that if there are multiple readers holding an RW lock, write access is starved until they all release the lock. This situation is unacceptable for a scheduler where it is mandatory that writes take more precedence than reads and the upgradeable version has write precedence as a feature as well as being upgradeable, thus allowing us to hold exclusive write access only for as long as we need it. Their main disadvantage is they are much heavier in overhead since they're effectively using double locks hence they would need to be used only where there is a clear work case for them.
I've updated the urwlocks patch and posted it here:
Note this patch by itself does nothing unless other code uses the locks.
In my original experiments I had done the conversion and performed various benchmarks on a quad core hyperthreaded CPU and had seen neither benefit nor detriment. The locks were originally developed with a view to expanding on the BFS design for scalability improvements primarily, and time and lack of success in deriving benefit from that development led to me shelving the project.
I now have access to a hex core hyperthreaded CPU which acts like 12 logical CPUs and felt it was time to revisit the idea. This time I rehashed the use of URW locks for the global runqueue and was even more aggressive in trying to spend as little time with the write lock held as possible. After extensive benchmarking, though, my conclusions were even worse than last time: Not only did it not improve performance, it statistically significantly ever so slightly worsened throughput. This suggests that whatever scalability improvements are there to be gained by decreasing lock contention are offset by the increased overhead of URW locks versus spinlocks.
The updated version of this patch that depends on the urw locks patch above (to be applied on top of BFS 449 and all pending patches) is here:
It was interesting to note that for whatever reason the context switch rate actually decreased under load compared to regular BFS suggesting the discrete read paths helped contribute to less requirement for rescheduling suggesting it could lead to benefit if not for the increased locking overhead.
In some ways I'm not surprised as complexity and added infrastructure for the sake of apparent benefit generically does not guarantee improvement and simpler code, if not ideal in design, can work better in real world workloads. There are three discrete examples of this now in my experiments with BFS:
1. Single vs multiple runqueues.
The identical design in basic algorithm for BFS when applied to multiple runqueues versus the single runqueue shows no measurable increase in lock contention or even cache trashing on any regularly available hardware (confirmed on up to dual 12 threaded CPU machines).
2. Spinlocks vs rwlocks.
As per this post.
3. O(n) vs O(ln(n)) lookup.
Any improvement in the lookup time of many tasks is offset by the added complexity of insertion and that lookup and what the real world sizes of n will be. The limiting behaviour of the function describes how it changes with n only, it does not tell you what the absolute overhead equates to in real world workloads.