Monday, 17 October 2016

MuQSS - The Multiple Queue Skiplist Scheduler v0.112

Here's an updated version of MuQSS.

For 4.8.*:
4.8-sched-MuQSS_112.patch

For 4.7.*:
4.7-sched-MuQSS_112.patch

Git tree here as 4.7-muqss or 4.8-muqss branches:
https://github.com/ckolivas/linux

It's getting close now to the point where it can replace BFS in -ck releases. Thanks to the many people testing and reporting back, some other misbehaviours were discovered and their associated fixes have been committed.

In particular,
- Balancing across CPUs was not looking at higher and lower scheduling policies correctly (SCHED_ISO, SCHED_IDLEPRIO and realtime policies)
- A serious stall/hang could happen with tasks using sched_yield (such as f@h client and numerous GPU drivers)
- Some minor accounting issues on new tasks with affinity set were fixed
- Overhead was further decreased on task selection
- Spurious preemption on CPUs where the preempted task had already gone is now avoided
- Spurious wakeups on CPUs that were assumed to be idle but no longer are, are now avoided
- A potential race in suspending to ram was fixed
- Old unused code from BFS was removed, along with unnecessary intermediate variables
- Clean ups
- Some work towards actually documenting MuQSS in Documentation/scheduler/sched-MuQSS.txt was done, though it is still incomplete

Enjoy!
お楽しみ下さい
-ck

Tuesday, 11 October 2016

MuQSS - The Multiple Queue Skiplist Scheduler v0.111

Lots of bugfixes, lots of improvements, build fixes, you name it.

For 4.8:
4.8-sched-MuQSS_111.patch

For 4.7:
4.7-sched-MuQSS_111.patch

And in a complete departure from BFS, there is a git tree (which suits constant development like this, unlike BFS's massive ports at each stable release):

https://github.com/ckolivas/linux

Look in the pending/ directory to see all the patches that went into this or read the git changelog. In particular numerous warnings were fixed, throughput improved compared to 108, SCHED_ISO was rewritten for multiple queues, potential races/crashes were addressed, and build fixes for different configurations were committed.

I haven't been able to track down the bizarre latency issues reported by runqlat. When I try to reproduce them myself I get nonsense latency values greater than the history of the earth, so I suspect an interface bug with BPF reporting values. It doesn't seem to affect actual latency in any way.

EDIT: Updated to version 0.111 which has a fix for suspend/resume.

Enjoy!
お楽しみ下さい
-ck

Friday, 7 October 2016

MuQSS - The Multiple Queue Skiplist Scheduler v0.108

A new version of the MuQSS CPU scheduler

Incremental and full patches are available for 4.8 and 4.7 respectively here:
http://ck.kolivas.org/patches/muqss/4.0/4.8/
http://ck.kolivas.org/patches/muqss/4.0/4.7/

Yet more minor bugfixes and some important performance enhancements.

This version brings to the table the same locking scheme for waking tasks up as mainline, which is advantageous on process-busy workloads with many CPUs. This matters because the main reason for moving to multiple runqueues was to minimise contention on the global runqueue lock in BFS (as mentioned here numerous times before), and this wake up scheme helps make the most of the multiple discrete runqueue locks.
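To make the global lock versus per-runqueue lock argument concrete, here is a toy model in Python. It is only a sketch and not MuQSS code: the workload, the timings and the names (count_contentions, lock_of, "grq") are made up for illustration. It counts how many lock requests arrive while the lock they need is still held.

```python
from collections import defaultdict

def count_contentions(events, lock_of):
    """Count lock requests that arrive while the same lock is still held.

    events:  list of (start_time, hold_time, cpu), sorted by start_time.
    lock_of: maps a CPU number to the identity of the lock guarding its runqueue.
    """
    busy_until = defaultdict(float)  # lock id -> time at which it becomes free
    contentions = 0
    for start, hold, cpu in events:
        lock = lock_of(cpu)
        if start < busy_until[lock]:
            contentions += 1
            start = busy_until[lock]  # the waiter proceeds once the lock is free
        busy_until[lock] = start + hold
    return contentions

# Hypothetical workload: four CPUs each queue a task at nearly the same time.
events = [(0.0, 1.0, 0), (0.1, 1.0, 1), (0.2, 1.0, 2), (0.3, 1.0, 3)]

global_lock = count_contentions(events, lambda cpu: "grq")  # one shared lock
per_cpu_locks = count_contentions(events, lambda cpu: cpu)  # one lock per runqueue

print(global_lock, per_cpu_locks)
```

With a single shared lock, every request after the first has to wait. With one lock per runqueue, none of them do, because no two requests in this workload touch the same queue.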

Note this change is much more significant than the last releases so new instability is a possibility. Please report any problems or stacktraces!

When I started out, I used lockstat on one workload to get an idea of how much lock contention was going on and how long it lasted. With the first incarnations of MuQSS, a 14 second benchmark with thousands of tasks on a 12 CPU machine took the lock 3 million times and had almost 300k contentions, with the longest contention lasting 80us. Now the same workload grabs the lock just 5k times with only 18 contentions in total, and the longest lasted 1us.
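For a sense of scale, the lockstat figures above can be turned into contention rates. This is back-of-the-envelope arithmetic on the rounded numbers quoted in the text ("almost 300k" is taken as 300,000 and "just 5k" as 5,000), so treat the results as approximate.

```python
def contention_rate(acquisitions, contentions):
    """Fraction of lock acquisitions that had to wait for another holder."""
    return contentions / acquisitions

# Rounded figures from the text; the exact lockstat counts are not given.
early = {"acquisitions": 3_000_000, "contentions": 300_000, "longest_wait_us": 80}
now = {"acquisitions": 5_000, "contentions": 18, "longest_wait_us": 1}

early_rate = contention_rate(early["acquisitions"], early["contentions"])
now_rate = contention_rate(now["acquisitions"], now["contentions"])
fewer_acquisitions = early["acquisitions"] / now["acquisitions"]

print(f"early MuQSS: {early_rate:.2%} of acquisitions contended")
print(f"now:         {now_rate:.2%} of acquisitions contended")
print(f"lock taken roughly {fewer_acquisitions:.0f}x less often")
```

On these rounded numbers the contended share drops from about 10% of acquisitions to about 0.36%, and the lock is taken around 600 times less often.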

This clearly demonstrates that the target endpoint for avoiding lock contention has been achieved. It does not translate into performance improvements on ordinary hardware today because you need ridiculous workloads on many CPUs to even begin deriving advantage from it. However, as even our phones have now reached 8 logical CPUs, it will only be a matter of time before 16 threads appear on commodity hardware. Poor scaling on such machines was a complaint directed at BFS when it came out 7 years ago, yet they still haven't appeared. BFS was shown to be scalable for all workloads up to 16 CPUs, and beyond for certain workloads, but suffered dramatically for others. MuQSS now makes it possible for what was BFS to be useful much further into the future.

Again - MuQSS is aimed primarily at desktop/laptop/mobile device users for the best possible interactivity and responsiveness, and is still very simple in its approach to balancing workloads to CPUs so there are likely to be throughput workloads on mainline that outperform it, though there are almost certainly workloads where the opposite is true.

I've now made all planned changes to MuQSS and hope to only look at bug reports, rather than do further development, for a little while from here on. In my eyes it is now stable enough to replace BFS in the next -ck release, barring some unexpected showstopper bug appearing.

EDIT: If you blinked you missed the 107 announcement which was shortly superseded by 108.

EDIT2: Always watch the pending directory for updated pending patches to add.
http://ck.kolivas.org/patches/muqss/4.0/4.8/Pending/

Enjoy!
お楽しみ下さい
-ck

Wednesday, 5 October 2016

MuQSS - The Multiple Queue Skiplist Scheduler v0.106

Another day and time for yet another release.

There are 0.106 versions and incrementals available for linux-4.7:
http://ck.kolivas.org/patches/muqss/4.0/4.7/
and linux-4.8:
http://ck.kolivas.org/patches/muqss/4.0/4.8


Two large remaining races that could lead to warnings, stalls, or in the worst case, crashes, have been fixed in this version.


Additionally, the multiple-runqueue locking has been significantly optimised to take only the runqueue locks that are needed, hold them only for as long as they are needed, and drop them as soon as possible, which should bring lock contention down even further. This is a performance enhancement, more so in non-interactive mode, though it will only become demonstrable if you're lucky enough to have many CPUs.
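The general idea of taking only the locks you need, and only for as long as you need them, can be sketched in userspace Python. This is not the MuQSS implementation: RunQueue, lock_runqueues and migrate_one are hypothetical names, and acquiring locks in a fixed order (here by CPU number) is just one common way of avoiding deadlock when two queues have to be held at once.

```python
import threading
from contextlib import contextmanager

class RunQueue:
    """Hypothetical per-CPU runqueue with its own lock."""
    def __init__(self, cpu):
        self.cpu = cpu
        self.lock = threading.Lock()
        self.tasks = []

@contextmanager
def lock_runqueues(*rqs):
    """Hold only the given runqueue locks, only for the duration of the block."""
    # De-duplicate and sort so every caller acquires in the same order.
    ordered = sorted({rq.cpu: rq for rq in rqs}.values(), key=lambda rq: rq.cpu)
    for rq in ordered:
        rq.lock.acquire()
    try:
        yield
    finally:
        # Drop the locks as soon as the critical section ends.
        for rq in reversed(ordered):
            rq.lock.release()

def migrate_one(src, dst):
    """Move the first task from src to dst, touching no other runqueue."""
    with lock_runqueues(src, dst):
        if not src.tasks:
            return None
        task = src.tasks.pop(0)
        dst.tasks.append(task)
    return task

rq0, rq1 = RunQueue(0), RunQueue(1)
rq0.tasks = ["a", "b"]
moved = migrate_one(rq0, rq1)
print(moved, rq0.tasks, rq1.tasks)
```

Only the two queues involved are locked, so work on any other runqueue can carry on undisturbed while the move happens.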


This version addresses all the known bugs and warnings I've received to date, so hopefully I can have a little rest and let people out there actually give it a go. What should you expect if you use this instead of BFS? If I've done this correctly, you will notice absolutely no difference, since the idea was to preserve the interactivity and responsiveness of BFS and make it scalable to more CPUs than most people can afford.


Keep the feedback coming, thanks.

Enjoy!
お楽しみ下さい
-ck