I spent the last few days fighting with various lock debugging techniques and the numerous bug reports, and am pleased to announce a new version of MuQSS, version 0.105.
There are versions and incrementals available for linux-4.7:
http://ck.kolivas.org/patches/muqss/4.0/4.7/
and linux-4.8:
http://ck.kolivas.org/patches/muqss/4.0/4.8
If you've been waiting for me to say it's stable enough to try, then now's your chance: I've addressed all known bugs at this time and it's working well for me.
Most of the issues were to do with races and unstable handling of cross-CPU task movement. No effort went into improving performance over 104, though this version should address many of the crashes and hangs that have been reported with earlier versions.
Additionally, there is a pending patch being uploaded for BFS 512 which, as per usual, had some last-minute issues that only just showed up. If enough users complain loudly enough or more issues show up, I might just release another BFS and -ck, since it should be stable, especially being one of the last BFS releases.
http://ck.kolivas.org/patches/bfs/4.0/4.8/Pending/
Keep the feedback and bug reports coming. Next I need to put more care into the non-interactive mode of MuQSS for your enjoyment.
Enjoy!
お楽しみ下さい
-ck
A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, MuQSS, BFS and -ck.
Tuesday, 4 October 2016
Monday, 3 October 2016
BFS version 0.512, linux-4.8-ck1, MuQSS for linux-4.8
This is to announce an updated version of BFS for the new stable linux kernel 4.8.
BFS by itself:
4.8-sched-bfs-512.patch
-ck patches with BFS:
4.8-ck1
EDIT: Here's a bugfix post release for the above kernels that I highly recommend you include:
http://ck.kolivas.org/patches/bfs/4.0/4.8/Pending/bfs512-fixes.patch
Following on from the aggressive development towards a new scheduler, this BFS incorporates a number of fixes and performance improvements discovered while working on the Multiple Queue Skiplist Scheduler, MuQSS (pronounced mux) and should be the best performing BFS yet.
Note that this may be the last BFS based -ck release as MuQSS is designed to replace it, being the logical evolution of the same scheduler into a more scalable discrete runqueue design.
For those willing to try it in its current version, an incremental patch can be applied to BFS 512:
bfs512-muqss104.patch
or there is a full patch against 4.8:
4.8-sched-MuQSS_104.patch
Again, MuQSS is still immature code and while I have been running it stably for a few days now, and have spent a lot of time debugging locking issues and stability, it is not intended for production use just yet. Having said that, all testing is most welcome, especially benchmarks and stacktraces if you get any crashes.
I've been asked numerous times why I decided to change the name. There are two major reasons. The first is that it signifies just what a dramatic overhaul to the codebase it is, where it is virtually a new scheduler, even though it uses the same scheduling decision policy as BFS. The second is that I've had many people approach me saying they would like to use BFS for their own production environment but alas the offensive name is a showstopper for them. Additionally, I had to choose a name that wasn't already being used by anything else, whereas both BFS and brainfuck had been used before.
Enjoy!
お楽しみ下さい
-ck
Saturday, 1 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.105
Announcing a multiple runqueue variant of BFS, with the more mundane name of MuQSS (pronounced mux) for linux 4.7:
Full patch for linux-4.7
4.7-sched-MuQSS_105.patch
Keep watching this blog for newer versions!
Incremental to patch bfs502 to MuQSS 0.1:
bfs502-MuQSS_103.patch
It was inevitable that one day I would find myself tackling the two major scalability limitations in BFS, and this is the result. These two issues were:
- The single runqueue which means all CPUs would fight for lock contention over the one runqueue, and
- The O(n) look up which means linear increase in overhead for task lookups as number of processes increases.
Until now I had neither the energy nor the time to find a solution to the first issue that maintained BFS' scheduling decision algorithm, as the single runqueue was actually the reason latency remains bounded and deterministic on BFS, capitalising on more CPUs instead of fighting against them for scalability.
This scheduler variant is an evolution of BFS, which will hopefully be mature enough to replace BFS one day when stability is assured. It still uses the same scheduling algorithm as BFS, meaning latency and responsiveness remain as good as always, but with per-CPU runqueues and discrete locking it will also scale to any number of CPUs, as the mainline scheduler does.
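To make the contrast between the two designs concrete, here is a minimal Python sketch. It is an illustrative model only, not the kernel code: the names Task, RunQueue and pick_next_linear are hypothetical, the CPU count of 4 is arbitrary, and Python's heapq stands in for the skiplist purely because it also keeps the earliest deadline cheap to find.

```python
import heapq
import threading
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    deadline: int                       # earlier deadline runs first
    name: str = field(compare=False)    # not used for ordering


class RunQueue:
    """One queue per CPU, each guarded by its own lock (hypothetical model)."""

    def __init__(self):
        self.lock = threading.Lock()
        self._heap = []                 # heapq is a stand-in for the skiplist

    def enqueue(self, task):
        with self.lock:                 # only this CPU's queue is locked
            heapq.heappush(self._heap, task)

    def pick_next(self):
        with self.lock:
            return heapq.heappop(self._heap) if self._heap else None


def pick_next_linear(tasks):
    """Single shared list: an O(n) scan for the earliest deadline."""
    return min(tasks, key=lambda t: t.deadline) if tasks else None


runqueues = [RunQueue() for _ in range(4)]  # one per CPU; 4 is arbitrary
runqueues[0].enqueue(Task(30, "a"))
runqueues[0].enqueue(Task(10, "b"))
runqueues[1].enqueue(Task(20, "c"))
```

In this model, CPUs 0 and 1 never contend for the same lock when queueing or picking tasks, whereas every caller of pick_next_linear would have to share one list and walk all of it.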
MuQSS does NOT guarantee the best possible throughput, as there is still virtually no complex balancing mechanism whatsoever. Tasks are selected primarily according to deadline, with only CPU cache distances being used to determine which idle CPU a task goes to or, in non-interactive mode, which overloaded CPU to pull from to fill an idle CPU.
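As a rough illustration of choosing an idle CPU by cache distance, the following Python sketch picks the idle CPU closest to the one a task last ran on. The distance table, its values and the function name closest_idle_cpu are assumptions made up for this example; the real scheduler derives locality from the hardware topology.

```python
# Hypothetical cache-distance table for 4 logical CPUs:
# 0 = same CPU, 1 = SMT sibling, 2 = a different core.
DISTANCE = [
    [0, 1, 2, 2],
    [1, 0, 2, 2],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]


def closest_idle_cpu(last_cpu, idle_cpus):
    """Return the idle CPU with the smallest cache distance from last_cpu.

    Ties are broken by the lower CPU number; None means nothing is idle.
    """
    if not idle_cpus:
        return None
    return min(idle_cpus, key=lambda cpu: (DISTANCE[last_cpu][cpu], cpu))
```

For example, a task that last ran on CPU 0 with CPUs 1 and 2 idle would go to CPU 1, its SMT sibling in this made-up table, because that keeps it closest to its warm cache.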
It would be possible, with a lot of effort, to wedge the entire balancing algorithm for scalability from mainline into this, though it would probably undermine the deterministic latency that makes it special.
This is a massive rewrite and consequently there are bound to still be race conditions and hidden bugs, though I have been running it for a while now with reasonable stability. I'm putting this out there for the braver people to test. There's a lot more to document about it but for now let's just say, give it a try.
Please don't use any lock debugging as it will light up every possible complaint for the time being!
Regarding 4.8, for the time being I will still be releasing BFS for it and incorporating it into -ck.
EDIT: Updated to version 0.105 with significant bugfixes.
Enjoy!
お楽しみ下さい
-ck
Friday, 23 September 2016
BFS 502, linux-4.7-ck5
With the fix for the last of the freezes with BFS 497 becoming clearer, and with other minor fixes and improvements accumulating, such as for build failures, I'm releasing a new BFS that combines them all, which should be the last of the releases for the 4.7 kernel.
BFS by itself:
4.7-sched-bfs-502.patch
-ck patches with BFS:
4.7-ck5
In addition to the update to BFS, this -ck release is the first in a very long time to include a patch from another developer - the Throttled background buffered writeback v7 patch by Jens Axboe. This makes a massive difference to a system's ability to read files, open new applications etc. under heavy write loads in my testing and is a change which I believe is essential and will eventually make its way into the mainline kernel.
The changes to BFS 502 are as follows:
bfs497-build_other_arches.patch
bfs497-no_smtload_avg.patch
bfs497-recognise_nodes2.patch
bfs497-revert-othercpufreq.patch
bfs497-fix_smt_nonice.patch
- A build fix for building on other architectures (notably ARM).
- Simplifying the load measurement on SMT machines reported to cpufreq - trying to account for load on the SMT sibling is unnecessary as each core will run at the speed of the most loaded sibling anyway on any existing hardware.
- A fix for detecting CPUs on other NUMA nodes and setting their locality correctly.
- Not trying to signal CPU load to cpufreq on other CPUs when tasks migrate - this was leading to the hangs and there is enough rescheduling for cpufreq to get the load later on.
- A build fix for when SMT_NICE is not configured.
Enjoy!
お楽しみ下さい
-ck